
Random forest

To greatly improve our model's predictive ability, we can produce numerous trees and combine the results. The random forest technique does this by applying two different tricks in model development. The first is bootstrap aggregation, commonly called bagging.

In bagging, an individual tree is built on a bootstrap sample of the dataset, which contains roughly two-thirds of the distinct observations (the remaining one-third is referred to as out-of-bag, or oob). This is repeated dozens or hundreds of times and the results are averaged. Each tree is grown fully and is not pruned based on any error measure, which means that the variance of each individual tree is high. However, by averaging the results, you can reduce the variance without increasing the bias.
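The two-thirds figure can be checked with a short simulation. The sketch below assumes the sample is drawn with replacement and has the same size as the dataset, which is the usual bootstrap setup. The helper name `bootstrap_split`, the seed, and the dataset size are illustrative choices and are not taken from the randomForest package.

```python
import random


def bootstrap_split(n_obs, rng):
    """Draw one bootstrap sample of size n_obs (with replacement).

    Returns the in-bag row indices (duplicates allowed) and the sorted
    out-of-bag (oob) indices that were never drawn.
    """
    in_bag = [rng.randrange(n_obs) for _ in range(n_obs)]
    oob = sorted(set(range(n_obs)) - set(in_bag))
    return in_bag, oob


rng = random.Random(42)  # arbitrary seed, only for repeatability
n_obs = 10_000           # hypothetical dataset size
in_bag, oob = bootstrap_split(n_obs, rng)

unique_fraction = len(set(in_bag)) / n_obs
oob_fraction = len(oob) / n_obs
print(f"distinct in-bag fraction: {unique_fraction:.3f}")
print(f"out-of-bag fraction:      {oob_fraction:.3f}")
```

On a dataset of this size, the distinct in-bag fraction should come out close to 0.63 and the out-of-bag fraction close to 0.37, which is where the "roughly two-thirds" and "one-third" figures come from.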

The next thing that random forest brings to the table is that, in addition to the random sample of the data (that is, bagging), it also takes a random sample of the input features at each split. In the randomForest package, we'll use the default number of predictors sampled, which is the square root of the total number of predictors for classification problems and the total number of predictors divided by three for regression. The number of predictors the algorithm randomly chooses at each split can be changed via the model tuning process.
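As a rough illustration of these defaults, the sketch below computes the number of predictors sampled per split and then draws one such random subset. Both helper names (`default_split_size`, `sample_split_features`) and the predictor names are hypothetical. Rounding down and using a floor of one predictor are assumptions made here so that the result is always a usable whole number.

```python
import math
import random


def default_split_size(n_predictors, task):
    """Number of predictors sampled at each split under the defaults above.

    Hypothetical helper: rounds down and never returns less than 1.
    """
    if task == "classification":
        return max(1, math.floor(math.sqrt(n_predictors)))
    if task == "regression":
        return max(1, n_predictors // 3)
    raise ValueError("task must be 'classification' or 'regression'")


def sample_split_features(feature_names, task, rng):
    """Pick the random subset of predictors considered at one split."""
    k = default_split_size(len(feature_names), task)
    return rng.sample(feature_names, k)


features = [f"x{i}" for i in range(1, 10)]  # nine made-up predictor names
rng = random.Random(0)                      # arbitrary seed

print(default_split_size(len(features), "classification"))  # 3
print(default_split_size(len(features), "regression"))      # 3
print(sample_split_features(features, "classification", rng))
```

With 30 predictors, the same helper gives 5 for classification and 10 for regression, so regression trees see a larger share of the predictors at each split by default.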

By sampling the features at random at each split, you mitigate the effect of a single highly correlated predictor becoming the main driver in all of your bootstrapped trees, which would prevent you from achieving the variance reduction you hoped for with bagging. Averaging trees that are less correlated with one another yields a model that is more generalizable and more robust to outliers than bagging alone.
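Why less correlated trees help can be seen from a standard result for averages. If each of B tree predictions has variance sigma^2 and every pair has correlation rho, the variance of their average is rho * sigma^2 + (1 - rho) * sigma^2 / B. The sketch below simply evaluates that formula. The equal-variance, equal-correlation setup is an idealization introduced here for illustration, and the numbers are made up.

```python
def variance_of_average(sigma2, rho, n_trees):
    """Variance of the mean of n_trees predictions, each with variance
    sigma2 and common pairwise correlation rho (idealized setting)."""
    return rho * sigma2 + (1.0 - rho) * sigma2 / n_trees


sigma2 = 1.0   # variance of a single tree's prediction (illustrative)
n_trees = 500  # illustrative forest size

for rho in (0.9, 0.5, 0.1):
    var = variance_of_average(sigma2, rho, n_trees)
    print(f"rho = {rho:.1f} -> variance of the average = {var:.4f}")
```

The first term does not shrink as more trees are added, so with highly correlated trees the averaged variance stays close to rho * sigma^2 however large the forest is. Lowering the correlation between trees, which is what sampling features at each split aims to do, is what lets the averaging pay off.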
