- Advanced Machine Learning with R
- Cory Lesmeister, Dr. Sunil Kumar Chinnamgari
- 2021-06-24 14:24:43
Random forest
To greatly improve our model's predictive ability, we can produce numerous trees and combine the results. The random forest technique does this by applying two different tricks in model development. The first is bootstrap aggregation, commonly called bagging.
In bagging, an individual tree is built on a random sample of the dataset drawn with replacement, which contains roughly two-thirds of the total observations (the remaining one-third is referred to as out-of-bag, or OOB). This is repeated dozens or hundreds of times and the results are averaged. Each of these trees is grown fully and isn't pruned based on any error measure, which means that the variance of each individual tree is high. However, by averaging the results, you can reduce the variance without increasing the bias.
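The "roughly two-thirds" figure can be checked with a short simulation. The following is a minimal sketch in base R, assuming the bootstrap sample is the same size as the dataset and is drawn with replacement; the number of observations and the seed are arbitrary choices for illustration.

```r
# Sketch: how one bootstrap sample splits the rows into in-bag and out-of-bag.
set.seed(123)   # arbitrary seed, for reproducibility only
n <- 10000      # hypothetical number of observations

# Draw n row indices with replacement, as bagging does for each tree
boot_idx <- sample.int(n, size = n, replace = TRUE)

in_bag <- unique(boot_idx)              # rows drawn at least once
oob    <- setdiff(seq_len(n), in_bag)   # rows never drawn: out-of-bag

in_bag_frac <- length(in_bag) / n
oob_frac    <- length(oob) / n

# Theoretical in-bag share: 1 - (1 - 1/n)^n, which approaches 1 - exp(-1)
expected_in_bag <- 1 - (1 - 1 / n)^n

round(c(in_bag = in_bag_frac, oob = oob_frac, expected = expected_in_bag), 3)
```

The in-bag share should land near 63%, which is where the two-thirds rule of thumb comes from. The out-of-bag rows are the ones a given tree never saw during training.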
The next thing that random forest brings to the table is that, concurrently with the random sample of the data (that is, bagging), it also takes a random sample of the input features at each split. In the randomForest package, we'll use the default number of predictors sampled at each split, which is the square root of the total number of predictors for classification problems and the total number of predictors divided by three for regression. The number of predictors the algorithm randomly chooses at each split can be changed via the model tuning process.
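To make those defaults concrete, here is a small sketch. `default_mtry()` is a hypothetical helper written for this illustration, not a function from any package. Rounding down and the floor of one predictor are assumptions based on how the randomForest package documents its `mtry` default. The optional comparison runs only if randomForest is installed, and uses the built-in `iris` data purely as an example.

```r
# Sketch of the default number of predictors tried at each split.
# default_mtry() is a hypothetical helper, not part of any package.
default_mtry <- function(p, type = c("classification", "regression")) {
  type <- match.arg(type)
  if (type == "classification") {
    max(floor(sqrt(p)), 1)   # square root of the predictor count, rounded down
  } else {
    max(floor(p / 3), 1)     # one-third of the predictor count, rounded down
  }
}

default_mtry(16, "classification")
default_mtry(16, "regression")

# Optional: compare with the randomForest package, if it is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(123)
  fit_default <- randomForest::randomForest(Species ~ ., data = iris, ntree = 100)
  fit_mtry3   <- randomForest::randomForest(Species ~ ., data = iris, ntree = 100,
                                            mtry = 3)
  # The fitted object records the value that was actually used
  c(default = fit_default$mtry, overridden = fit_mtry3$mtry)
}
```

Passing `mtry` explicitly, as in the second fit, is the simplest way to override the default when tuning.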
By incorporating this random sample of the features at each split into the methodology, you can mitigate the effect of a predictor that is highly correlated with the response becoming the main driver in all of your bootstrapped trees. Such a predictor would leave the trees highly correlated with each other and prevent you from achieving the variance reduction that you hoped for with bagging. Averaging trees that are less correlated with each other produces a model that is more generalizable and more robust to outliers than bagging alone.
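Why the correlation between trees matters can be illustrated with a standard result that is not stated in the text above: the average of B identically distributed predictions, each with variance sigma^2 and pairwise correlation rho, has variance rho * sigma^2 + (1 - rho) * sigma^2 / B. The sketch below evaluates that expression. `ensemble_variance()` is a hypothetical helper, and the correlation values are made-up numbers chosen only to show the contrast.

```r
# Variance of the average of B equally correlated tree predictions.
# ensemble_variance() is a hypothetical helper written for this illustration.
ensemble_variance <- function(rho, sigma2, B) {
  rho * sigma2 + (1 - rho) * sigma2 / B
}

B      <- 500   # hypothetical number of trees
sigma2 <- 1     # hypothetical variance of a single tree's prediction

# Strongly correlated trees, e.g. bagging dominated by one predictor
var_bagging <- ensemble_variance(rho = 0.8, sigma2 = sigma2, B = B)

# Less correlated trees, e.g. after sampling features at each split
var_forest <- ensemble_variance(rho = 0.2, sigma2 = sigma2, B = B)

c(bagging_like = var_bagging, forest_like = var_forest)
```

The first term does not shrink as more trees are added, so with highly correlated trees the ensemble variance stays close to rho * sigma^2 no matter how large B is. Lowering the correlation is what lets the averaging pay off.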