
Random forests

Random forests is a technique in which you build multiple decision trees and aggregate their outputs to produce a final result. It applies to both classification and regression.

A random forest is an ensemble of randomized, uncorrelated, fully grown decision trees. Because each tree is fully grown, it has low bias and high variance. Because the trees are as close to uncorrelated as the randomization can make them, averaging them yields the largest possible decrease in variance. By uncorrelated, we mean that each decision tree in the forest is given a randomly selected subset of the features and a randomly selected subset of the dataset.
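The link between correlation and variance can be made precise with a standard result for averages, which the text above does not state. As a sketch, assume the predictions T_b(x) of the B trees are identically distributed with variance σ² and positive pairwise correlation ρ (these symbols are introduced here for illustration). The variance of their average is then:

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{B}\,\sigma^{2}
```

As B grows, the second term shrinks toward zero and only ρσ² remains, so the less correlated the trees are, the lower the variance of the ensemble.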

The original paper describing random forests is available at the following link: https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf

The random forest technique does not reduce bias; as a result, the ensemble has a slightly higher bias than the individual trees in it.

Random forests were invented by Leo Breiman, and the name has been trademarked by Leo Breiman and Adele Cutler. More information is available at the following link: https://www.stat.berkeley.edu/~breiman/RandomForests

Intuitively, a random forest trains a large number of decision trees on different samples of the data, and each tree either fits or overfits its sample. Because the trees overfit in different ways, averaging their predictions largely cancels the overfitting out.
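The cancellation effect can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: each "tree" is modeled as the true value plus independent Gaussian noise, which stands in for overfitting, and the names and constants (noisy_prediction, ensemble_prediction, noise_sd, and so on) are hypothetical. Real trees trained on overlapping samples are correlated, so the reduction in practice is smaller than what this shows.

```python
import random
import statistics


def noisy_prediction(true_value, noise_sd, rng):
    """Stand-in for one overfit tree: correct on average, but noisy."""
    return true_value + rng.gauss(0.0, noise_sd)


def ensemble_prediction(true_value, noise_sd, n_trees, rng):
    """Average the predictions of n_trees independent noisy 'trees'."""
    predictions = [
        noisy_prediction(true_value, noise_sd, rng) for _ in range(n_trees)
    ]
    return statistics.fmean(predictions)


# Assumed illustration values, not taken from the text.
rng = random.Random(0)
true_value, noise_sd, n_trees, n_trials = 10.0, 2.0, 100, 2000

# Repeat the experiment many times to estimate how much each predictor varies.
single = [noisy_prediction(true_value, noise_sd, rng) for _ in range(n_trials)]
averaged = [
    ensemble_prediction(true_value, noise_sd, n_trees, rng)
    for _ in range(n_trials)
]

single_var = statistics.pvariance(single)
ensemble_var = statistics.pvariance(averaged)
print("variance of a single tree:", single_var)
print("variance of the ensemble: ", ensemble_var)
```

Under these assumptions the ensemble's predictions stay centered on the true value while varying far less than any single tree's.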

Random forests resemble bagging (bootstrap aggregating), but the two differ. In bagging, each tree in the ensemble is trained on a random sample drawn with replacement, and the tree considers all the features. In random forests, the features are sampled randomly as well: at each candidate split, only a random subset of the features is considered.
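The difference between the two sampling schemes can be sketched in a few lines. The helper names below (bootstrap_indices, features_for_split) are hypothetical, and the subset size of roughly the square root of the number of features is a common default assumed here, not something the text specifies.

```python
import math
import random


def bootstrap_indices(n_rows, rng):
    """Draw n_rows row indices with replacement (the bagging step)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def features_for_split(n_features, use_random_subset, rng):
    """Return the feature indices considered at one candidate split.

    Bagging considers every feature. A random forest draws a fresh
    random subset at each split; sqrt(n_features) is an assumed size.
    """
    if not use_random_subset:
        return list(range(n_features))
    k = max(1, int(math.sqrt(n_features)))
    return sorted(rng.sample(range(n_features), k))


rng = random.Random(42)
n_rows, n_features = 8, 9

# Both methods train each tree on a bootstrap sample of the rows.
rows = bootstrap_indices(n_rows, rng)

# Only the random forest restricts the features, and it does so per split.
bagging_split = features_for_split(n_features, False, rng)
forest_split_a = features_for_split(n_features, True, rng)
forest_split_b = features_for_split(n_features, True, rng)

print("bootstrap rows:      ", rows)
print("bagging, any split:  ", bagging_split)
print("forest, first split: ", forest_split_a)
print("forest, second split:", forest_split_b)
```

Because rows are drawn with replacement, some indices usually repeat and others are left out of a given tree's sample.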

For regression problems, the random forest model averages the predictions of the individual decision trees. For classification problems, it takes a majority vote over the classes predicted by the individual decision trees.
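The two aggregation rules can be written down directly. This is a minimal sketch: the function names are hypothetical, the per-tree predictions are made-up inputs, and breaking ties in favor of the class seen first is an assumption of this sketch rather than part of the method described above.

```python
from collections import Counter
from statistics import fmean


def aggregate_regression(tree_predictions):
    """Regression: average the numeric predictions of the trees."""
    return fmean(tree_predictions)


def aggregate_classification(tree_predictions):
    """Classification: return the class predicted by the most trees.

    Ties go to the class that appears first (an assumption of this sketch).
    """
    return Counter(tree_predictions).most_common(1)[0][0]


# Made-up predictions from three trees for a single input.
print(aggregate_regression([2.0, 3.0, 4.0]))            # 3.0
print(aggregate_classification(["cat", "dog", "cat"]))  # cat
```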

An interesting explanation of random forests can be found at the following link:  https://machinelearning-blog.com/2018/02/06/the-random-forest-algorithm/