
Random forests

Random forests are a technique in which you build multiple decision trees and aggregate their individual results to produce a final prediction. The technique applies to both classification and regression problems.

Random forests are an ensemble of randomized, largely uncorrelated, fully-grown decision trees. Because the trees are fully grown, each one has low bias and high variance. The less correlated the trees are, the more their average reduces the variance. By uncorrelated, we mean that each decision tree in the random forest is given a randomly selected subset of the features and a randomly selected subset of the dataset, so that different trees make different errors.
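The link between correlation and variance can be made concrete with a standard result for averages. As a sketch, assume the forest contains B trees whose predictions are identically distributed, each with variance \sigma^2 and pairwise correlation \rho (these symbols are introduced here for illustration and do not appear in the text above). The variance of the averaged prediction is then:

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
```

The second term shrinks as more trees are added, while the first term remains no matter how large B is. This is why lowering the correlation between trees matters as much as growing more of them.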

The original paper describing random forests is available at the following link: https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf

The random forest technique does not reduce bias. Because each tree sees only a random subset of the data and the features, the forest usually has a slightly higher bias than a single fully-grown tree trained without this randomization.

Random forests were invented by Leo Breiman, and the name has been trademarked by Leo Breiman and Adele Cutler. More information is available at the following link: https://www.stat.berkeley.edu/~breiman/RandomForests.

Intuitively, a random forest trains a large number of decision trees on different samples of the data, and each tree may fit or overfit its own sample. Because the trees overfit in different ways, averaging their predictions cancels out much of the overfitting.
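The cancelling effect of averaging can be illustrated with a small simulation. This is only a sketch under idealized assumptions: each "tree" is a hypothetical stand-in that returns the true value plus independent Gaussian noise, whereas the trees in a real forest are partly correlated. The names TRUE_VALUE, N_TREES, N_TRIALS, and noisy_tree_prediction are invented for this example.

```python
import random
from statistics import mean, pvariance

rng = random.Random(42)   # fixed seed so the run is repeatable
TRUE_VALUE = 10.0         # the quantity every tree tries to predict
N_TREES = 50              # trees averaged in each simulated forest
N_TRIALS = 2000           # number of repeated experiments


def noisy_tree_prediction():
    """Hypothetical overfit tree: unbiased, but with noise of variance 1."""
    return TRUE_VALUE + rng.gauss(0.0, 1.0)


# Predictions from a single tree, repeated over many trials.
single = [noisy_tree_prediction() for _ in range(N_TRIALS)]

# Predictions from a forest: the mean of N_TREES trees, per trial.
averaged = [
    mean(noisy_tree_prediction() for _ in range(N_TREES))
    for _ in range(N_TRIALS)
]

var_single = pvariance(single)
var_averaged = pvariance(averaged)
print("variance of one tree:      ", var_single)
print("variance of averaged trees:", var_averaged)
```

Under these assumptions the averaged predictions should vary far less than those of a single tree, roughly by a factor of N_TREES. With correlated trees the reduction would be smaller.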

Random forests resemble bagging (bootstrap aggregating), but they differ in how features are handled. In bagging, every tree in the ensemble is trained on a random sample drawn with replacement, and each tree considers all the features. In random forests, the features are sampled randomly as well: at each candidate split, only a random subset of the features is considered.
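The two sampling steps can be sketched with the standard library alone. The helper names bootstrap_sample and candidate_features, and the sizes used below, are assumptions made for this illustration; they do not come from any particular library.

```python
import random


def bootstrap_sample(rows, rng):
    """Bagging step: draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def candidate_features(n_features, max_features, rng):
    """Random forest step: choose the feature indices tried at one split."""
    return sorted(rng.sample(range(n_features), max_features))


rng = random.Random(0)
rows = list(range(10))  # stand-in for 10 training rows

# Each tree gets its own bootstrap sample (shared by bagging and forests).
sample = bootstrap_sample(rows, rng)

# Only random forests also restrict the features, and they do so
# again at every split rather than once per tree.
features = candidate_features(n_features=9, max_features=3, rng=rng)

print("bootstrap sample:", sample)
print("features considered at this split:", features)
```

Plain bagging corresponds to setting max_features equal to n_features, so that every split considers all the features.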

For regression problems, the random forest model predicts a value by averaging the predictions of the individual decision trees. For classification problems, it predicts a class by taking a majority vote over the results of the individual decision trees.
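The two aggregation rules can be written in a few lines. This is a minimal sketch that assumes the per-tree predictions have already been collected in a list; the function names are chosen for this example.

```python
from collections import Counter
from statistics import mean


def aggregate_regression(tree_predictions):
    """Average the numeric predictions of the individual trees."""
    return mean(tree_predictions)


def aggregate_classification(tree_predictions):
    """Return the class predicted by the most trees.

    Ties are broken in favour of the class encountered first.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


print(aggregate_regression([2.0, 3.0, 4.0]))              # 3.0
print(aggregate_classification(["spam", "ham", "spam"]))  # spam
```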

An interesting explanation of random forests can be found at the following link: https://machinelearning-blog.com/2018/02/06/the-random-forest-algorithm/