Random forests

Random forests is a technique in which you construct multiple decision trees and use them together to learn a classification or regression model. The outputs of the individual trees are aggregated to produce the final result.

Random forests are an ensemble of randomized, fully grown decision trees that are as uncorrelated with one another as possible. Because each tree in the random forest model is fully grown, it has low bias and high variance. Keeping the trees uncorrelated is what allows averaging to reduce that variance substantially. By uncorrelated, we mean that each decision tree in the random forest is trained on a randomly selected subset of the dataset and considers only a randomly selected subset of the features while it is built.
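
To make the role of correlation concrete, here is a standard result for the variance of an average of identically distributed predictions. The symbols are introduced here for illustration and do not appear in the text above: B is the number of trees, T_b(x) is the prediction of tree b, sigma^2 is the variance of a single tree's prediction, and rho is the pairwise correlation between trees.

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{B}\,\sigma^{2}
```

As B grows, the second term shrinks toward zero, so the variance that remains is governed by rho. This is why lowering the correlation between trees matters as much as adding more trees.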

The original paper describing random forests is available at the following link: https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf

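The random forest technique does not reduce bias, because averaging only reduces variance. Because of the added randomization, a random forest typically has a slightly higher bias than a single decision tree trained without randomization, but the reduction in variance usually more than compensates for it.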

Random forests were invented by Leo Breiman, and Random Forests is a trademark of Leo Breiman and Adele Cutler. More information is available at the following link: https://www.stat.berkeley.edu/~breiman/RandomForests

Intuitively, in the random forest model, a large number of decision trees are trained on different samples of the data, so each tree either fits or overfits its own sample. Because the trees overfit in different ways, averaging their predictions cancels out much of the overfitting.
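
The averaging intuition can be sketched with a small simulation. This is an idealized illustration, not a real random forest: `noisy_tree_prediction` is a hypothetical stand-in for one overfit tree, modeled as the true value plus independent Gaussian noise. Real trees are correlated, so the reduction in practice is smaller than what this sketch shows.

```python
import random
from statistics import mean, pvariance

rng = random.Random(42)   # fixed seed so the run is repeatable
TRUE_VALUE = 5.0          # assumed target value, chosen for illustration


def noisy_tree_prediction():
    """Hypothetical stand-in for one overfit tree: unbiased but noisy."""
    return TRUE_VALUE + rng.gauss(0, 1.0)


# Predictions from a single "tree", repeated many times.
single = [noisy_tree_prediction() for _ in range(2000)]

# Predictions from an "ensemble" that averages 50 independent trees.
ensemble = [
    mean(noisy_tree_prediction() for _ in range(50))
    for _ in range(2000)
]

single_variance = pvariance(single)
ensemble_variance = pvariance(ensemble)

print(f"single tree variance: {single_variance:.3f}")
print(f"ensemble variance:    {ensemble_variance:.3f}")
```

The ensemble's predictions should cluster much more tightly around the true value than those of a single tree, while the average prediction stays about the same.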

Random forests build on bagging (bootstrap aggregating) but go one step further. In bagging, every tree in the ensemble is trained on a random sample drawn with replacement, and the tree can use all the features. In random forests, the features are sampled randomly as well: at each candidate split, only a random subset of the features is considered.
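
The difference between the two sampling schemes can be sketched as follows. The helper names `bootstrap_sample` and `candidate_features` are hypothetical and written only for this illustration; they show the sampling steps, not a full tree-building routine.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement.

    Both bagging and random forests use this step for every tree.
    """
    return [rng.choice(rows) for _ in rows]


def candidate_features(n_features, max_features, rng):
    """Choose the feature indices considered at one split.

    Bagging corresponds to max_features == n_features (all features).
    A random forest uses a smaller random subset at each split.
    """
    return sorted(rng.sample(range(n_features), max_features))


rng = random.Random(0)
rows = list(range(10))          # toy dataset: 10 row indices

sample = bootstrap_sample(rows, rng)

# Bagging: every split may look at all 4 features.
bagging_features = candidate_features(4, 4, rng)

# Random forest: each split looks at only 2 of the 4 features.
forest_features = candidate_features(4, 2, rng)

print(sample)
print(bagging_features)
print(forest_features)
```

In a real implementation, `candidate_features` would be called again at every split, so different splits in the same tree see different feature subsets.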

For regression problems, the random forest model predicts a value by averaging the predictions of the individual decision trees. For classification problems, it predicts a class by taking a majority vote over the classes predicted by the individual decision trees.
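
The two aggregation rules can be sketched in a few lines. The function names and the sample predictions below are made up for illustration. Ties in the vote are broken here by whichever class was seen first, which is an arbitrary choice of this sketch.

```python
from collections import Counter
from statistics import mean


def aggregate_regression(tree_predictions):
    """Average the numeric predictions of the individual trees."""
    return mean(tree_predictions)


def aggregate_classification(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical outputs from three trees for a single input.
regression_result = aggregate_regression([2.0, 3.0, 4.0])
classification_result = aggregate_classification(["cat", "dog", "cat"])

print(regression_result)       # 3.0
print(classification_result)   # cat
```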

An interesting explanation of random forests can be found at the following link: https://machinelearning-blog.com/2018/02/06/the-random-forest-algorithm/