
How do ensembles work?

The randomness inherent in random forests may make it seem as though we are leaving the results of the algorithm up to chance. However, by averaging the predictions of many nearly randomly built decision trees, we get an algorithm that reduces the variance of the result.

Variance is the error introduced by an algorithm's sensitivity to variations in the training dataset. Algorithms with a high variance (such as decision trees) can be greatly affected by small changes to the training dataset, which results in models that overfit. In contrast, bias is the error introduced by assumptions in the algorithm rather than anything to do with the dataset. For example, if an algorithm presumed that all features were normally distributed, it may have a high error if the features were not.

The negative impact of bias can be reduced by analyzing the data to check whether the assumptions built into the classifier's model match the actual data.

To use an extreme example, a classifier that always predicts true, regardless of the input, has a very high bias. A classifier that always predicts randomly would have a very high variance. Both classifiers have a high degree of error, but of a different nature.
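The two extremes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`always_true`, `make_random_classifier`, `disagreement`) are hypothetical, and the random seed stands in for "which training set we happened to draw".

```python
import random

def always_true(_sample):
    """High-bias extreme: ignores the input and always answers True."""
    return True

def make_random_classifier(seed):
    """High-variance extreme: a 'trained model' that is nothing but coin
    flips. The seed stands in for which training set was drawn."""
    rng = random.Random(seed)
    answers = {}

    def classify(sample):
        # Remember each answer so a single model is self-consistent.
        if sample not in answers:
            answers[sample] = rng.random() < 0.5
        return answers[sample]

    return classify

def disagreement(model_a, model_b, samples):
    """Fraction of samples on which two 'retrained' models disagree."""
    return sum(model_a(s) != model_b(s) for s in samples) / len(samples)

samples = range(1000)

# Retraining the constant classifier never changes a prediction...
constant_disagreement = disagreement(always_true, always_true, samples)

# ...while two random classifiers disagree on roughly half the samples.
random_disagreement = disagreement(
    make_random_classifier(seed=1), make_random_classifier(seed=2), samples
)

print(constant_disagreement, random_disagreement)
```

The constant classifier is perfectly stable but wrong whenever the true label is false; the random classifier makes no fixed assumption but gives a different answer every time it is "retrained".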

By averaging a large number of decision trees, the high variance of the individual trees is greatly reduced. This results, at least normally, in a model with a higher overall accuracy and better predictive power. The trade-offs are an increase in computation time and an increase in the bias of the algorithm.
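One way to see why averaging helps, and why the randomness matters, is the standard result for the variance of an average of correlated quantities. The symbols here are assumptions introduced for illustration: T_i(x) is the prediction of the i-th of n trees, each with variance sigma^2, and rho is the correlation between any two trees.

```latex
\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} T_i(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{n}\,\sigma^2
```

The second term shrinks as more trees are added. The first term shrinks only when the trees are less correlated with each other, which is what building each tree on a random sample of the data and a random subset of features is meant to achieve.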

In general, ensembles work on the assumption that errors in prediction are effectively random and that those errors are quite different from one classifier to another. By averaging the results across many models, these random errors tend to cancel out, leaving a prediction that is closer to the true one. We will see many more ensembles in action throughout the rest of the book.
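A small simulation can illustrate this cancellation. It is a sketch rather than a model of any real classifier: it assumes every classifier is right with the same probability and that their errors are fully independent, which real ensembles only approximate. The function name `majority_vote_accuracy` and the chosen probability of 0.6 are hypothetical.

```python
import random

def majority_vote_accuracy(n_classifiers, p_correct, n_samples=10_000, seed=0):
    """Estimate how often a majority vote is right when each classifier is
    independently correct with probability p_correct (an assumption)."""
    rng = random.Random(seed)
    correct_votes = 0
    for _ in range(n_samples):
        # Count how many classifiers got this sample right.
        right = sum(rng.random() < p_correct for _ in range(n_classifiers))
        # The ensemble is right when more than half of its members are.
        if right > n_classifiers / 2:
            correct_votes += 1
    return correct_votes / n_samples

single = majority_vote_accuracy(1, 0.6)
ensemble = majority_vote_accuracy(101, 0.6)

print(f"single classifier: {single:.3f}")
print(f"101-member vote:   {ensemble:.3f}")
```

Under these assumptions the vote of many weak classifiers is markedly more accurate than any one of them. If the errors were strongly correlated instead, most of that gain would disappear.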
