
Summary

Decision trees are intuitive algorithms that are capable of performing both classification and regression tasks. They allow users to print out their decision rules, which is a plus when communicating a model's decisions to business stakeholders and non-technical third parties. Additionally, decision trees are easy to configure since they have a limited number of hyperparameters. The two main decisions you need to make when training a decision tree are the splitting criterion and how to control the tree's growth to strike a good balance between overfitting and underfitting. Understanding the limitations of the tree's decision boundaries is also paramount when deciding whether the algorithm is good enough for the problem at hand.
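As a minimal sketch of these points using scikit-learn, the snippet below prints a fitted tree's decision rules with export_text and shows the two main knobs just mentioned. The Iris dataset and the specific hyperparameter values are illustrative assumptions, not necessarily the chapter's exact setup:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Illustrative dataset; any labeled dataset would do here
iris = load_iris()
X, y = iris.data, iris.target

# The two main decisions: the splitting criterion, and how to limit
# the tree's growth (here via max_depth and min_samples_leaf)
clf = DecisionTreeClassifier(
    criterion='gini',    # the splitting criterion; 'entropy' is the usual alternative
    max_depth=3,         # cap the tree's depth to guard against overfitting
    min_samples_leaf=5,  # require enough samples in each leaf
    random_state=42,     # illustrative value, for reproducibility
)
clf.fit(X, y)

# Print the learned decision rules in a human-readable form
print(export_text(clf, feature_names=iris.feature_names))
```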

In this chapter, we looked at how decision trees learn and used them to classify a well-known dataset. We also learned about the different evaluation metrics and how the size of our data affects our confidence in a model's accuracy. We then learned how to deal with the uncertainty in our evaluations using different data-splitting strategies. We saw how to tune the algorithm's hyperparameters to balance overfitting and underfitting. Finally, we built on this knowledge to create decision tree regressors and learned how the choice of splitting criterion affects the resulting predictions.
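Along the same lines, here is a small sketch of one such data-splitting strategy: repeatedly re-splitting the data with scikit-learn's ShuffleSplit and scoring each split via cross_val_score, so the spread of the scores reflects the uncertainty of the accuracy estimate. The dataset, number of splits, and test size below are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import ShuffleSplit, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Re-split the data into training and test sets many times; the
# standard deviation of the scores hints at how much our accuracy
# estimate depends on the particular split we happened to use
splitter = ShuffleSplit(n_splits=50, test_size=0.25, random_state=22)
scores = cross_val_score(DecisionTreeClassifier(max_depth=3), X, y, cv=splitter)

print(f'Accuracy: {scores.mean():.3f} +/- {scores.std():.3f}')
```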

I hope this chapter has served as a good introduction to scikit-learn and its consistent interface. With this knowledge at hand, we can move on to our next algorithm and see how it compares. In the next chapter, we will learn about linear models. This family of algorithms has its roots in the 18th century, and it remains among the most commonly used today.
