
Summary

Decision trees are intuitive algorithms that can perform both classification and regression tasks. They let you print out their decision rules, which is a plus when communicating your model's decisions to business stakeholders and non-technical third parties. Additionally, decision trees are easy to configure since they have a limited number of hyperparameters. The two main decisions you need to make when training a decision tree are your splitting criterion and how to control the growth of your tree to strike a good balance between overfitting and underfitting. Understanding the limitations of the tree's decision boundaries (each split considers one feature at a time, so the boundaries are axis-aligned) is paramount in deciding whether the algorithm is good enough for the problem at hand.
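As a quick recap, a minimal sketch like the following shows both ideas in scikit-learn: the criterion and max_depth hyperparameters control the splitting measure and the tree's growth, while export_text prints the learned decision rules. The Iris dataset and the specific hyperparameter values here are illustrative choices, not recommendations:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Load a small, well-known dataset (Iris, used here as an example)
iris = load_iris()

# criterion picks the splitting measure; max_depth caps the tree's growth
# to balance overfitting and underfitting (values here are illustrative)
clf = DecisionTreeClassifier(criterion='gini', max_depth=3, random_state=42)
clf.fit(iris.data, iris.target)

# Print the learned decision rules in a human-readable form
print(export_text(clf, feature_names=iris.feature_names))
```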

In this chapter, we looked at how decision trees learn and used them to classify a well-known dataset. We also learned about the different evaluation metrics and how the size of our data affects our confidence in a model's accuracy. We then learned how to deal with the uncertainty in our evaluation using different data-splitting strategies. We saw how to tune the algorithm's hyperparameters for a good balance between overfitting and underfitting. Finally, we used this knowledge to build decision tree regressors and saw how the choice of splitting criterion affects the resulting predictions.
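To tie those pieces together, here is a hedged sketch: cross-validation gives a more reliable score estimate than a single train/test split, and a DecisionTreeRegressor's criterion changes what each leaf predicts, the mean of its samples versus the median. The dataset and settings below are placeholders, and the criterion names follow scikit-learn 1.0 and later:

```python
from sklearn.datasets import load_diabetes  # placeholder regression dataset
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)

# 'squared_error' makes each leaf predict the mean of its samples;
# 'absolute_error' makes each leaf predict the median
for criterion in ['squared_error', 'absolute_error']:
    reg = DecisionTreeRegressor(criterion=criterion, max_depth=3,
                                random_state=42)
    # 5-fold cross-validation reduces the uncertainty of a single split
    scores = cross_val_score(reg, X, y, cv=5)
    print(criterion, scores.mean().round(3))
```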

I hope this chapter has served as a good introduction to scikit-learn and its consistent interface. With this knowledge at hand, we can move on to our next algorithm and see how it compares to this one. In the next chapter, we will learn about linear models. These algorithms have their roots in the 18th century, and they are still among the most commonly used algorithms today.
