
Summary

Decision trees are intuitive algorithms that are capable of performing classification and regression tasks. They allow you to print out their decision rules, which is a plus when communicating your model's decisions to business stakeholders and other non-technical third parties. Additionally, decision trees are easy to configure since they have a limited number of hyperparameters. The two main decisions you need to make when training a decision tree are your splitting criterion and how to control the growth of your tree to strike a good balance between overfitting and underfitting. Understanding the limitations of the tree's decision boundaries is paramount when deciding whether the algorithm is good enough for the problem at hand.
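To make these two decisions concrete, here is a minimal sketch using scikit-learn; the Iris dataset stands in for any classification data, and the criterion and max_depth values are illustrative assumptions, not recommendations:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)

# The two main decisions: the splitting criterion ('gini' or 'entropy')
# and how to limit the tree's growth (here, via max_depth)
clf = DecisionTreeClassifier(criterion='gini', max_depth=3)
clf.fit(x_train, y_train)

# Printing the learned decision rules in plain text, which can then
# be shared with non-technical audiences
print(export_text(clf, feature_names=iris.feature_names))

Limiting max_depth (or the related min_samples_split and min_samples_leaf settings) is what keeps the tree from memorizing the training data, while leaving it unset lets the tree grow until its leaves are pure.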

In this chapter, we looked at how decision trees learn and used them to classify a well-known dataset. We also learned about different evaluation metrics and how the size of our data affects our confidence in a model's accuracy. We then learned how to deal with uncertainty in our evaluations using different data-splitting strategies. We saw how to tune the algorithm's hyperparameters to strike a good balance between overfitting and underfitting. Finally, we built on the knowledge we gained to build decision tree regressors and learned how the choice of splitting criterion affects the resulting predictions.
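As a quick illustration of that last point, the following sketch contrasts the two regression criteria on synthetic data (invented here purely for illustration); note that the criterion names follow recent scikit-learn releases, where older versions used 'mse' and 'mae' instead:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Noisy sine data, purely illustrative
rng = np.random.RandomState(0)
x = rng.uniform(0, 10, size=(200, 1))
y = np.sin(x).ravel() + rng.normal(0, 0.3, size=200)

# 'squared_error' predicts the mean of each leaf's training samples,
# while 'absolute_error' predicts the median, so the two trees can
# return different values for the same input
for criterion in ('squared_error', 'absolute_error'):
    reg = DecisionTreeRegressor(criterion=criterion, max_depth=3)
    reg.fit(x, y)
    print(criterion, reg.predict([[5.0]]))

Because the median is less sensitive to outliers than the mean, the absolute-error tree tends to be more robust to extreme target values, at the cost of slower training.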

I hope this chapter has served as a good introduction to scikit-learn and its consistent interface. With this knowledge at hand, we can move on to our next algorithm and see how it compares to this one. In the next chapter, we will learn about linear models. This family of algorithms has roots going back to the 18th century, and its members remain among the most commonly used algorithms today.
