
Decision trees

Decision trees are a class of supervised learning algorithms. A decision tree resembles a flow chart: it consists of a sequence of nodes, and the feature values of a sample determine which node to go to next.

The following example gives a good idea of how a decision tree works:

As with most classification algorithms, there are two stages to using them:

  • The first stage is the training stage, where a tree is built using training data. While the nearest neighbor algorithm from the previous chapter did not have a training phase, it is needed for decision trees. In this way, the nearest neighbor algorithm is a lazy learner, only doing any work when it needs to make a prediction. In contrast, decision trees, like most classification methods, are eager learners, undertaking work at the training stage and therefore needing to do less in the predicting stage.
  • The second stage is the predicting stage, where the trained tree is used to predict the classification of new samples. Using the previous example tree, a data point of ["is raining", "very windy"] would be classed as bad weather.
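The two stages can be sketched with scikit-learn's `DecisionTreeClassifier`. This is a minimal illustration, not the example tree referred to above: the four training rows, the 0/1 encoding of "is raining" and "very windy", and the "good"/"bad" labels are all assumptions made up for the sketch.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: each row is [is_raining, very_windy], encoded as 0/1.
X_train = [[1, 1], [1, 0], [0, 1], [0, 0]]
y_train = ["bad", "bad", "bad", "good"]

# Training stage: the tree is built from the training data (eager learning).
clf = DecisionTreeClassifier(random_state=14)
clf.fit(X_train, y_train)

# Predicting stage: the trained tree classifies a new sample.
prediction = clf.predict([[1, 1]])  # "is raining" and "very windy"
print(prediction[0])
```

Almost all of the work happens in `fit`; `predict` only walks the finished tree from the root to a leaf.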

There are many algorithms for creating decision trees. Many of these algorithms are iterative. They start at the root node and decide the best feature to use for the first decision, then go to each resulting node and choose the next best feature, and so on. The process stops when it is decided that nothing more can be gained from extending the tree further.
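The "choose the best feature" step can be sketched as follows. The text does not say how "best" is measured, so this sketch assumes Gini impurity, one common criterion; the function names and the toy data are hypothetical. It groups samples by feature value, which for the 0/1 features used here is the same as a binary split.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(rows, labels, feature):
    """Weighted Gini impurity after grouping the samples by one feature's value."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    n = len(rows)
    return sum(len(group) / n * gini(group) for group in groups.values())


def best_feature(rows, labels):
    """Index of the feature whose split leaves the least impurity."""
    return min(range(len(rows[0])), key=lambda f: split_impurity(rows, labels, f))


# Hypothetical data: each row is (is_raining, very_windy).
rows = [(1, 1), (1, 0), (0, 1), (0, 0)]
labels = ["bad", "bad", "good", "good"]

print(best_feature(rows, labels))  # index of the chosen feature
```

A tree builder would apply `best_feature` at the root, partition the samples by the chosen feature, and repeat on each partition until a node is pure or a stopping rule is met.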

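The scikit-learn package implements the Classification and Regression Trees (CART) algorithm as its default decision tree class. CART can use both categorical and continuous features, although scikit-learn's implementation expects numeric input, so categorical features need to be encoded as numbers first.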
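As a rough sketch of mixing the two kinds of feature, the snippet below one-hot encodes a categorical column before passing it to `DecisionTreeClassifier` alongside a continuous column. The column names (`outlook`, `wind_speed`), the values, and the labels are invented for illustration, and one-hot encoding is only one of several possible ways to encode categories.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data: one categorical feature and one continuous feature.
outlook = np.array([["sunny"], ["rainy"], ["overcast"],
                    ["rainy"], ["sunny"], ["overcast"]])
wind_speed = np.array([[5.0], [30.0], [10.0], [3.0], [40.0], [35.0]])
y = np.array(["good", "bad", "good", "bad", "bad", "bad"])

# Turn each category into its own 0/1 column so the tree receives numeric input.
encoder = OneHotEncoder()
outlook_encoded = encoder.fit_transform(outlook).toarray()

# Stack the encoded categorical columns next to the continuous column.
X = np.hstack([outlook_encoded, wind_speed])

clf = DecisionTreeClassifier(random_state=14)
clf.fit(X, y)

print(X.shape)         # three one-hot columns plus one continuous column
print(clf.predict(X))  # predictions on the training rows
```

The continuous column needs no preparation, because the tree chooses its own thresholds for it during training.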
