
Decision trees

Decision trees are a class of supervised learning algorithms. They work like a flow chart composed of a sequence of nodes, where a sample's feature values are used to decide which node to go to next.

The following example gives a good idea of how a decision tree makes its decisions:

As with most classification algorithms, there are two stages to using them:

  • The first stage is the training stage, where a tree is built using training data. While the nearest neighbor algorithm from the previous chapter has no real training phase, decision trees need one. In this way, the nearest neighbor algorithm is a lazy learner, only doing any work when it needs to make a prediction. In contrast, decision trees, like most classification methods, are eager learners, undertaking work at the training stage and therefore needing to do less in the predicting stage.
  • The second stage is the predicting stage, where the trained tree is used to predict the classification of new samples. Using the previous example tree, a data point of ["is raining", "very windy"] would be classed as bad weather.

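The predicting stage can be sketched in plain Python as a chain of conditionals. This is a minimal, hypothetical tree (the example figure is not reproduced here), so the exact tests and thresholds are assumptions for illustration:

```python
# A hand-built decision tree for a hypothetical weather example.
# Each test routes the sample toward the next node until a leaf
# (a class label) is reached.
def classify_weather(sample):
    rain, wind = sample
    if rain == "is raining":
        return "bad weather"          # rain alone decides the class here
    if wind == "very windy":
        return "bad weather"          # dry but very windy is still bad
    return "good weather"             # otherwise, good weather

print(classify_weather(["is raining", "very windy"]))  # bad weather
```

A trained tree is just such a nested set of tests; prediction only has to walk one path from the root to a leaf, which is why it is cheap.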
There are many algorithms for creating decision trees. Most of them are greedy and iterative: they start at the base node and choose the best feature for the first decision, then go to each resulting node and choose the next best feature, and so on. The process stops at a certain point, when it is decided that nothing more can be gained by extending the tree further.
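The "best feature" step can be sketched with Gini impurity, one common criterion for scoring splits (the helper names and toy data here are made up for illustration):

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def best_feature(samples, labels):
    """Greedily pick the feature whose split gives the lowest
    weighted impurity across the resulting groups."""
    best, best_score = None, float("inf")
    for f in range(len(samples[0])):
        # Partition the samples by their value for feature f
        groups = {}
        for x, y in zip(samples, labels):
            groups.setdefault(x[f], []).append(y)
        score = sum(len(g) / len(labels) * gini(g) for g in groups.values())
        if score < best_score:
            best, best_score = f, score
    return best

# Toy data: feature 0 separates the classes perfectly, feature 1 does not.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = ["a", "a", "b", "b"]
print(best_feature(X, y))  # 0
```

A tree-building algorithm would apply this choice at the root, partition the data by the chosen feature, and then repeat the same choice within each partition until a stopping condition is met.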

The scikit-learn package implements the Classification and Regression Trees (CART) algorithm as its default decision tree class, which can use both categorical and continuous features.
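A short sketch of using scikit-learn's decision tree class, with toy continuous data made up for illustration:

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: two continuous features per sample,
# with the first feature cleanly separating the two classes.
X = [[0.0, 1.2], [0.3, 0.9], [2.1, 0.1], [1.8, 0.4]]
y = [0, 0, 1, 1]

clf = DecisionTreeClassifier()   # CART, using Gini impurity by default
clf.fit(X, y)                    # training stage: build the tree
print(clf.predict([[2.0, 0.2]])) # predicting stage: walk the tree
```

As with other scikit-learn estimators, the two stages map directly onto `fit` and `predict`; parameters such as `criterion` and `max_depth` control how the tree is grown.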
