
Decision trees

Decision trees are a class of supervised learning algorithms. They work like a flow chart composed of a sequence of nodes, where a sample's feature values determine which node to move to next.

The following example gives a very good idea of how decision trees work:

As with most classification algorithms, there are two stages to using them:

  • The first stage is the training stage, where a tree is built using training data. While the nearest neighbor algorithm from the previous chapter did not have a training phase, it is needed for decision trees. In this way, the nearest neighbor algorithm is a lazy learner, only doing any work when it needs to make a prediction. In contrast, decision trees, like most classification methods, are eager learners, undertaking work at the training stage and therefore needing to do less in the predicting stage.
  • The second stage is the predicting stage, where the trained tree is used to predict the classification of new samples. Using the previous example tree, a data point of ["is raining", "very windy"] would be classed as bad weather.
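The two stages above can be sketched with scikit-learn's `DecisionTreeClassifier`. The weather dataset here is hypothetical, with the two features ("is raining" and "very windy") encoded as 0/1 since scikit-learn expects numeric inputs:

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: each sample is [is_raining, is_very_windy],
# encoded as 0/1, with an illustrative "good"/"bad" weather label.
X_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_train = ["good", "good", "bad", "bad"]

# Training stage: the tree is built from the training data.
clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)

# Predicting stage: classify a new sample ("is raining", "very windy").
print(clf.predict([[1, 1]]))  # → ['bad']
```

With this toy data, splitting on the first feature separates the classes perfectly, so the new sample is classed as bad weather, matching the example in the text.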

There are many algorithms for creating decision trees. Many of these algorithms are iterative. They start at the base node and decide the best feature to use for the first decision, then go to each node and choose the next best feature, and so on. This process is stopped at a certain point when it is decided that nothing more can be gained from extending the tree further.
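The "choose the best feature at each node" step can be sketched as follows. This is a simplified, hypothetical helper (not scikit-learn's actual implementation) that scores each binary feature by the weighted Gini impurity of the split it produces, the criterion CART-style builders commonly use:

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_feature(X, y):
    """Greedily pick the binary feature whose split yields the
    lowest weighted Gini impurity (lower is purer)."""
    n = len(y)
    best, best_score = None, float("inf")
    for f in range(len(X[0])):
        left = [y[i] for i in range(n) if X[i][f] == 0]
        right = [y[i] for i in range(n) if X[i][f] == 1]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best, best_score = f, score
    return best

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = ["good", "good", "bad", "bad"]
print(best_feature(X, y))  # → 0 (the first feature splits the classes perfectly)
```

A real tree builder would then recurse into each side of the chosen split, stopping when a node is pure or some other stopping criterion is met.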

The scikit-learn package implements the Classification and Regression Trees (CART) algorithm in its decision tree classes, which can work with both categorical and continuous features.
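A small sketch of CART's handling of a continuous feature: it picks a numeric threshold to split on rather than matching exact values. The data here is made up purely for illustration:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# A single continuous feature; class 0 for small values, class 1 for large.
X = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
y = [0, 0, 0, 1, 1, 1]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

# export_text prints the learned rules: a single threshold split on feature_0.
print(export_text(clf))
```

Samples below the learned threshold are routed to one branch and the rest to the other, which is how CART accommodates continuous features without binning them in advance.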
