
Understanding supervised learning with decision trees

The decision tree algorithm uses a tree-like model of decisions. Its name comes from the graphical representation of the cascading process that partitions the records. The algorithm chooses the input variables that best split the dataset into subsets that are purer in terms of the target variable, ideally subsets that contain only one value of that variable. Decision trees are among the most widely used and easiest to understand classification algorithms.

The outcome of the tree algorithm is a set of simple rules that state which values or intervals of the input variables split the original data best. The fact that the results, and the path followed to reach them, can be clearly shown gives decision trees an advantage over other algorithms. Explainability is a serious problem for some machine learning and artificial intelligence systems, which are mostly used as black boxes, and is a field of study in itself.

In complex problems, we need to decide when to stop developing the tree. A large number of features can lead to a very large and complex tree, so the number of branches and the depth of the tree are usually limited by the user.
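In practice, such limits are exposed as hyperparameters of the implementation. As a rough sketch, this is how a tree could be constrained with scikit-learn (one common library; the specific limits and the tiny dataset used here are only illustrative):

from sklearn.tree import DecisionTreeClassifier

# Keep the tree small and readable by capping its depth and its number of leaves
tree = DecisionTreeClassifier(max_depth=3, max_leaf_nodes=8)

# Tiny illustrative dataset: two binary features, one binary target
X = [[0, 1], [1, 1], [0, 0], [1, 0]]
y = [0, 1, 0, 1]
tree.fit(X, y)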

Entropy is a very important concept in decision trees and is the way of quantifying the purity of each subsample. It measures the amount of uncertainty, or impurity, in each node of the tree: the lower the entropy, the purer the subset. Zero entropy means that a subset contains only one value of the target variable, while, for a binary target, a value of one represents a subset that contains both values in equal amounts. This concept will be explained later with examples.

Entropy is an indicator of how messy your data is.
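As a minimal sketch of this idea (assuming NumPy; the function name and the sample labels are only for illustration), the entropy of a subset of target values can be computed like this:

import numpy as np

def entropy(labels):
    # Shannon entropy, in bits, of a collection of target values
    values, counts = np.unique(labels, return_counts=True)
    probabilities = counts / counts.sum()
    return np.sum(probabilities * np.log2(1.0 / probabilities))

# A pure subset has zero entropy; an even binary split has an entropy of one
print(entropy(["yes", "yes", "yes"]))       # 0.0
print(entropy(["yes", "no", "yes", "no"]))  # 1.0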

Using the entropy calculated at every step, the algorithm chooses the best variable to split the data on and recursively repeats the same procedure. The user can decide when to stop the calculation: when all subsets have an entropy of zero, when there are no more features to split by, or when a minimum entropy level is reached.
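To make the selection step concrete, here is a rough sketch, building on the entropy function above, of picking the feature that yields the highest information gain, that is, the largest reduction in entropy (the data layout as a list of dictionaries and the function names are assumptions made for this illustration):

def information_gain(labels, feature_values):
    # Entropy of the parent minus the weighted entropy of the children after the split
    total = len(labels)
    children_entropy = 0.0
    for value in set(feature_values):
        subset = [l for l, f in zip(labels, feature_values) if f == value]
        children_entropy += len(subset) / total * entropy(subset)
    return entropy(labels) - children_entropy

def best_split(rows, features, target):
    # Choose the feature whose split produces the purest subsets
    labels = [row[target] for row in rows]
    return max(features, key=lambda f: information_gain(labels, [row[f] for row in rows]))

The same calculation is then applied again to each resulting subset until one of the stopping conditions above is met.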

Categorical input features are the best suited for use in a decision tree. A continuous, numerical variable should first be converted into categories by dividing it into ranges; for example, A > 0.5 would become A1 and A ≤ 0.5 would become A2.
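For instance, a minimal sketch of such a conversion (the threshold of 0.5 and the labels A1 and A2 simply follow the example above; the helper name is hypothetical):

def discretize_A(value, threshold=0.5):
    # Map a continuous value of A onto the two categories A1 and A2
    return "A1" if value > threshold else "A2"

print([discretize_A(v) for v in [0.2, 0.7, 0.5]])  # ['A2', 'A1', 'A2']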

Let's look at an example that explains the concept of the decision tree algorithm.
