Understanding supervised learning with decision trees

The decision tree algorithm uses a tree-like model of decisions. Its name comes from the graphical representation of the cascading process that partitions the records. At each step, the algorithm chooses the input variable that best splits the dataset into subsets that are purer in terms of the target variable; ideally, each subset contains only one value of that variable. Decision trees are among the most widely used and easiest to understand classification algorithms.

The outcome of the tree algorithm is a set of simple rules that explain which values or intervals of the input variables split the original data best. Because both the results and the path followed to reach them can be shown clearly, decision trees have an advantage over other algorithms. Explainability is a serious problem for some machine learning and artificial intelligence systems, which are mostly used as black boxes, and it is a field of study in its own right.
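To illustrate how a fitted tree reads as a set of rules, the following minimal sketch walks a small tree and prints one rule per leaf. The nested-dictionary representation, the feature names (outlook, windy), and the function name tree_to_rules are assumptions made for this illustration; they are not taken from any particular library.

```python
def tree_to_rules(tree, conditions=()):
    """Return one 'IF ... THEN ...' rule for every leaf of a nested-dict tree."""
    # A leaf is any value that is not a dict: it holds the predicted class.
    if not isinstance(tree, dict):
        premise = " AND ".join(conditions) or "TRUE"
        return [f"IF {premise} THEN {tree}"]
    rules = []
    # An internal node maps one feature name to its branches (value -> subtree).
    for feature, branches in tree.items():
        for value, subtree in branches.items():
            rules.extend(
                tree_to_rules(subtree, conditions + (f"{feature} = {value}",))
            )
    return rules


# Hypothetical tree, written by hand for illustration only.
example_tree = {
    "outlook": {
        "sunny": "yes",
        "rainy": {"windy": {"yes": "no", "no": "yes"}},
    }
}

for rule in tree_to_rules(example_tree):
    print(rule)
```

Each printed line corresponds to one path from the root to a leaf, which is what makes the model easy to inspect.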

In complex problems, we need to decide when to stop growing the tree. A large number of features can lead to a very large and complex tree, so the number of branches and the depth of the tree are usually limited by the user.

Entropy is a central concept in decision trees: it quantifies the purity of each subsample. It measures the uncertainty about the target variable that remains in each node of the tree; the lower the entropy, the more certain we are about the target value in that subset. For a target with two possible values, zero entropy means that a subset contains only one value of the target variable, while an entropy of one means that it contains both values in equal proportion. This concept will be explained later with examples.

Entropy is an indicator of how messy your data is.
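As a quick check of the two extreme cases, here is a minimal sketch of an entropy calculation. It assumes the usual Shannon entropy in bits (the sum of p * log2(1/p) over the classes present in the subset); the function name entropy and the sample labels are chosen for this illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    # p = n / total for each class, so log2(1 / p) = log2(total / n).
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())


pure = ["yes", "yes", "yes", "yes"]      # only one value of the target
balanced = ["yes", "yes", "no", "no"]    # both values in equal proportion

print(entropy(pure))      # 0.0
print(entropy(balanced))  # 1.0
```

Subsets that fall between these two cases give values between zero and one when the target has two possible values.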

Using the entropy calculated at every step, the algorithm chooses the best variable to split the data and then recursively repeats the same procedure on each subset. The user can decide when to stop the calculation: when all subsets have an entropy of zero, when there are no more features to split by, or when the entropy falls below a minimum level.
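The recursive procedure and its stopping rules can be sketched as follows. This is a simplified illustration, not a production implementation: it assumes categorical features, picks the split with the largest reduction in entropy (commonly called information gain), and adds a max_depth limit as one way of capping tree size. The data, the function names, and the parameters min_entropy and max_depth are all hypothetical. A small entropy helper is included so that the block runs on its own.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())


def information_gain(rows, feature, target):
    """Entropy of the target before the split minus the weighted entropy after it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row[target])
    after = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - after


def build_tree(rows, features, target, min_entropy=0.0, max_depth=3):
    labels = [row[target] for row in rows]
    # Stop when the subset is pure enough, no features remain,
    # or the depth limit has been reached; the leaf is the majority class.
    if entropy(labels) <= min_entropy or not features or max_depth == 0:
        return Counter(labels).most_common(1)[0][0]
    # Choose the feature whose split reduces entropy the most.
    best = max(features, key=lambda f: information_gain(rows, f, target))
    remaining = [f for f in features if f != best]
    branches = defaultdict(list)
    for row in rows:
        branches[row[best]].append(row)
    # Repeat the same procedure on every subset.
    return {
        best: {
            value: build_tree(subset, remaining, target, min_entropy, max_depth - 1)
            for value, subset in branches.items()
        }
    }


# Hypothetical toy dataset: "play" is the target variable.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

print(build_tree(rows, ["outlook", "windy"], "play"))
```

In this toy dataset, splitting on outlook yields two pure subsets, so the recursion stops after a single split; windy gives no reduction in entropy and is never used.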

The input features best suited to a decision tree are categorical ones. A continuous, numerical variable should first be converted into categories by dividing it into ranges; for example, A > 0.5 would become A1 and A ≤ 0.5 would become A2.
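The conversion of a continuous variable into categories can be sketched like this. The first function follows the A1/A2 example above; the second generalizes it to several ranges using the standard library's bisect module. The function names, the sample values, and the edges and labels in the second example are assumptions made for illustration.

```python
from bisect import bisect_left


def to_category(value, threshold=0.5, name="A"):
    """Values above the threshold map to <name>1, all others to <name>2."""
    return f"{name}1" if value > threshold else f"{name}2"


def to_range(value, edges, labels):
    """Map a value to one of len(edges) + 1 labels.

    A value equal to an edge falls into the lower range, which matches
    the 'A <= 0.5' convention used in the two-category case.
    """
    return labels[bisect_left(edges, value)]


values = [0.12, 0.5, 0.73, 0.98]

print([to_category(v) for v in values])
# ['A2', 'A2', 'A1', 'A1']

print([to_range(v, [0.25, 0.75], ["low", "mid", "high"]) for v in values])
# ['low', 'mid', 'mid', 'high']
```

Where the range boundaries are placed is a modelling decision; the thresholds here are arbitrary.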

Let's look at an example that explains the concept of the decision tree algorithm.