Understanding supervised learning with decision trees

The decision tree algorithm uses a tree-like model of decisions. Its name is derived from the graphical representation of the cascading process that partitions the records. At each step, the algorithm chooses the input variable that best splits the dataset into subsets that are purer in terms of the target variable, ideally subsets that contain only one value of this variable. Decision trees are among the most widely used and easiest to understand classification algorithms.

The outcome of the algorithm is a set of simple rules stating which values or intervals of the input variables best split the original data. The fact that both the results and the path followed to reach them can be shown clearly gives decision trees an advantage over other algorithms. Explainability is a serious problem for many machine learning and artificial intelligence systems, which are mostly used as black boxes, and is a subject of study in itself.
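As an illustration of this explainability (a minimal scikit-learn sketch, not this chapter's worked example), a fitted tree can be dumped as plain if/else rules with export_text; the dataset and parameter values here are ours:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Fit a small tree and print it as human-readable splitting rules
iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)
print(export_text(clf, feature_names=list(iris.feature_names)))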

In complex problems, we need to decide when to stop developing the tree. A large number of features can lead to a very large and complex tree, so the number of branches and the depth of the tree are usually limited by the user.
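In scikit-learn, for instance, such limits are exposed as hyperparameters of the classifier; the values below are purely illustrative:

from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier(
    max_depth=4,         # cap the depth of the tree
    max_leaf_nodes=16,   # cap the number of leaves (branches)
    min_samples_leaf=5,  # do not create leaves smaller than this
)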

Entropy is a very important concept in decision trees, as it is the way the purity of each subsample is quantified. It measures the amount of disorder, or uncertainty, in each leaf of the tree: the lower the entropy, the purer the leaf and the more informative it is about the target variable. An entropy of zero means that a subset contains only one value of the target variable, while, for a binary target, a value of one represents a subset that contains both values in equal amounts. This concept will be explained later with examples.

Entropy is an indicator of how messy your data is.
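For a binary target, the entropy of a subset in which a proportion p of the records has one value is H = -p*log2(p) - (1-p)*log2(1-p). As a concrete illustration, here is a minimal NumPy sketch of this calculation; the entropy() helper name is ours:

import numpy as np

def entropy(labels):
    # Shannon entropy (base 2) of a list/array of class labels
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    # max() clips the -0.0 that floating point produces for pure subsets
    return max(0.0, -np.sum(p * np.log2(p)))

print(entropy(["yes"] * 5))               # 0.0: pure subset
print(entropy(["yes"] * 5 + ["no"] * 5))  # 1.0: 50/50 mix of two values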

Using the entropy calculated at every step, the algorithm chooses the best variable to split the data on and repeats the same procedure recursively. The user can decide when the calculation stops: when all subsets have an entropy of zero, when there are no more features to split on, or when the entropy falls below a minimum level.
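To make the selection step concrete, here is a sketch of information gain, the entropy reduction a tree can use to rank candidate splits. It reuses the entropy() helper from the previous sketch, and the sample labels are made up:

def information_gain(parent, subsets):
    # Entropy of the parent node minus the weighted entropy of its subsets
    n = len(parent)
    child_entropy = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(parent) - child_entropy

parent = ["yes"] * 5 + ["no"] * 5
left   = ["yes"] * 4 + ["no"]        # subsets produced by a candidate split
right  = ["yes"] + ["no"] * 4
print(information_gain(parent, [left, right]))  # ~0.28: the split reduces entropy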

The input features best suited for a decision tree are categorical ones. A continuous, numerical variable should first be converted into categories by dividing it into ranges; for example, A > 0.5 would become category A1 and A ≤ 0.5 would become category A2.
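A minimal sketch of that discretization, assuming NumPy and the 0.5 threshold from the example above:

import numpy as np

A = np.array([0.1, 0.7, 0.4, 0.9])      # a continuous feature
A_cat = np.where(A > 0.5, "A1", "A2")   # the two ranges from the text
print(A_cat)                            # ['A2' 'A1' 'A2' 'A1']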

Let's look at an example that explains the concept of the decision tree algorithm.
