
Entropy of the target variable

The entropy of a single attribute is defined as follows:

$$E(f) = -\sum_{i=1}^{c} p_i \log_2 p_i$$

Here, c is the total number of possible values of the feature f, p_i is the probability of each value, and log2(p_i) is the base-two logarithm of that probability. The calculation details are as follows:

  1. We need to count the number of Yes and No decisions in the dataset. In our simple example, they can be counted by hand, but if the dataset is larger, we can use Excel functions:

COUNTIF(F2:F15;"Yes") and COUNTIF(F2:F15;"No")

This gives Yes = 9 and No = 5.

  2. When applying the entropy formula to the target variable, we get the following:

$$E = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} \approx 0.940$$

Here, the probabilities are calculated as the number of Yes (9) or No (5) over the total number (14).

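This calculation can also be performed in the Excel sheet using -I3/(I3+J3)*LOG(I3/(I3+J3);2)-J3/(I3+J3)*LOG(J3/(I3+J3);2) with I3=9 and J3=5, which returns approximately 0.940. Note the leading minus sign: both terms of the entropy sum are negated.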
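For readers who want to cross-check the spreadsheet result outside Excel, the two steps above can be sketched in Python. This is only an illustrative sketch, not part of the Excel workflow: the `entropy` helper and the `decisions` list are hypothetical names, and the list is built to match the counts given in the text (9 Yes, 5 No) rather than taken from the actual worksheet column.

```python
import math
from collections import Counter


def entropy(labels):
    """Base-2 Shannon entropy of a sequence of class labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Sum -p * log2(p) over every distinct label value.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical stand-in for the target column F2:F15 (9 Yes, 5 No).
decisions = ["Yes"] * 9 + ["No"] * 5

# Step 1: count the decisions (the COUNTIF step).
counts = Counter(decisions)
print(counts["Yes"], counts["No"])

# Step 2: apply the entropy formula to the target variable.
print(round(entropy(decisions), 3))
```

The helper works for any number of distinct values, so the same function could be reused for features with more than two categories.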