
Entropy of the target variable

The entropy of a single attribute is defined as follows:

$$E(f) = -\sum_{i=1}^{c} p_i \log_2(p_i)$$

Here, c is the number of possible values of the feature f, p_i is the probability of the i-th value, and log_2(p_i) is the base-2 logarithm of that probability. The calculation proceeds as follows:

  1. We need to count the number of Yes and No decisions in the dataset. In our simple example, they can be counted by hand, but if the dataset is larger, we can use Excel functions:

COUNTIF(F2:F15;"Yes") and COUNTIF(F2:F15;"No")

This gives Yes = 9 and No = 5.

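  2. Applying the entropy formula to the target variable, we get the following:

$$E(\text{target}) = -\frac{9}{14}\log_2\left(\frac{9}{14}\right) - \frac{5}{14}\log_2\left(\frac{5}{14}\right) \approx 0.940$$

Here, the probabilities are calculated as the number of Yes (9) or No (5) decisions divided by the total number of decisions (14).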

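This calculation can also be performed in the Excel sheet using -I3/(I3+J3)*LOG(I3/(I3+J3);2)-J3/(I3+J3)*LOG(J3/(I3+J3);2) with I3=9 and J3=5, which returns approximately 0.940. Note the leading minus sign: both logarithms are negative, so the entropy itself is positive.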
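For readers who want to cross-check the spreadsheet result outside Excel, the same two steps (counting the decisions, then applying the entropy formula) can be sketched in Python. This is a minimal illustration, not part of the original workbook: the `entropy` helper and the `decisions` list are hypothetical names, and the list is simply built to match the counts stated above (9 Yes, 5 No) rather than read from the actual dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Base-2 entropy of a sequence of class labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Sum -p * log2(p) over every distinct label value.
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Assumed stand-in for the target column (F2:F15 in the text): 9 Yes, 5 No.
decisions = ["Yes"] * 9 + ["No"] * 5

# Step 1: count the decisions (the COUNTIF step).
counts = Counter(decisions)
print(counts["Yes"], counts["No"])  # prints: 9 5

# Step 2: apply the entropy formula to the target variable.
target_entropy = entropy(decisions)
print(f"{target_entropy:.3f}")  # prints: 0.940
```

Because the helper iterates over whatever label values it finds, it should work unchanged for a target with more than two classes.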