- Hands-On Machine Learning with Microsoft Excel 2019
- Julio Cesar Rodriguez Martino
Entropy of the target variable
The definition of entropy when looking at a single attribute is as follows:

$$E(f) = -\sum_{i=1}^{c} p_i \log_2(p_i)$$

Here, c is the total number of possible values of the feature f, p_i is the probability of each value, and log_2(p_i) is the base-two logarithm of that probability. The calculation details are as follows:
- We need to count the number of Yes and No decisions in the dataset. In our simple example, they can be counted by hand, but if the dataset is larger, we can use Excel functions:
COUNTIF(F2:F15;"Yes") and COUNTIF(F2:F15;"No")
This gives Yes = 9 and No = 5.
- When applying the entropy formula to the target variable, we get the following:

$$E(S) = -\frac{9}{14}\log_2\!\left(\frac{9}{14}\right) - \frac{5}{14}\log_2\!\left(\frac{5}{14}\right) \approx 0.940$$

Here, the probabilities are calculated as the number of Yes (9) or No (5) over the total number (14).
This calculation can also be performed directly in the Excel sheet using =-I3/(I3+J3)*LOG(I3/(I3+J3);2)-J3/(I3+J3)*LOG(J3/(I3+J3);2), with I3=9 and J3=5. Note the leading minus sign, which the entropy formula requires.
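If you want to double-check the Excel result outside the spreadsheet, the following is a minimal Python sketch. It assumes a target column with 9 "Yes" and 5 "No" decisions, matching the counts above; the `entropy` helper and the `target` list are illustrative names, not part of the book's workbook.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical target column: 9 "Yes" and 5 "No" decisions, as counted with COUNTIF above.
target = ["Yes"] * 9 + ["No"] * 5
print(round(entropy(target), 3))  # prints 0.94, matching the value computed in Excel
```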