- Python Machine Learning By Example
- Yuxi (Hayden) Liu
- 143 words
- 2021-07-02 22:57:19
Scaling
Values of different features can differ by orders of magnitude. Depending on the algorithm we are using, the features with larger values may then dominate those with smaller values. Certain algorithms require scaled data to work properly. There are several common strategies that we can apply:
- Standardization removes the mean of a feature and divides by the standard deviation, so the scaled feature has a mean of zero and a variance of one. If the feature values are normally distributed, the result is a standard Gaussian.
- If the feature values are not normally distributed, we can remove the median and divide by the interquartile range. The interquartile range is the range between the first and third quartiles (the 25th and 75th percentiles).
- Scaling features to a range maps the values onto a fixed interval. A common choice is the range between zero and one.
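The three strategies above can be sketched in a few lines of plain Python. This is a minimal illustration using only the standard library, not the book's own code: the function names and the `incomes` sample are hypothetical, the population standard deviation is assumed for standardization, and quartiles are computed with the "inclusive" method of `statistics.quantiles`. The sketch does not guard against a constant feature, which would cause a division by zero.

```python
import statistics


def standardize(values):
    """Subtract the mean, then divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def robust_scale(values):
    """Subtract the median, then divide by the interquartile range (Q3 - Q1)."""
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


def min_max_scale(values):
    """Map the values linearly onto the range [0, 1]."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


# Hypothetical feature whose raw values are in the tens of thousands.
incomes = [20_000, 35_000, 50_000, 65_000, 80_000]

z_scores = standardize(incomes)      # mean 0, variance 1
robust = robust_scale(incomes)       # median 0, IQR 1
unit_range = min_max_scale(incomes)  # smallest value 0, largest value 1

print(z_scores)
print(robust)
print(unit_range)
```

In practice, scikit-learn's `StandardScaler`, `RobustScaler`, and `MinMaxScaler` in `sklearn.preprocessing` cover the same three strategies and remember the fitted statistics, so the scaling learned on the training data can be applied unchanged to new data.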