- Python Machine Learning By Example
- Yuxi (Hayden) Liu
- 136字
- 2021-07-02 12:41:35
Scaling
Values of different features can differ by orders of magnitude. Sometimes, this may mean that the larger values dominate the smaller values. This depends on the algorithm we're using. For certain algorithms to work properly, we're required to scale the data.
There are following several common strategies that we can apply:
- Standardization removes the mean of a feature and divides by the standard deviation. If the feature values are normally distributed, we'll get a Gaussian, which is centered around zero with a variance of one.
- If the feature values aren't normally distributed, we can remove the median and divide by the interquartile range. The interquartile range is a range between the first and third quartile (or 25th and 75th percentile).
- Scaling features to a range is a common choice of range between zero and one.
推薦閱讀
- 課課通計(jì)算機(jī)原理
- Mastering Mesos
- 基于C語(yǔ)言的程序設(shè)計(jì)
- 網(wǎng)上沖浪
- Mobile DevOps
- Arduino &樂(lè)高創(chuàng)意機(jī)器人制作教程
- Learning C for Arduino
- Learn CloudFormation
- Kubernetes for Developers
- Machine Learning with the Elastic Stack
- 嵌入式操作系統(tǒng)原理及應(yīng)用
- HBase Essentials
- 簡(jiǎn)明學(xué)中文版Photoshop
- 電動(dòng)汽車驅(qū)動(dòng)與控制技術(shù)
- Puppet 3 Beginner’s Guide