官术网_书友最值得收藏!

Scaling

Values of different features can differ by orders of magnitude. Sometimes this may mean that the larger values dominate the smaller values. This depends on the algorithm we are using. For certain algorithms to work properly we are required to scale the data. There are several common strategies that we can apply:

  • Standardization removes the mean of a feature and divides by the standard deviation. If the feature values are normally distributed, we will get a Gaussian, which is centered around zero with a variance of one.
  • If the feature values are not normally distributed, we can remove the median and divide by the interquartile range. The interquartile range is a range between the first and third quartile (or 25th and 75th percentile).
  • Scaling features to a range is a common choice of range which is a range between zero and one.
主站蜘蛛池模板: 庆阳市| 沈阳市| 广南县| 措勤县| 淮南市| 永宁县| 蓝田县| 乌鲁木齐市| 亳州市| 阳谷县| 库伦旗| 额尔古纳市| 栖霞市| 许昌县| 泉州市| 东源县| 塘沽区| 新竹市| 施秉县| 尉氏县| 昌吉市| 仙游县| 德化县| 革吉县| 辉县市| 永济市| 玉溪市| 邓州市| 屏东市| 罗山县| 彭州市| 墨江| 玉山县| 台湾省| 宁都县| 钟山县| 绵竹市| 安康市| 启东市| 武鸣县| 乐都县|