How it works...

In this recipe, we have learned about the following concepts:

  • Imputing missing values: We have learned that one way to impute the missing values of a variable is to replace them with the median of that variable. Other options are to replace them with the mean of the variable, or with the mean of the variable's values in the rows that are most similar to the row containing the missing value (this technique is known as K-Nearest Neighbours imputation).
  • Capping the outlier values: We have also learned that one way to cap outliers is to replace values above the 95th percentile with the 95th percentile value. We did this because the variable is later scaled by its maximum value: if that maximum is an outlier, all the remaining values end up clustered around a small value.
  • Scaling the dataset: Finally, we scaled the dataset so that it can be passed to a neural network.
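
The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the recipe's own code: the helper names (`impute_median`, `cap_at_percentile`, `scale_by_max`) and the sample `values` array are hypothetical, and the scaling step assumes non-negative data with a positive maximum.

```python
import numpy as np


def impute_median(x):
    """Replace NaN entries with the median of the observed (non-NaN) values."""
    x = np.asarray(x, dtype=float)
    return np.where(np.isnan(x), np.nanmedian(x), x)


def cap_at_percentile(x, q=95):
    """Replace every value above the q-th percentile with that percentile."""
    x = np.asarray(x, dtype=float)
    return np.minimum(x, np.percentile(x, q))


def scale_by_max(x):
    """Divide by the maximum (assumes non-negative data and a positive max)."""
    x = np.asarray(x, dtype=float)
    return x / x.max()


# Hypothetical column: the values 1..100, with one missing entry
# and the largest entry replaced by an extreme outlier.
values = np.arange(1, 101, dtype=float)
values[49] = np.nan
values[-1] = 10000.0

imputed = impute_median(values)

# Scaling without capping: the outlier becomes 1.0 and everything
# else is squeezed close to 0.
naive = scale_by_max(imputed)

# Capping first keeps the scaled values spread across [0, 1].
prepared = scale_by_max(cap_at_percentile(imputed, q=95))

print("median after naive scaling: ", np.median(naive))
print("median after cap + scaling:", np.median(prepared))
```

Comparing the two medians shows why the capping step comes before scaling: without it, a single outlier determines the scale for every other row.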