官术网_书友最值得收藏!

How it works...

In this recipe, we have learned about the following concepts:

  • Imputing missing values: We have learned that one of the ways to impute the missing values of a variable is by replacing the missing values with the median of the corresponding variable. Other ways to deal with the missing values is by replacing them with the mean value, and also by replacing the missing value with the mean of the variable's value in the rows that are most similar to the row that contains a missing value (this technique is called identifying the K-Nearest Neighbours).
  • Capping the outlier values: We have also learned that one way to cap the outliers is by replacing values that are above the 95th percentile value with the 95th percentile value. The reason we performed this exercise is to ensure that the input variable does not have all the values clustered around a small value (when the variable is scaled by the maximum value, which is an outlier).
  • Scaling dataset: Finally, we scaled the dataset so that it can then be passed to a neural network.
主站蜘蛛池模板: 祁阳县| 平武县| 安陆市| 冷水江市| 巩义市| 大冶市| 洪江市| 江源县| 察雅县| 含山县| 准格尔旗| 南乐县| 宁陵县| 沛县| 安新县| 吉安县| 皋兰县| 科技| 黄龙县| 抚顺县| 宾阳县| 红原县| 深泽县| 嫩江县| 和林格尔县| 平昌县| 孝感市| 资兴市| 慈利县| 临泉县| 灵丘县| 永福县| 南宁市| 安吉县| 茶陵县| 衡山县| 怀仁县| 长沙县| 台山市| 玛沁县| 玉田县|