How it works...

In this recipe, we have learned about the following concepts:

  • Imputing missing values: One way to impute the missing values of a variable is to replace them with the median of that variable. Other options are to replace them with the variable's mean, or with the mean of the variable's values in the rows that are most similar to the row containing the missing value (an approach based on identifying the K-Nearest Neighbours).
  • Capping the outlier values: One way to cap outliers is to replace values that are above the 95th percentile with the 95th percentile value. We performed this exercise because, when a variable is scaled by its maximum value and that maximum is an outlier, almost all of the scaled values end up clustered around a small value.
  • Scaling the dataset: Finally, we scaled the dataset so that it can be passed to a neural network.
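
The three steps above can be sketched in a few lines of pandas. This is a minimal illustration rather than the recipe's own code: the function name `preprocess`, the toy column names, and the choice to scale each column by its (capped) maximum are assumptions made for the example, and it assumes numeric columns with positive maxima.

```python
import numpy as np
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: median-impute, cap at the 95th percentile, scale by max."""
    out = df.copy()

    # 1. Impute: replace each column's missing values with that column's median.
    out = out.fillna(out.median())

    # 2. Cap: replace values above each column's 95th percentile with that percentile.
    upper = out.quantile(0.95)
    out = out.clip(upper=upper, axis=1)

    # 3. Scale: divide each column by its maximum (assumes the maximum is positive).
    return out / out.max()


# Toy data (made up for illustration): one extreme outlier and two missing values.
toy = pd.DataFrame(
    {
        "income": [10.0, 12.0, np.nan, 11.0, 1000.0],
        "age": [25.0, 30.0, 35.0, np.nan, 40.0],
    }
)
scaled = preprocess(toy)
print(scaled)
```

With a tiny sample like this, the 95th percentile is still pulled upwards by the outlier, so the effect of capping is much clearer on a realistically sized dataset. For the nearest-neighbour approach to imputation, scikit-learn's `KNNImputer` is one option.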