
  • Deep Learning By Example
  • Ahmed Menshawy
  • 2021-06-24 18:52:45

Assigning an average value

This approach is also common because it is simple. For a numerical feature, you can replace the missing values with the mean or the median of the observed values. The same idea works for categorical variables: assign the mode (the most frequently occurring value) to the missing values.

The following code assigns the median of the non-missing values of the Fare feature to the missing values:

# replace the missing Fare values with the median of the observed fares
# (.loc avoids chained indexing, which may assign to a temporary copy)
df_titanic_data.loc[df_titanic_data['Fare'].isnull(), 'Fare'] = df_titanic_data['Fare'].median()

Or, you can use the following code to find the value that has the highest occurrence in the Embarked feature and assign it to the missing values:

# replace the missing Embarked values with the most common value (the mode)
# mode() ignores missing values and returns a Series, so take its first entry
df_titanic_data.loc[df_titanic_data['Embarked'].isnull(), 'Embarked'] = df_titanic_data['Embarked'].mode()[0]
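
To try both replacements without the Titanic file, here is a minimal self-contained sketch. The small DataFrame and its values are made up for illustration and are not taken from the dataset; only the column names Fare and Embarked mirror the text. It uses fillna, which is one of several equivalent ways to do the assignment.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Titanic data; these values are invented for illustration.
df = pd.DataFrame({
    "Fare": [7.25, 71.28, np.nan, 8.05, 53.10],
    "Embarked": ["S", "C", "S", None, "S"],
})

# Numerical feature: fill the gaps with the median of the observed values.
fare_median = df["Fare"].median()
df["Fare"] = df["Fare"].fillna(fare_median)

# Categorical feature: fill the gaps with the mode. mode() returns a Series
# (ties are possible), so take its first entry.
embarked_mode = df["Embarked"].mode()[0]
df["Embarked"] = df["Embarked"].fillna(embarked_mode)

print(df)
```

Assigning the result of fillna back to the column, rather than modifying a slice in place, sidesteps the chained-assignment pitfalls in pandas.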