- Machine Learning for Developers
- Rodolfo Bonnin
- 2021-07-02 15:46:52
Imputation of missing data
When dealing with imperfect or incomplete datasets, a missing value adds nothing to the model by itself, but all the other elements of the same row could still be useful. This is especially true when the dataset has a high percentage of incomplete values, so that no row can be discarded.
The main question in this process is "how do you interpret a missing value?" There are many ways, and they usually depend on the problem itself.
A very naive approach could be to set the value to zero, supposing that the mean of the data distribution is 0. An improved step could be to relate the missing data to the surrounding content, assigning the average of the whole column, or of an interval of n elements of the same column. Another option is to use the column's median or most frequent value.
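The strategies above can be sketched in a few lines of plain Python. This is only an illustration, not the book's own code: the function names `impute` and `impute_local_mean`, the `strategy` labels, and the sample `ages` column are all assumptions made for the example, and missing entries are assumed to be encoded as NaN.

```python
import math
from statistics import mean, median, mode


def impute(column, strategy="mean"):
    """Return a copy of `column` with NaN entries replaced by one fill value.

    `strategy` is a hypothetical label: "zero", "mean", "median",
    or "most_frequent".
    """
    # Only the observed (non-missing) values contribute to the statistic.
    observed = [x for x in column if not math.isnan(x)]
    if strategy == "zero":
        fill = 0.0
    elif strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    elif strategy == "most_frequent":
        fill = mode(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if math.isnan(x) else x for x in column]


def impute_local_mean(column, n=1):
    """Replace each NaN with the mean of the observed values that lie
    within `n` positions on either side of it in the same column."""
    result = list(column)
    for i, x in enumerate(column):
        if math.isnan(x):
            window = column[max(0, i - n): i + n + 1]
            neighbors = [v for v in window if not math.isnan(v)]
            if neighbors:
                result[i] = mean(neighbors)
            else:
                # No observed neighbor nearby: fall back to the column mean.
                result[i] = mean(v for v in column if not math.isnan(v))
    return result


nan = float("nan")
ages = [20.0, nan, 30.0, 30.0, nan, 70.0]  # made-up sample column

filled_zero = impute(ages, "zero")
filled_mean = impute(ages, "mean")
filled_median = impute(ages, "median")
filled_mode = impute(ages, "most_frequent")
filled_local = impute_local_mean(ages, n=1)
```

With this sample column the whole-column mean fills both gaps with 37.5, while the local mean with `n=1` fills them with 25.0 and 50.0, which shows how much the chosen interpretation of a missing value changes the result. A column with no observed values at all would make `mean`, `median`, and `mode` raise an error, so that case needs separate handling.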
Additionally, there are more advanced techniques, such as robust methods and even k-nearest neighbors, that we won't cover in this book.