- Machine Learning for Developers
- Rodolfo Bonnin
- 2021-07-02 15:46:52
Dataset preprocessing
When we first dive into data science, a common mistake is to expect all the data to be polished and well behaved from the very beginning. Alas, that is not so in a considerable percentage of cases, for many reasons: null data, sensor errors that cause outliers and NaN values, faulty registers, instrument-induced bias, and all kinds of defects that lead to poor model fitting and must be eradicated.
The two key processes in this stage are data normalization and feature scaling. Both consist of applying simple transformations, called affine transformations, that map the current unbalanced data into a more manageable shape, maintaining its integrity while providing better stochastic properties and improving the model applied later. The common goal of the standardization techniques is to bring the data distribution closer to a normal distribution, with the following techniques:
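As a minimal sketch of the kind of affine transformation described above, the following shows two common choices: standardization (subtract the mean, divide by the standard deviation) and min-max scaling to a fixed range. The function names `standardize` and `min_max_scale` and the sample values are hypothetical, chosen only for illustration; they are not taken from the book.

```python
from statistics import mean, pstdev


def standardize(values):
    """Affine map x -> (x - mean) / std: result has zero mean, unit variance."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(x - mu) / sigma for x in values]


def min_max_scale(values, low=0.0, high=1.0):
    """Affine map that sends min(values) to `low` and max(values) to `high`."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant feature")
    return [low + (x - lo) * (high - low) / (hi - lo) for x in values]


# Illustrative sample (assumed data, not from the source text).
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardize(raw)        # centered on 0 with unit spread
scaled = min_max_scale(raw)  # squeezed into [0, 1]
```

Both maps only shift and rescale the values, so the ordering and relative spacing of the samples are preserved. Missing or NaN entries would need to be handled before either function is applied.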