- Scala for Machine Learning (Second Edition)
- Patrick R. Nicolas
- 2021-07-08 10:43:06
Chapter 3. Data Preprocessing
Real-world observations are usually noisy and inconsistent, with missing data. No classification, regression, or clustering model can extract reliable information from data that has not been cleansed, filtered, or analyzed.
Data preprocessing consists of cleaning, filtering, transforming, and normalizing raw observations using statistics in order to correlate features or groups of features, identify trends, build models, and filter out noise. The purpose of cleansing raw data is twofold:
- Identify flaws in raw input data
- Provide unsupervised or supervised learning with a clean and reliable dataset
You should not underestimate the power of traditional statistical analysis methods to infer and classify information from textual or unstructured data.
In this chapter, you will learn how to do the following:
- Apply commonly used moving average techniques to detect long-term trends in a time series
- Identify market and sector cycles using the discrete Fourier series
- Leverage the discrete Kalman filter to extract the state of a linear dynamic system from incomplete and noisy observations
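As a first taste of the moving average techniques listed above, here is a minimal sketch of a simple moving average over a fixed window. The function name `simpleMovingAverage`, the parameter `period`, and the sample values are assumptions made for illustration; they are not the implementation developed in this chapter.

```scala
// Simple moving average over a fixed-size window (illustrative sketch).
// `simpleMovingAverage` and `period` are hypothetical names, not the book's API.
def simpleMovingAverage(xs: Vector[Double], period: Int): Vector[Double] = {
  require(period > 0, "period must be positive")
  // Fewer observations than the window size: no average can be computed.
  if (xs.length < period) Vector.empty
  // sliding yields every run of `period` consecutive values;
  // each window is reduced to its arithmetic mean.
  else xs.sliding(period).map(_.sum / period).toVector
}

// Made-up sample values, used only to exercise the function.
val prices = Vector(10.0, 11.0, 12.0, 13.0, 14.0)
val smoothed = simpleMovingAverage(prices, 3)
```

A larger `period` smooths out more of the short-term noise but makes the average react more slowly to a change in the underlying trend.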