
Chapter 3. Data Preprocessing

Real-world observations are usually noisy and inconsistent, with missing data. No classification, regression, or clustering model can extract reliable information from data that has not been cleansed, filtered, or analyzed.

Data preprocessing applies statistical techniques to clean, filter, transform, and normalize raw observations. Its goals are to correlate features or groups of features, identify trends, build models, and filter out noise. The purpose of cleansing raw data is twofold:

  • Identify flaws in raw input data
  • Provide unsupervised or supervised learning with a clean and reliable dataset

You should not underestimate the power of traditional statistical analysis methods to infer and classify information from textual or unstructured data.

In this chapter, you will learn how to do the following:

  • Apply commonly used moving average techniques to detect long-term trends in a time series
  • Identify market and sector cycles using the discrete Fourier series
  • Leverage the discrete Kalman filter to extract the state of a linear dynamic system from incomplete and noisy observations
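
As a first taste of the moving average techniques listed above, here is a minimal sketch of a simple moving average, written in Python for illustration. The function name `simple_moving_average`, the `period` parameter, and the sample values are assumptions made for this example; they are not taken from the chapter's own implementation.

```python
from collections import deque
from typing import Iterable, List


def simple_moving_average(values: Iterable[float], period: int) -> List[float]:
    """Average each run of `period` consecutive observations.

    For n observations the result has n - period + 1 entries, because the
    first period - 1 observations do not fill a complete window.
    """
    if period <= 0:
        raise ValueError("period must be a positive integer")

    window = deque()      # observations currently inside the window
    running_sum = 0.0     # sum of the observations in `window`
    averages = []
    for x in values:
        window.append(x)
        running_sum += x
        if len(window) > period:
            # Drop the oldest observation once the window overflows.
            running_sum -= window.popleft()
        if len(window) == period:
            averages.append(running_sum / period)
    return averages


# Hypothetical noisy series with an upward trend (illustrative values only).
noisy = [1.0, 3.0, 2.0, 4.0, 3.0, 5.0, 4.0, 6.0]
smoothed = simple_moving_average(noisy, period=4)
print(smoothed)
```

Averaging over a window damps the point-to-point oscillation in the sample series and leaves the underlying upward trend visible. A longer period smooths more aggressively, but the result reacts more slowly to genuine changes in the trend.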