Chapter 3. Data Preprocessing

Real-world observations are usually noisy and inconsistent, with missing data. No classification, regression, or clustering model can extract reliable information from data that has not been cleansed, filtered, or analyzed.

Data preprocessing consists of cleaning, filtering, transforming, and normalizing raw observations using statistics in order to correlate features or groups of features, identify trends, build models, and filter out noise. The purpose of cleansing raw data is twofold:

  • Identify flaws in raw input data
  • Provide unsupervised or supervised learning with a clean and reliable dataset

You should not underestimate the power of traditional statistical analysis methods to infer and classify information from textual or unstructured data.

In this chapter, you will learn how to do the following:

  • Apply commonly used moving average techniques to detect long-term trends in a time series
  • Identify market and sector cycles using the discrete Fourier series
  • Leverage the discrete Kalman filter to extract the state of a linear dynamic system from incomplete and noisy observations
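
As a taste of the first technique in the list above, the following is a minimal sketch of a simple moving average, written in Python purely for illustration. The function name `simple_moving_average`, the `window` parameter, and the sample `prices` series are assumptions made for this example and are not taken from the chapter's own implementation. Positions that do not yet have a full window of observations are reported as `None`.

```python
from typing import List, Optional


def simple_moving_average(values: List[float], window: int) -> List[Optional[float]]:
    """Return the mean of the last `window` observations at each position.

    Positions with fewer than `window` observations available yield None.
    """
    if window <= 0:
        raise ValueError("window must be positive")

    result: List[Optional[float]] = []
    running_sum = 0.0
    for i, v in enumerate(values):
        running_sum += v
        # Drop the observation that just fell out of the window.
        if i >= window:
            running_sum -= values[i - window]
        # Emit an average only once a full window has been seen.
        if i >= window - 1:
            result.append(running_sum / window)
        else:
            result.append(None)
    return result


# Hypothetical sample series, used only to exercise the function.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
smoothed = simple_moving_average(prices, 3)
print(smoothed)  # [None, None, 11.0, 12.0, 13.0]
```

A larger window smooths out more short-term noise but reacts more slowly to a change in the underlying trend, which is the trade-off to keep in mind when choosing the window size.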