
  • IBM SPSS Modeler Essentials
  • Jesus Salcedo, Keith McCormick
  • 196 words
  • 2021-07-02 20:04:43

Data Preparation

The Data Preparation phase covers all activities needed to construct the final dataset (the data that will be fed into the modeling tools) from the initial raw data. Data Preparation is often described as the most labor-intensive phase for the data analyst. It is critically important that Data Preparation is done well, and a substantial amount of this book is dedicated to it. We cover cleaning and selecting data in Chapter 5, Cleaning and Selecting Data; integrating data in Chapter 6, Combining Data Files; and constructing data in Chapter 7, Deriving New Fields. However, a book dedicated to the basics of data mining can only start you on your journey when it comes to Data Preparation, since there are so many ways in which you can improve and prepare data. When you are ready for a more advanced treatment of this topic, two resources cover Data Preparation in much more depth, and both include extensive Modeler software examples: The IBM SPSS Modeler Cookbook (Packt Publishing) and Effective Data Preparation (Cambridge University Press).

The five Data Preparation tasks are:

  • Select data
  • Clean data
  • Construct data
  • Integrate data
  • Format data
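
To make the five tasks concrete, here is a minimal sketch in plain Python. It is not how Modeler performs these steps (Modeler builds them visually in a stream) and it is not taken from the book; the records, field names, and the `prepare` function are all hypothetical, chosen only to show one small instance of each task in the order listed above.

```python
# Toy records standing in for raw data. All field names are hypothetical.
customers = [
    {"id": 1, "age": "34", "region": "north", "notes": "call back"},
    {"id": 2, "age": "", "region": "South ", "notes": ""},
    {"id": 3, "age": "51", "region": " South", "notes": "vip"},
]
purchases = [
    {"id": 1, "total_spend": 120.0, "orders": 4},
    {"id": 3, "total_spend": 300.0, "orders": 3},
]


def prepare(customers, purchases):
    """Apply the five Data Preparation tasks to the toy records."""
    # 1. Select data: keep only the fields relevant to modeling.
    selected = [
        {key: rec[key] for key in ("id", "age", "region")} for rec in customers
    ]

    # 2. Clean data: drop records with a missing age, convert age to an
    #    integer, and normalize the region text.
    cleaned = [
        {
            "id": rec["id"],
            "age": int(rec["age"]),
            "region": rec["region"].strip().lower(),
        }
        for rec in selected
        if rec["age"].strip()
    ]

    # 3. Construct data: derive a new field from an existing one.
    constructed = [
        {**rec, "age_band": "under_40" if rec["age"] < 40 else "40_plus"}
        for rec in cleaned
    ]

    # 4. Integrate data: combine with a second source on a shared key,
    #    keeping only records present in both (an inner join).
    by_id = {p["id"]: p for p in purchases}
    integrated = [
        {
            **rec,
            "total_spend": by_id[rec["id"]]["total_spend"],
            "orders": by_id[rec["id"]]["orders"],
        }
        for rec in constructed
        if rec["id"] in by_id
    ]

    # 5. Format data: fix the column order and row order without changing
    #    the meaning of the data.
    header = ["id", "age", "region", "age_band", "total_spend", "orders"]
    rows = [
        [rec[col] for col in header]
        for rec in sorted(integrated, key=lambda r: r["id"])
    ]
    return header, rows


header, rows = prepare(customers, purchases)
for row in rows:
    print(dict(zip(header, row)))
```

In this sketch the record with a missing age is dropped during cleaning. In practice that is only one option among several (imputing a value is another), and choosing between them is part of what makes this phase labor-intensive.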