  • IBM SPSS Modeler Essentials
  • Jesus Salcedo, Keith McCormick
  • 196 words
  • 2021-07-02 20:04:43

Data Preparation

The Data Preparation phase covers all activities needed to construct the final dataset (the data that will be fed into the modeling tool or tools) from the initial raw data. Data Preparation is often described as the most labor-intensive phase for the data analyst. It is critical that Data Preparation is done well, and a substantial amount of this book is dedicated to it. We cover cleaning and selecting data in Chapter 5, Cleaning and Selecting Data; integrating data in Chapter 6, Combining Data Files; and constructing data in Chapter 7, Deriving New Fields. However, a book dedicated to the basics of data mining can only start you on your journey when it comes to Data Preparation, since there are so many ways to improve and prepare data. When you are ready for a more advanced treatment of this topic, two resources go into Data Preparation in much more depth, and both have extensive Modeler software examples: The IBM SPSS Modeler Cookbook (Packt Publishing) and Effective Data Preparation (Cambridge University Press).

The five Data Preparation tasks are:

  • Select data
  • Clean data
  • Construct data
  • Integrate data
  • Format data
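
To make the five tasks concrete outside of Modeler, here is a minimal sketch in plain Python. It is not how Modeler implements these steps (Modeler uses nodes in a stream); the records, field names, and thresholds below are all invented for illustration, and each task is reduced to a single simple operation.

```python
# Hypothetical raw records; every field name and value is invented for illustration.
customers = [
    {"id": 1, "age": "34", "region": "north"},
    {"id": 2, "age": "45", "region": "south"},
    {"id": 3, "age": "", "region": "north"},   # missing age
    {"id": 4, "age": "51", "region": "north"},
]
purchases = [
    {"id": 1, "amount": 120.0},
    {"id": 1, "amount": 30.0},
    {"id": 4, "amount": 75.5},
]

# 1. Select data: keep only the records relevant to the analysis
#    (here, an assumed focus on the north region).
selected = [row for row in customers if row["region"] == "north"]

# 2. Clean data: drop rows with a missing age and convert age to an integer.
cleaned = [{**row, "age": int(row["age"])} for row in selected if row["age"]]

# 3. Construct data: derive a new field from an existing one.
constructed = [
    {**row, "age_band": "50+" if row["age"] >= 50 else "under 50"}
    for row in cleaned
]

# 4. Integrate data: combine with a second source by summing purchases per customer.
totals = {}
for purchase in purchases:
    totals[purchase["id"]] = totals.get(purchase["id"], 0.0) + purchase["amount"]
integrated = [{**row, "total_spend": totals.get(row["id"], 0.0)} for row in constructed]

# 5. Format data: arrange the fields in the order a modeling tool is assumed to expect.
field_order = ["id", "age", "age_band", "total_spend"]
final_dataset = [[row[name] for name in field_order] for row in integrated]

print(final_dataset)
```

In practice the tasks rarely run in such a tidy sequence; for example, integrating a second file often reveals new cleaning problems, so analysts tend to loop back through the earlier steps.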