What this book covers

Chapter 1, Introduction to Data Mining, introduces the notion of data mining and the CRISP-DM process model. You will learn what data mining is, why you would want to use it, and some of the types of questions you could answer with data mining.

Chapter 2, The Basics of Using IBM SPSS Modeler, introduces the Modeler graphical user interface. You will learn where different components of the program are located, how to work with nodes and create streams, and how to use various help options.

Chapter 3, Importing Data into Modeler, introduces the general data structure that is used in Modeler. You will learn how to read and display data, and you will be introduced to the concepts of measurement level and field roles.

Chapter 4, Data Quality and Exploration, focuses on the Data Understanding phase of data mining. We will spend some time exploring our data and assessing its quality. This chapter introduces the Data Audit node, which is used to explore and assess data. You will see this node's options and learn how to look over its results. You will also be introduced to the concept of missing data and will be shown ways to address it.

Chapter 5, Cleaning and Selecting Data, introduces the Data Preparation phase, so we can fix some of the problems that were previously identified during the Data Understanding phase. You will be shown how to select the appropriate cases for analysis, how to sort cases to get a better feel for the data, how to identify and remove duplicate cases, and how to reclassify categorical values to address various types of issues.

Chapter 6, Combining Data Files, continues with the Data Preparation phase of data mining by filtering fields and combining different types of data files.

Chapter 7, Deriving New Fields, introduces the Derive node. The Derive node can perform different types of calculations so that users can extract more information from the data. These additional fields can then provide insights that may not have been apparent. In this chapter, you will learn that the Derive node can create fields as formulas, flags, nominals, or conditionals.

Chapter 8, Looking for Relationships between Fields, focuses on discovering simple relationships between an outcome variable and a predictor variable. You will learn how to use several statistical and graphing nodes to determine which fields are related to each other. Specifically, you will learn to use the Distribution and Matrix nodes to assess the relationship between two categorical variables. You will also learn how to use the Histogram and Means nodes to identify the relationship between categorical and continuous fields. Finally, you will be introduced to the Plot and Statistics nodes to investigate relationships between continuous fields.

Chapter 9, Introduction to Modeling Options in IBM SPSS Modeler, introduces the different types of models available in Modeler and then provides an overview of the predictive models. Readers will also be introduced to the Partition node so that they can create Training and Testing datasets.

Chapter 10, Decision Tree Models, introduces readers to decision tree theory. It then provides an overview of the CHAID model so that readers become familiar with the theory, dialogs, and results of this model.

Chapter 11, Model Assessment and Scoring, covers what to do once a model has been built. This chapter discusses different ways of assessing the results of a model. Readers will also learn how to score new data and how to export these predictions.
