
What this book covers

Chapter 1, Introduction to Data Mining, introduces the notion of data mining and the CRISP-DM process model. You will learn what data mining is, why you would want to use it, and some of the types of questions you could answer with data mining.

Chapter 2, The Basics of Using IBM SPSS Modeler, introduces the Modeler graphical user interface. You will learn where different components of the program are located, how to work with nodes and create streams, and how to use various help options.

Chapter 3, Importing Data into Modeler, introduces the general data structure that is used in Modeler. You will learn how to read and display data, and you will be introduced to the concepts of measurement level and field roles.

Chapter 4, Data Quality and Exploration, focuses on the Data Understanding phase of data mining. We will spend some time exploring our data and assessing its quality. This chapter introduces the Data Audit node, which is used to explore and assess data. You will see this node's options and learn how to look over its results. You will also be introduced to the concept of missing data and will be shown ways to address it.

Chapter 5, Cleaning and Selecting Data, introduces the Data Preparation phase, so we can fix some of the problems that were previously identified during the Data Understanding phase. You will be shown how to select the appropriate cases for analysis, how to sort cases to get a better feel for the data, how to identify and remove duplicate cases, and how to reclassify categorical values to address various types of issues.

Chapter 6, Combining Data Files, continues with the Data Preparation phase of data mining by filtering fields and combining different types of data files.

Chapter 7, Deriving New Fields, introduces the Derive node. The Derive node can perform different types of calculations so that users can extract more information from the data. These additional fields can then provide insights that may not have been apparent. In this chapter, you will learn that the Derive node can create fields as formulas, flags, nominals, or conditionals.

Chapter 8, Looking for Relationships between Fields, focuses on discovering simple relationships between an outcome variable and a predictor variable. You will learn how to use several statistical and graphing nodes to determine which fields are related to each other. Specifically, you will learn to use the Distribution and Matrix nodes to assess the relationship between two categorical variables. You will also learn how to use the Histogram and Means nodes to identify the relationship between categorical and continuous fields. Finally, you will be introduced to the Plot and Statistics nodes to investigate relationships between continuous fields.

Chapter 9, Introduction to Modeling Options in IBM SPSS Modeler, introduces the different types of models available in Modeler and then provides an overview of the predictive models. Readers will also be introduced to the Partition node so that they can create Training and Testing datasets.

Chapter 10, Decision Tree Models, introduces readers to decision tree theory. It then provides an overview of the CHAID model so that readers become familiar with the theory, dialogs, and results of this model.

Chapter 11, Model Assessment and Scoring, discusses different ways of assessing the results once a model has been built. Readers will also learn how to score new data and how to export these predictions.
