Outliers

An outlier is an observation that lies an unusual distance from other observations. Deciding what counts as unusual involves judgment, and it helps to work with a subject-matter expert when making that decision. In exploratory data analysis, two activities are linked:

  • Examining the overall shape of the graphed data for important features
  • Examining the data for unusual observations that are far from the mass or general trend of the data

Outliers are data points that deserve a closer look. The values could be real data values that were accurately recorded, or they could be misrecorded or otherwise flawed. You need to determine which is the case in your situation and decide what action to take.

In this section, we consider statistical and graphical ways of summarizing the distribution of a variable and detecting unusual or extreme values. IBM SPSS Statistics provides many tools for this, found in procedures such as Frequencies, Examine, and Chart Builder. To explore these facilities, we introduce data on used Toyota Corollas and, in particular, look at the distribution of the offer prices, in euros, of cars sold in the Netherlands in 2004.

The Toyota Corolla data featured in this chapter is described in Data Mining for Business Analytics: Concepts, Techniques, and Applications with XLMiner®, Third Edition, by Galit Shmueli, Peter C. Bruce, and Nitin R. Patel. Copyright 2016 John Wiley and Sons.
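
To make the idea of an "unusual distance" concrete, here is a minimal sketch in Python of one common statistical screen, the boxplot rule based on Tukey's hinges. It is not the SPSS implementation. The function names, the 1.5 and 3.0 multipliers, and the price values are all assumptions chosen for illustration, and the prices are invented rather than taken from the Toyota Corolla data.

```python
from statistics import median


def tukey_hinges(values):
    """Return (lower hinge, upper hinge): the medians of the lower and upper halves."""
    s = sorted(values)
    n = len(s)
    half = (n + 1) // 2  # when n is odd, the middle value belongs to both halves
    return median(s[:half]), median(s[n - half:])


def flag_unusual(values, inner=1.5, outer=3.0):
    """Split values into 'outliers' and 'extremes' by distance beyond the hinges.

    The inner and outer multipliers are conventional boxplot choices,
    assumed here for illustration rather than taken from the text.
    """
    lower, upper = tukey_hinges(values)
    spread = upper - lower  # length of the box
    result = {"outliers": [], "extremes": []}
    for v in values:
        # Distance past the nearer hinge; negative when v lies inside the box
        dist = max(lower - v, v - upper)
        if dist > outer * spread:
            result["extremes"].append(v)
        elif dist > inner * spread:
            result["outliers"].append(v)
    return result


# Hypothetical offer prices in euros, invented for illustration only
prices = [8950, 9500, 9900, 10500, 10950, 11250, 11950, 12500, 18950, 32500]
flags = flag_unusual(prices)
print(flags)  # {'outliers': [18950], 'extremes': [32500]}
```

A flagged value is only a candidate for the closer look described above. Whether it is a genuine price or a recording error still has to be settled from knowledge of the data. Different software also defines quartiles in slightly different ways, so the flagged set can shift a little near the thresholds.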