
Using explore to check subgroup patterns

While explore is useful for looking at the distribution of individual fields, it is particularly helpful for investigating patterns across subsets of the data. We'll look at an example of this approach next. Go back to the Explore dialog box; the HIGHEST YEAR OF SCHOOL COMPLETED field should still be in the upper Dependent List box (if not, add it). In the lower Factor List box, add REGION OF INTERVIEW and click on OK.

The descriptives produced by explore now contain a separate set of results for each of the nine regions used to group the states for the purposes of the survey. Values for New England (Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, and Vermont) are shown first (see Figure 12) as this region is coded with the value 1 in the data.

This area of the US is relatively well educated, as can be seen from the mean (14.29) and median (14) values in the table:

By comparison, the West South Central region (Arkansas, Louisiana, Oklahoma, and Texas), which is coded 7 in the data, has a lower mean (12.91) and median (12) years of schooling:
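
The comparison above boils down to computing the same descriptives separately for each value of the grouping field. As a rough illustration outside SPSS, the following Python sketch does this for a handful of made-up records; the region codes follow the coding mentioned in the text (1 and 7), but the schooling values are hypothetical toy numbers, not survey data, and the function name is invented for this example.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical toy records: (region code, years of schooling).
# These values are invented for illustration and are NOT survey data.
records = [
    (1, 12), (1, 14), (1, 16), (1, 16),
    (7, 8), (7, 12), (7, 12), (7, 16),
]


def describe_by_group(rows):
    """Return the count, mean, and median of schooling for each region code."""
    groups = defaultdict(list)
    for code, years in rows:
        groups[code].append(years)
    return {
        code: {"n": len(vals), "mean": mean(vals), "median": median(vals)}
        for code, vals in sorted(groups.items())
    }


summary = describe_by_group(records)
for code, stats in summary.items():
    print(code, stats)
```

Each entry in `summary` plays the role of one block of the grouped descriptives table: one set of statistics per region code, listed in code order.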

The stem and leaf plot for the New England region (see the figure below) indicates that there are only two extreme values and a large proportion of individuals with 14 and 16 years of education:


The corresponding plot for the West South Central region, shown in the following figure, has 19 extreme values at the lower end (8 or fewer years of schooling) and another 19 extreme values at the higher end (18 or more years). It is also evident that in this area of the US, people very often finish their education after 12 years, when they complete high school:

The boxplot (following figure) included in the explore output provides an excellent visual depiction of the pattern across the groups and highlights potential areas to address in terms of the distribution of education. At a glance, one can see that five of the regions (New England, Middle Atlantic, South Atlantic, Mountain, and Pacific) have a similar pattern in terms of the median (14), the size of the box, and the small number of extreme values. By contrast, the West North Central and West South Central regions have a lower median value (12), a smaller box indicating a concentration of values just above the median, and several extreme values at both the top and bottom. These patterns are important because the variance across the groups involved in an analysis is often assumed to be consistent and, when that is not the case, it can cause problems for the analysis. The boxplot is a convenient means of comparing the variability of the subgroups in the data visually on a single page:

The vertical axis was modified to add more values. Chapter 5, Visually Exploring the Data, will discuss how to modify the charts produced by SPSS.
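
To make the boxplot comparison more concrete, here is a minimal Python sketch of the kind of arithmetic behind a box and its flagged points. It assumes the common convention of flagging values more than 1.5 times the interquartile range beyond the quartiles; the quartile method and the exact cutoffs SPSS applies may differ, so the numbers should not be expected to match SPSS output. The two groups are hypothetical toy values chosen to show a wide box with no flagged points and a narrow box with flagged points at both ends.

```python
from statistics import quantiles


def box_summary(values, k=1.5):
    """Summarize one group the way a boxplot does, under a 1.5 * IQR rule.

    Assumption: quartiles use Python's "inclusive" method; SPSS may compute
    them differently, so results are illustrative only.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    flagged = sorted(v for v in values if v < low or v > high)
    return {"median": q2, "iqr": iqr, "flagged": flagged}


# Hypothetical toy groups, invented for illustration (not survey data).
groups = {
    "wide_box": [10, 12, 12, 14, 14, 16, 16, 18],
    "narrow_box": [4, 12, 12, 12, 12, 12, 13, 13, 20],
}

results = {name: box_summary(vals) for name, vals in groups.items()}
for name, res in results.items():
    print(name, res)
```

A narrower box shrinks the range of values treated as typical, so the same distance from the median is more likely to be flagged. This is one reason why groups with values concentrated near the median can show more extreme values at both ends.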