
Descriptive statistics for numeric fields

The Descriptives procedure in SPSS Statistics provides you with an easy way to get a comprehensive picture of all the numeric fields in a dataset. As was noted in Chapter 2, Accessing and Organizing Data, the way in which a field is coded determines how it can be used in SPSS Statistics. Fields coded with characters (string fields) are not available in the Descriptives dialog because it produces only numeric summary statistics. Text fields in your data will need to be examined using a different approach, which will be covered in the next section of this chapter.

To obtain a table with all the numeric fields from your data along with some basic information such as the count, mean, and standard deviation, select Descriptive Statistics under the Analyze menu and click on the second choice, Descriptives. Highlight the first field, which in this dataset is Age, scroll down to the last field listed on the left, VOTE OBAMA OR ROMNEY [PRES12], and use Shift-Click to select all fields.

Click on the arrow in the middle of the dialog to move the list to the box on the right, as shown in the following image, and then click on OK:

The descriptive statistics for the 28 fields in this dataset are displayed in the following screenshot. One of the first pieces of information to check is the N, which indicates how many of the rows contain a valid code for each field. For the 2016 General Social Survey data, the maximum value of N is 2,867, and it is evident that most of the fields are close to this number, with a few exceptions. Some questions in the survey are dependent on a person's marital status, such as Happiness of Marriage and the items related to spouse's education, so it makes sense that the N for these fields would be lower.

A check of the Marital Status field specifically (using the Frequencies procedure) can be used to confirm the number of married individuals in this dataset. The VOTE OBAMA OR ROMNEY field also has a smaller N value, but this question is only asked of individuals who voted in the 2012 election. Checking the DID R VOTE IN 2012 ELECTION field is a way to confirm that this N is correct.
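The cross-check described above can be sketched outside SPSS in a few lines of Python. This is only an illustration under stated assumptions: the rows below are made-up values, not records from the General Social Survey file, and `None` stands in for a missing answer.

```python
from collections import Counter

# Hypothetical respondents: (marital status, happiness-of-marriage code).
# None marks a question that was not asked or not answered.
rows = [
    ("Married", 1),
    ("Married", 2),
    ("Divorced", None),
    ("Never married", None),
    ("Married", None),   # married, but gave no valid answer
    ("Widowed", None),
]

# Frequency table for marital status (the role Frequencies plays in the text).
marital_counts = Counter(status for status, _ in rows)

# Valid N for the dependent question: rows with a non-missing code.
happiness_n = sum(1 for _, happy in rows if happy is not None)

print(marital_counts["Married"], happiness_n)
```

The valid N for a question that depends on marital status should not exceed the count of married respondents. A larger value would suggest a coding problem worth investigating.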

For some fields, such as age and years of school completed, the minimum, maximum, and mean values provide useful information as they can be interpreted directly. In this survey, only individuals in the 18 to 89 age range were included and the mean age of the group was 49.
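To make the table columns concrete, here is a minimal Python sketch of the same summary for one numeric field. The ages are invented for illustration, `describe` is a hypothetical helper rather than anything SPSS provides, and the standard deviation is assumed to be the sample form with an n-1 denominator.

```python
from statistics import mean, stdev

def describe(values):
    """Return N, minimum, maximum, mean, and sample standard deviation,
    ignoring missing entries (represented here as None)."""
    valid = [v for v in values if v is not None]
    return {
        "N": len(valid),
        "Minimum": min(valid),
        "Maximum": max(valid),
        "Mean": mean(valid),
        "Std. Deviation": stdev(valid),  # n-1 denominator (assumption)
    }

# Hypothetical ages, not taken from the survey file; None is a missing answer.
ages = [18, 25, 40, None, 62, 89]
summary = describe(ages)
print(summary)
```

Missing entries are dropped before any statistic is computed, which is why N can be smaller than the number of rows.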

In general, however, the numeric values used for questions such as marital status or region are codes for the categories relevant to the item, so the minimum, maximum, and mean are not particularly useful except to provide a sense of the range of values in the data. At the bottom of the table, there is Valid N (listwise), which indicates how many of the 2,867 individuals surveyed had a valid value for all 28 questions in the table. This number can be very helpful, especially when selecting fields to use in multivariate analysis.

Here, it is useful to note that while the smallest N value for the 28 fields is 1,195, only 422 of those surveyed had a valid value on all the questions. This illustrates how absent information can dramatically reduce the number of rows available for use in analysis. Strategies to deal with missing data will be covered in a later chapter, but descriptive statistics are an important means of identifying the magnitude of the challenge before embarking on a more detailed investigation of the data.
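The gap between the per-field N and Valid N (listwise) can be reproduced with a small Python sketch. The field names and values are hypothetical and chosen only to show the effect. A row counts toward the listwise N only when every selected field has a valid value.

```python
# Hypothetical rows with illustrative field names; None marks a missing value.
rows = [
    {"age": 34,   "educ": 16,   "hapmar": 1},
    {"age": 51,   "educ": None, "hapmar": 2},
    {"age": 27,   "educ": 12,   "hapmar": None},
    {"age": None, "educ": 14,   "hapmar": None},
    {"age": 68,   "educ": 12,   "hapmar": 1},
]
fields = ["age", "educ", "hapmar"]

# Per-field N: rows with a valid value for that one field.
per_field_n = {f: sum(r[f] is not None for r in rows) for f in fields}

# Listwise N: rows with a valid value for every field at once.
listwise_n = sum(all(r[f] is not None for f in fields) for r in rows)

print(per_field_n, listwise_n)
```

Because different rows are missing different fields, the listwise N can fall well below the smallest per-field N, which is the pattern seen with 1,195 versus 422 in the survey data.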
