
Categorical data

Earlier, we explained how variables in your data can be either independent or dependent. Another kind of variable is the categorical variable. A categorical variable can take on only one of a limited, and typically fixed, number of possible values, so each individual observation is assigned to a particular category.

Often, the meaning of collected data is unclear. Categorizing the data is one way a data scientist can give it meaning.

For example, suppose a numeric variable is collected and the values found are 4, 10, and 12. On their own, these numbers say little, but their meaning becomes clear once they are categorized. Let's suppose that, based upon an analysis of how the data was collected, we learn that the data describes university students and that each value is the number of players in a particular sport:

  • 4 tennis players
  • 10 soccer players
  • 12 football players

Now, because we grouped the data into categories, the meaning becomes clear.
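As a minimal sketch of the grouping described above, the following Python snippet counts one label per student and recovers the three totals. The raw list of labels is an assumption made for illustration; the text gives only the totals, not the underlying records.

```python
from collections import Counter

# Hypothetical raw observations: one sport label per university student.
# Only the totals (4, 10, 12) come from the text; the list itself is assumed.
observations = ["tennis"] * 4 + ["soccer"] * 10 + ["football"] * 12

# Counting the labels gives the number of students in each category.
counts = Counter(observations)

for sport, n in counts.items():
    print(f"{n} {sport} players")
```

Attaching a category label to each count is what turns the bare numbers into something interpretable.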

Other examples of categorized data include individual pet preferences (grouped by the type of pet) and vehicle ownership (grouped by the style of car owned).

So, categorical data, as the name suggests, is data grouped into some sort of category or multiple categories. Some data scientists refer to categories as sub-populations of data.
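To illustrate the idea of categories as sub-populations, here is a small sketch that splits a set of survey records by a categorical variable. The records, the respondent IDs, and the pet names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical survey records: (respondent id, preferred pet).
records = [
    (1, "dog"),
    (2, "cat"),
    (3, "dog"),
    (4, "fish"),
    (5, "cat"),
    (6, "dog"),
]

# Each distinct value of the categorical variable defines one sub-population.
subpopulations = defaultdict(list)
for respondent_id, pet in records:
    subpopulations[pet].append(respondent_id)

for pet, members in subpopulations.items():
    print(pet, members)
```

Each key of `subpopulations` is one category, and its list holds the individuals that fall into it.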

Categorical data can also be collected as a yes or no answer, giving a variable with exactly two categories. For example, hospital admission data may record whether each patient smokes or does not smoke.
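A yes or no variable of this kind can be sketched as a Boolean field. The admission records and the field names `patient_id` and `smoker` below are assumptions made up for this example, not taken from any real dataset.

```python
# Hypothetical admission records; "smoker" is a yes/no categorical variable.
admissions = [
    {"patient_id": 101, "smoker": True},
    {"patient_id": 102, "smoker": False},
    {"patient_id": 103, "smoker": False},
    {"patient_id": 104, "smoker": True},
    {"patient_id": 105, "smoker": False},
]

# Every patient falls into exactly one of the two categories.
smokers = sum(1 for record in admissions if record["smoker"])
non_smokers = len(admissions) - smokers

print(f"smokers: {smokers}, non-smokers: {non_smokers}")
```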