
Making sense of data

It is crucial to identify the type of data under analysis. In this section, we are going to learn about the different types of data that you can encounter during analysis. Different disciplines store different kinds of data for different purposes. For example, medical researchers store patients' data, universities store students' and teachers' data, and real estate industries store house and building datasets. A dataset contains many observations about a particular object. For instance, a dataset about patients in a hospital can contain many observations. A patient can be described by a patient identifier (ID), name, address, weight, date of birth, email, and gender. Each of these features that describes a patient is a variable. Each observation can have a specific value for each of these variables. For example, a patient can have the following:

PATIENT_ID = 1001
Name = Yoshmi Mukhiya
Address = Mannsverk 61, 5094, Bergen, Norway
Date of birth = 10th July 2018
Email = yoshmimukhiya@gmail.com
Weight = 10
Gender = Female
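
To make the observation/variable distinction concrete, the record above can be sketched in Python as a mapping from variable names to values. This is only an illustration: the key names and the choice of a plain dictionary are assumptions, not something the text prescribes, and the date of birth is kept as the string shown above rather than parsed.

```python
# One observation (a single patient) as a mapping from variable name to value.
# The key names are illustrative assumptions; the values are copied from the text.
patient = {
    "patient_id": 1001,
    "name": "Yoshmi Mukhiya",
    "address": "Mannsverk 61, 5094, Bergen, Norway",
    "date_of_birth": "10th July 2018",  # left as text, not parsed into a date
    "email": "yoshmimukhiya@gmail.com",
    "weight": 10,
    "gender": "Female",
}

# The variables are the keys; the observation supplies one value for each.
variables = list(patient)
for name in variables:
    print(f"{name} = {patient[name]}")
```

Each key plays the role of a variable, and the dictionary as a whole is one observation.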

Hospitals store such datasets and make them available for analysis. Most of this data is kept in tables within some sort of database management system. An example of a table for storing patient information is shown here:

            
PATIENT_ID  NAME                  ADDRESS                     DOB         EMAIL                    GENDER  WEIGHT
001         Suresh Kumar Mukhiya  Mannsverk, 61               30.12.1989  skmu@hvl.no              Male    68
002         Yoshmi Mukhiya        Mannsverk 61, 5094, Bergen  10.07.2018  yoshmimukhiya@gmail.com  Female  1
003         Anju Mukhiya          Mannsverk 61, 5094, Bergen  10.12.1997  anjumukhiya@gmail.com    Female  24
004         Asha Gaire            Butwal, Nepal               30.11.1990  aasha.gaire@gmail.com    Female  23
005         Ola Nordmann          Danmark, Sweden             12.12.1789  ola@gmail.com            Male    75

 

To summarize the preceding table, there are five observations (001, 002, 003, 004, 005). Each observation is described by seven variables (PatientID, name, address, dob, email, gender, and weight). Most data broadly falls into two groups: numerical data and categorical data.
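
The counts above can be checked with a short Python sketch that holds the table as a list of records. The row values are copied from the table; storing `PATIENT_ID` as a string (to keep its leading zeros) and the helper `numeric_variables` are assumptions made for illustration. The type check is only a crude first pass: it flags variables stored as numbers and leaves everything else for closer inspection, rather than deciding what is truly categorical.

```python
# Column names (variables) and rows (observations) copied from the patient table.
COLUMNS = ("PATIENT_ID", "NAME", "ADDRESS", "DOB", "EMAIL", "GENDER", "WEIGHT")
ROWS = [
    ("001", "Suresh Kumar Mukhiya", "Mannsverk, 61", "30.12.1989", "skmu@hvl.no", "Male", 68),
    ("002", "Yoshmi Mukhiya", "Mannsverk 61, 5094, Bergen", "10.07.2018", "yoshmimukhiya@gmail.com", "Female", 1),
    ("003", "Anju Mukhiya", "Mannsverk 61, 5094, Bergen", "10.12.1997", "anjumukhiya@gmail.com", "Female", 24),
    ("004", "Asha Gaire", "Butwal, Nepal", "30.11.1990", "aasha.gaire@gmail.com", "Female", 23),
    ("005", "Ola Nordmann", "Danmark, Sweden", "12.12.1789", "ola@gmail.com", "Male", 75),
]

# One dictionary per observation, keyed by variable name.
patients = [dict(zip(COLUMNS, row)) for row in ROWS]


def numeric_variables(records):
    """Hypothetical helper: names of variables whose values are all int or float."""
    return [
        name
        for name in records[0]
        if all(
            isinstance(rec[name], (int, float)) and not isinstance(rec[name], bool)
            for rec in records
        )
    ]


n_observations = len(patients)
n_variables = len(COLUMNS)
numeric = numeric_variables(patients)
non_numeric = [name for name in COLUMNS if name not in numeric]

print("observations:", n_observations)
print("variables:", n_variables)
print("stored as numbers:", numeric)
print("everything else:", non_numeric)
```

Under these assumptions only `WEIGHT` is stored as a number. Variables such as `GENDER` are natural candidates for categorical data, while identifiers, free text, and dates need a closer look before they are assigned to either group.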
