- Hands-On Exploratory Data Analysis with Python
- Suresh Kumar Mukhiya Usman Ahmed
- 2021-06-24 16:44:47
Making sense of data
It is crucial to identify the type of data under analysis. In this section, we are going to learn about the different types of data that you can encounter during analysis. Different disciplines store different kinds of data for different purposes. For example, medical researchers store patients' data, universities store students' and teachers' data, and real estate industries store house and building datasets. A dataset contains many observations about a particular object. For instance, a dataset about patients in a hospital can contain many observations. A patient can be described by a patient identifier (ID), name, address, weight, date of birth, email, and gender. Each of these features that describes a patient is a variable. Each observation can have a specific value for each of these variables. For example, a patient can have the following:
PATIENT_ID = 1001
Name = Yoshmi Mukhiya
Address = Mannsverk 61, 5094, Bergen, Norway
Date of birth = 10th July 2018
Email = yoshmimukhiya@gmail.com
Weight = 10
Gender = Female
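As a minimal sketch, a single observation like this one can be held in a plain Python dictionary, with one key per variable and one value per key. The dictionary name `patient` is chosen here for illustration, and the weight is kept as a bare number because the listing does not state a unit.

```python
# One observation: each key is a variable, each value is what was
# recorded for this particular patient (values copied from the listing).
patient = {
    "PATIENT_ID": 1001,
    "Name": "Yoshmi Mukhiya",
    "Address": "Mannsverk 61, 5094, Bergen, Norway",
    "Date of birth": "10th July 2018",
    "Email": "yoshmimukhiya@gmail.com",
    "Weight": 10,  # unit is not given in the listing
    "Gender": "Female",
}

# The variables are simply the dictionary's keys.
variables = list(patient)

for variable, value in patient.items():
    print(f"{variable} = {value}")
```

A dataset is then a collection of such observations that all share the same variables.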
These datasets are stored in hospitals and are presented for analysis. Most of this data is stored in some sort of database management system, organized into tables that follow a schema. An example of a table for storing patient information is shown here:
| PATIENT_ID | NAME | ADDRESS | DOB | EMAIL | GENDER | WEIGHT |
| --- | --- | --- | --- | --- | --- | --- |
| 001 | Suresh Kumar Mukhiya | Mannsverk, 61 | 30.12.1989 | skmu@hvl.no | Male | 68 |
| 002 | Yoshmi Mukhiya | Mannsverk 61, 5094, Bergen | 10.07.2018 | yoshmimukhiya@gmail.com | Female | 1 |
| 003 | Anju Mukhiya | Mannsverk 61, 5094, Bergen | 10.12.1997 | anjumukhiya@gmail.com | Female | 24 |
| 004 | Asha Gaire | Butwal, Nepal | 30.11.1990 | aasha.gaire@gmail.com | Female | 23 |
| 005 | Ola Nordmann | Danmark, Sweden | 12.12.1789 | ola@gmail.com | Male | 75 |
To summarize the preceding table, there are five observations (001, 002, 003, 004, 005). Each observation is described by seven variables (patient ID, name, address, date of birth, email, gender, and weight). Most datasets broadly fall into two groups: numerical data and categorical data.
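The counts in this summary can be checked with a short sketch that uses only the Python standard library. The names `COLUMNS`, `ROWS`, and `patients` are illustrative, and the type check is a deliberately rough first pass: it only separates columns whose stored values are all numbers from the rest, so deciding which of the remaining columns are genuinely categorical (such as gender) and which are free text or identifiers still takes a human look.

```python
# Column names (variables) and rows (observations) from the patient table.
COLUMNS = ("PATIENT_ID", "NAME", "ADDRESS", "DOB", "EMAIL", "GENDER", "WEIGHT")
ROWS = [
    ("001", "Suresh Kumar Mukhiya", "Mannsverk, 61", "30.12.1989", "skmu@hvl.no", "Male", 68),
    ("002", "Yoshmi Mukhiya", "Mannsverk 61, 5094, Bergen", "10.07.2018", "yoshmimukhiya@gmail.com", "Female", 1),
    ("003", "Anju Mukhiya", "Mannsverk 61, 5094, Bergen", "10.12.1997", "anjumukhiya@gmail.com", "Female", 24),
    ("004", "Asha Gaire", "Butwal, Nepal", "30.11.1990", "aasha.gaire@gmail.com", "Female", 23),
    ("005", "Ola Nordmann", "Danmark, Sweden", "12.12.1789", "ola@gmail.com", "Male", 75),
]

# Each observation becomes a dict mapping variable name to value.
patients = [dict(zip(COLUMNS, row)) for row in ROWS]

n_observations = len(patients)
n_variables = len(COLUMNS)

# Rough split: a column counts as numerical only if every stored value is a number.
# PATIENT_ID is stored as text on purpose, since it is a label, not a quantity.
numerical = [
    column for column in COLUMNS
    if all(isinstance(p[column], (int, float)) for p in patients)
]
non_numerical = [column for column in COLUMNS if column not in numerical]

print(n_observations, "observations,", n_variables, "variables")
print("numerical:", numerical)
print("non-numerical:", non_numerical)
```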