- Hands-On Exploratory Data Analysis with Python
- Suresh Kumar Mukhiya, Usman Ahmed
- 2021-06-24 16:44:47
Making sense of data
It is crucial to identify the type of data under analysis. In this section, we are going to learn about the different types of data that you can encounter during analysis. Different disciplines store different kinds of data for different purposes. For example, medical researchers store patients' data, universities store students' and teachers' data, and real estate industries store house and building datasets. A dataset contains many observations about a particular object. For instance, a dataset about patients in a hospital can contain many observations, one per patient. A patient can be described by a patient identifier (ID), name, address, weight, date of birth, email, and gender. Each of these features that describes a patient is a variable. Each observation can have a specific value for each of these variables. For example, a patient can have the following:
PATIENT_ID = 1001
Name = Yoshmi Mukhiya
Address = Mannsverk 61, 5094, Bergen, Norway
Date of birth = 10th July 2018
Email = yoshmimukhiya@gmail.com
Weight = 10
Gender = Female
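One way to picture a single observation in code is as a mapping from variable names to values. The following is a minimal sketch in Python, assuming a plain dictionary is an acceptable stand-in for a patient record. The key names are illustrative choices, not a schema taken from any real hospital system.

```python
from datetime import date

# One observation: each key is a variable, each value is that variable's
# value for this particular patient. The key names are illustrative
# assumptions, not part of any real schema.
patient = {
    "patient_id": 1001,
    "name": "Yoshmi Mukhiya",
    "address": "Mannsverk 61, 5094, Bergen, Norway",
    "date_of_birth": date(2018, 7, 10),
    "email": "yoshmimukhiya@gmail.com",
    "weight": 10,
    "gender": "Female",
}

# The variables that describe the patient are simply the keys of the record.
variables = list(patient)

# Show each variable with its value and the Python type used to hold it.
for variable, value in patient.items():
    print(f"{variable}: {value!r} ({type(value).__name__})")
```

Storing the date of birth as a `date` object rather than a string is a design choice made here so that later calculations, such as computing age, do not need to parse text first.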
These datasets are stored in hospitals and are presented for analysis. Most of this data is stored in some sort of database management system, organized as tables within a schema. An example of a table for storing patient information is shown here:
| PATIENT_ID | NAME | ADDRESS | DOB | EMAIL | Gender | WEIGHT |
|---|---|---|---|---|---|---|
| 001 | Suresh Kumar Mukhiya | Mannsverk, 61 | 30.12.1989 | skmu@hvl.no | Male | 68 |
| 002 | Yoshmi Mukhiya | Mannsverk 61, 5094, Bergen | 10.07.2018 | yoshmimukhiya@gmail.com | Female | 1 |
| 003 | Anju Mukhiya | Mannsverk 61, 5094, Bergen | 10.12.1997 | anjumukhiya@gmail.com | Female | 24 |
| 004 | Asha Gaire | Butwal, Nepal | 30.11.1990 | aasha.gaire@gmail.com | Female | 23 |
| 005 | Ola Nordmann | Danmark, Sweden | 12.12.1789 | ola@gmail.com | Male | 75 |
To summarize the preceding table, there are five observations (001, 002, 003, 004, 005). Each observation is described by seven variables (PATIENT_ID, NAME, ADDRESS, DOB, EMAIL, Gender, and WEIGHT). Most of this data broadly falls into two groups: numerical data and categorical data.
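The patient table can also be inspected in code. The sketch below uses only the Python standard library and assumes the table is held as a list of dictionaries, with lowercase column names chosen for illustration. It counts the observations and variables, then treats WEIGHT as an example of a numerical variable (a mean is meaningful) and Gender as an example of a categorical variable (only counts per category are meaningful).

```python
from collections import Counter
from statistics import mean

# Column names are illustrative assumptions based on the table headers.
columns = ("patient_id", "name", "address", "dob", "email", "gender", "weight")

# Each tuple is one observation (row). IDs are kept as strings so the
# leading zeros are preserved.
rows = [
    ("001", "Suresh Kumar Mukhiya", "Mannsverk, 61", "30.12.1989",
     "skmu@hvl.no", "Male", 68),
    ("002", "Yoshmi Mukhiya", "Mannsverk 61, 5094, Bergen", "10.07.2018",
     "yoshmimukhiya@gmail.com", "Female", 1),
    ("003", "Anju Mukhiya", "Mannsverk 61, 5094, Bergen", "10.12.1997",
     "anjumukhiya@gmail.com", "Female", 24),
    ("004", "Asha Gaire", "Butwal, Nepal", "30.11.1990",
     "aasha.gaire@gmail.com", "Female", 23),
    ("005", "Ola Nordmann", "Danmark, Sweden", "12.12.1789",
     "ola@gmail.com", "Male", 75),
]

# Pair every value with its column name to get one dict per observation.
observations = [dict(zip(columns, row)) for row in rows]

print(len(observations), "observations,", len(columns), "variables")

# Numerical variable: arithmetic such as a mean makes sense.
mean_weight = mean(obs["weight"] for obs in observations)
print("mean weight:", mean_weight)

# Categorical variable: values are labels, so we count occurrences instead.
gender_counts = Counter(obs["gender"] for obs in observations)
print("gender counts:", dict(gender_counts))
```

In practice a library such as pandas would usually hold this table as a DataFrame. Plain dictionaries are used here only to keep the example dependency-free.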