- Hands-On Exploratory Data Analysis with Python
- Suresh Kumar Mukhiya, Usman Ahmed
- 274 words
- 2021-06-24 16:44:46
Exploratory Data Analysis Fundamentals
The main objective of this introductory chapter is to revise the fundamentals of Exploratory Data Analysis (EDA): what it is, the key concepts of profiling and quality assessment, the main dimensions of EDA, and the main challenges and opportunities in EDA.
Data encompasses a collection of discrete objects, numbers, words, events, facts, measurements, observations, or even descriptions of things. Such data is collected and stored from events and processes in many disciplines, including biology, economics, engineering, marketing, and others. Processing such data yields useful information, and processing that information in turn generates useful knowledge. But an important question is: how can we generate meaningful and useful information from such data? One answer to this question is EDA. EDA is the process of examining the available dataset to discover patterns, spot anomalies, test hypotheses, and check assumptions using statistical measures. In this chapter, we are going to discuss the steps involved in performing thorough exploratory data analysis and get our hands dirty using some open source databases.
As mentioned here and in several studies, the primary aim of EDA is to examine what data can tell us before actually going through formal modeling or hypothesis formulation. John Tukey promoted EDA to statisticians as a way to examine and explore the data and to create new hypotheses that could be used to develop new approaches to data collection and experimentation.
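To make the idea of "discovering patterns and spotting anomalies using statistical measures" concrete, here is a minimal sketch that uses only Python's standard library. The `orders` values are invented purely for illustration, and the helper names `summarize` and `iqr_outliers` are hypothetical, not taken from the book. The outlier rule shown is the common 1.5 × IQR fence, which is only one of several possible choices.

```python
import statistics

# Hypothetical sample: daily order counts, invented for illustration only.
orders = [12, 15, 14, 13, 16, 15, 14, 95, 13, 15]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": median,
        "q1": q1,
        "q3": q3,
        "stdev": statistics.stdev(values),
    }


def iqr_outliers(values, k=1.5):
    """Return values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


summary = summarize(orders)
outliers = iqr_outliers(orders)

print(summary)
print("Possible anomalies:", outliers)
```

In this made-up sample, the mean sits well above the median, which hints that a single large value is distorting the average. That kind of observation is what EDA is meant to surface before any formal modeling begins.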
In this chapter, we are going to learn and revise the following topics:
Understanding data science
The significance of EDA
Making sense of data
Comparing EDA with classical and Bayesian analysis
Software tools available for EDA
Getting started with EDA