
Exploratory Data Analysis Fundamentals

The main objective of this introductory chapter is to revise the fundamentals of Exploratory Data Analysis (EDA): what it is, the key concepts of profiling and quality assessment, the main dimensions of EDA, and its main challenges and opportunities.

Data encompasses a collection of discrete objects, numbers, words, events, facts, measurements, observations, or even descriptions of things. Such data is collected and stored for events and processes across many disciplines, including biology, economics, engineering, and marketing. Processing such data yields useful information, and processing such information generates useful knowledge. But an important question is: how can we generate meaningful and useful information from such data? One answer to this question is EDA. EDA is the process of examining an available dataset to discover patterns, spot anomalies, test hypotheses, and check assumptions using statistical measures. In this chapter, we are going to discuss the steps involved in performing exploratory data analysis and get our hands dirty using some open source databases.
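To make the definition concrete, here is a minimal sketch of two of the activities it names: summarizing a dataset with statistical measures and spotting anomalies. The sample readings, the helper names `summarize` and `find_outliers`, and the choice of the interquartile range rule with a multiplier of 1.5 are all assumptions made for illustration; they are not taken from the chapter, and real EDA would normally use a fuller toolset.

```python
import statistics

# Hypothetical sample: ten temperature readings, one of them suspicious.
readings = [21.5, 22.0, 21.8, 22.4, 23.1, 21.9, 22.2, 35.0, 22.6, 21.7]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    # quantiles(n=4) returns the three cut points: Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": median,
        "q1": q1,
        "q3": q3,
        "stdev": statistics.stdev(values),
    }


def find_outliers(values, k=1.5):
    """Flag values outside the fences Q1 - k*IQR and Q3 + k*IQR.

    The multiplier k=1.5 is a common convention, assumed here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


summary = summarize(readings)
outliers = find_outliers(readings)

print(summary)
print("Possible anomalies:", outliers)
```

In this sample the reading of 35.0 falls outside the upper fence and is flagged. A flagged value is only a candidate for closer inspection, not proof of an error.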

As mentioned above and in several studies, the primary aim of EDA is to examine what data can tell us before going through formal modeling or hypothesis formulation. John Tukey promoted EDA to statisticians as a way to examine and explore data and to create new hypotheses that could be used to develop new approaches to data collection and experimentation.

In this chapter, we are going to learn and revise the following topics:

Understanding data science

The significance of EDA

Making sense of data

Comparing EDA with classical and Bayesian analysis

Software tools available for EDA

Getting started with EDA
