
Exploratory Data Analysis Fundamentals

The main objective of this introductory chapter is to revise the fundamentals of Exploratory Data Analysis (EDA): what it is, the key concepts of profiling and quality assessment, the main dimensions of EDA, and its main challenges and opportunities.

Data encompasses a collection of discrete objects, numbers, words, events, facts, measurements, observations, or even descriptions of things. Such data is generated by events and processes in many disciplines, including biology, economics, engineering, and marketing, and is then collected and stored. Processing such data yields useful information, and processing that information in turn generates useful knowledge. But an important question is: how can we generate meaningful and useful information from such data? One answer to this question is EDA. EDA is the process of examining an available dataset to discover patterns, spot anomalies, test hypotheses, and check assumptions using statistical measures. In this chapter, we are going to discuss the steps involved in performing effective exploratory data analysis and get our hands dirty using some open source databases.
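To make the idea of examining a dataset with statistical measures concrete, here is a minimal sketch that uses only Python's standard library. The sample values and the helper names `summarize` and `iqr_outliers` are hypothetical, introduced purely for illustration; they are not taken from this chapter. The sketch computes a few descriptive statistics and then flags possible anomalies using the interquartile-range rule (often called Tukey's fences), which is one common approach among many.

```python
import statistics

# Hypothetical sample of measurements; 95.0 is planted as an obvious anomaly.
values = [12.1, 11.8, 12.5, 13.0, 12.2, 11.9, 95.0, 12.7, 12.4, 12.0]


def summarize(data):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.stdev(data),  # sample standard deviation
        "min": min(data),
        "max": max(data),
    }


def iqr_outliers(data, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]


summary = summarize(values)
outliers = iqr_outliers(values)

print(summary)
print("Possible anomalies:", outliers)
```

With this sample, only the planted value 95.0 falls outside the fences. Note how far the mean is pulled away from the median by that single value; spotting this kind of discrepancy early is the sort of check EDA is meant to encourage before any formal modeling.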

As mentioned here and in several studies, the primary aim of EDA is to examine what data can tell us before actually going through formal modeling or hypothesis formulation. John Tukey promoted EDA to statisticians as a way to examine and explore data and to create new hypotheses that could be used to develop new approaches to data collection and experimentation.

In this chapter, we are going to learn and revise the following topics:

Understanding data science

The significance of EDA

Making sense of data

Comparing EDA with classical and Bayesian analysis

Software tools available for EDA

Getting started with EDA
