Chapter 2. Exploratory Data Analysis

Exploratory data analysis (EDA) is a core part of data analysis. It is an approach to analyzing a dataset and summarizing its main characteristics. Its main objective is to check various hypotheses in order to gain a better understanding of the dataset.

Exploratory data analysis draws on many statistical techniques, both visual and nonvisual. When your study has to be communicated to peers as well as to audiences without a data science background, it is advisable to rely heavily on visual techniques, which make the findings easier to communicate.

Exploratory data analysis is expected to yield insights into the data, identify the important variables in the dataset (depending on the problem to be solved), identify the outliers in the data, and produce the results of testing various hypotheses. These results play an important role in deciding how to solve the business problem and, if it is a modeling problem, which model to use and how to apply it to the dataset for better accuracy.

In this chapter, you will learn how to perform exploratory data analysis, starting with a general view of the data, followed by analysis of one variable at a time, then analysis of two variables, and finally analysis of multiple variables to better understand their interdependencies.

The topics that will be covered in this chapter are as follows:

  • Titanic dataset
  • Descriptive statistics
  • Inferential statistics
  • Univariate analysis
  • Bivariate analysis
  • Multivariate analysis (scatter plot with segments, heatmap, and tabulation)
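
As a rough preview of the progression described above (one variable at a time, spotting outliers, then two variables together), the following is a minimal sketch that uses only the Python standard library. The sample values are invented for illustration and are not taken from the Titanic dataset, and the helper names `summarize`, `iqr_outliers`, and `pearson` are hypothetical. The 1.5 × IQR cutoff is a common rule of thumb, assumed here rather than prescribed by this chapter.

```python
import math
import statistics

# Hypothetical toy sample (NOT taken from the Titanic dataset): (age, fare) pairs.
records = [
    (22, 7.25), (38, 71.28), (26, 7.93), (35, 53.10), (35, 8.05),
    (54, 51.86), (2, 21.08), (27, 11.13), (14, 30.07), (58, 512.33),
]
ages = [age for age, _ in records]
fares = [fare for _, fare in records]


def summarize(values):
    """Univariate view: a few descriptive statistics for one variable."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


def iqr_outliers(values, k=1.5):
    """Flag values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Bivariate view: Pearson correlation between two variables."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


print(summarize(ages))        # descriptive statistics for one variable
print(iqr_outliers(fares))    # fares flagged as outliers by the IQR rule
print(pearson(ages, fares))   # linear association between age and fare
```

A real analysis would typically load the full dataset into a data-frame library instead of hand-written lists; the standard library is used here only to keep the sketch self-contained.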