
Chapter 2. Exploratory Data Analysis

Exploratory data analysis is a core topic in the field of data analysis. It is an approach to analyzing data that summarizes the main characteristics of a dataset. Its main objective is to check various hypotheses in order to gain a better understanding of the dataset.

Exploratory data analysis draws on many statistical techniques, both visual and nonvisual. When your study has to be communicated to peers as well as to audiences without a data science background, it is advisable to rely heavily on visual techniques, which communicate findings more effectively.

Exploratory data analysis is expected to yield insights from the data, to extract the important variables in the dataset (depending on the problem to be solved), to identify outliers in the data, and to produce the results of various hypothesis tests. These results play an important role in solving the business problem and, if it is a modeling problem, in deciding which model to use and how to apply it to the dataset for better accuracy.

In this chapter, you will learn how to perform exploratory data analysis, starting with a general view of the data, then analyzing one variable at a time, then two variables at a time, and finally analyzing multiple variables to better understand their interdependencies.

The topics that will be covered in this chapter are as follows:

  • Titanic dataset
  • Descriptive statistics
  • Inferential statistics
  • Univariate analysis
  • Bivariate analysis
  • Multivariate analysis (scatter plot with segments, heatmap, and tabulation)
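
The progression described above, from a general view of the data to single-variable, two-variable, and grouped analysis, can be sketched with nothing more than Python's standard library. This is a minimal illustration under stated assumptions, not the chapter's own code: the records are synthetic and invented for the example (they are not taken from the Titanic dataset), and the helper names `column`, `summarize`, `iqr_outliers`, `pearson`, and `mean_by_group` are hypothetical. The 1.5 × IQR rule is one common convention for flagging outliers, chosen here only for illustration.

```python
import statistics

# Synthetic passenger-like records, invented for illustration only.
rows = [
    {"age": 22, "fare": 7.0, "pclass": 3},
    {"age": 38, "fare": 71.0, "pclass": 1},
    {"age": 26, "fare": 8.0, "pclass": 3},
    {"age": 35, "fare": 53.0, "pclass": 1},
    {"age": 54, "fare": 52.0, "pclass": 1},
    {"age": 2, "fare": 21.0, "pclass": 3},
    {"age": 27, "fare": 11.0, "pclass": 3},
    {"age": 14, "fare": 30.0, "pclass": 2},
]


def column(name):
    """Extract one variable from the records as a list."""
    return [row[name] for row in rows]


def summarize(values):
    """General view of one variable: count, mean, median, standard deviation."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def iqr_outliers(values):
    """Univariate check: values lying more than 1.5 * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Bivariate check: Pearson correlation coefficient of two variables."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def mean_by_group(group_name, value_name):
    """Tabulation: mean of one variable within each level of another."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_name], []).append(row[value_name])
    return {key: statistics.mean(vals) for key, vals in sorted(groups.items())}


print(summarize(column("fare")))
print(iqr_outliers(column("fare")))
print(pearson(column("age"), column("fare")))
print(mean_by_group("pclass", "fare"))
```

On a real dataset, a data-frame library would typically replace these helpers, but the order of the steps stays the same: summarize first, then inspect variables one at a time, then in pairs, then in groups.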