
The significance of EDA

Fields as varied as science, economics, engineering, and marketing accumulate and store data primarily in electronic databases. Sound, well-founded decisions should be based on the data collected. It is practically impossible to make sense of datasets containing more than a handful of data points without the help of computer programs. To be confident in the insights the collected data provides, and to make further decisions, we perform data mining, which takes the data through a series of distinct analysis processes. Exploratory data analysis (EDA) is key, and is usually the first exercise in data mining. It allows us to visualize data in order to understand it and to create hypotheses for further analysis. Exploratory analysis centers on creating a synopsis of the data, or insights, for the next steps in a data mining project.

EDA reveals what the data actually contains without imposing prior assumptions on it. This is why data scientists use this process to understand what kinds of models and hypotheses can be created. Key components of exploratory data analysis include summarizing data, statistical analysis, and visualization of data. Python provides mature tools for exploratory analysis: pandas for summarizing; scipy, along with others, for statistical analysis; and matplotlib and plotly for visualizations.
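
As a rough illustration of those three components, here is a minimal sketch that summarizes a small table with pandas, runs a two-sample t-test with scipy, and draws a box plot with matplotlib. The dataset, the column names (group, score), and the choice of a t-test are assumptions made up for this example; they are not taken from any dataset used in the book.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "a", "b", "b", "b", "b"],
        "score": [10.0, 12.0, 11.0, 13.0, 20.0, 22.0, 21.0, 23.0],
    }
)

# 1. Summarizing: per-group count, mean, standard deviation, and quartiles.
summary = df.groupby("group")["score"].describe()

# 2. Statistical analysis: two-sample t-test comparing the group means.
a = df.loc[df["group"] == "a", "score"]
b = df.loc[df["group"] == "b", "score"]
t_stat, p_value = stats.ttest_ind(a, b)

# 3. Visualization: one box per group, rendered to an in-memory PNG.
fig, ax = plt.subplots()
ax.boxplot([a.to_numpy(), b.to_numpy()])
ax.set_xticklabels(["a", "b"])
ax.set_xlabel("group")
ax.set_ylabel("score")
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)

print(summary)
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```

In an interactive session you would more likely call plt.show() than write the figure to a buffer; the buffer simply keeps the sketch runnable on a machine without a display.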

That makes sense, right? Of course it does. That is one of the reasons why you are going through this book. Now that we understand the significance of EDA, let's look at the most generic steps involved in EDA in the next section.
