The significance of EDA

Fields such as science, economics, engineering, and marketing accumulate and store data primarily in electronic databases. Sound, well-founded decisions should be based on the data collected. It is practically impossible to make sense of datasets containing more than a handful of data points without the help of computer programs. To be confident in the insights that the collected data provides, and to make further decisions, we perform data mining, which involves several distinct analysis processes. Exploratory data analysis is key, and is usually the first exercise in data mining. It allows us to visualize data in order to understand it and to create hypotheses for further analysis. Exploratory analysis centers on creating a synopsis of the data, or insights, for the next steps in a data mining project.

EDA reveals the ground truth about the content of the data without making any underlying assumptions. This is why data scientists use this process to understand what types of models and hypotheses can be created. Key components of exploratory data analysis include summarizing data, statistical analysis, and visualization of data. Python provides expert tools for exploratory analysis, with pandas for summarizing; scipy, along with others, for statistical analysis; and matplotlib and plotly for visualizations.
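
To make the three components concrete, here is a minimal sketch that touches each one in turn: pandas for summarizing, scipy for a statistical measure, and matplotlib for a plot. The dataset, its column names (hours_studied, exam_score), and the choice of Pearson correlation are assumptions invented purely for illustration; they do not come from any dataset used in this book.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats

# Hypothetical toy data, invented for this illustration only.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6, 7, 8],
    "exam_score": [52, 55, 61, 64, 70, 74, 79, 85],
})

# 1. Summarizing: count, mean, standard deviation, and quartiles per column.
summary = df.describe()

# 2. Statistical analysis: Pearson correlation between the two columns.
r, p_value = stats.pearsonr(df["hours_studied"], df["exam_score"])

# 3. Visualization: a scatter plot of one column against the other.
fig, ax = plt.subplots()
ax.scatter(df["hours_studied"], df["exam_score"])
ax.set_xlabel("hours_studied")
ax.set_ylabel("exam_score")
ax.set_title("Exam score against hours studied (toy data)")

# Render to an in-memory PNG; in a notebook you would call plt.show() instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)

print(summary)
print(f"Pearson r = {r:.3f}, p-value = {p_value:.3g}")
```

Even on a toy table like this, the summary gives the scale of each column, the correlation hints at a relationship worth modeling, and the plot lets you check that relationship by eye before committing to any hypothesis.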

That makes sense, right? Of course it does. That is one of the reasons why you are going through this book. Now that we understand the significance of EDA, let's look at the most generic steps involved in EDA in the next section.