
The benefits of EDA across vertical markets

Every organization today produces and relies on large amounts of data in its everyday processes. Before making assumptions and decisions based on this data, an organization needs to understand it. EDA enables data analysts and data scientists to bring this information to the right people. It is arguably the most important step on which a data-driven organization should focus its energy and resources.

Having practical tools at hand for carrying out EDA helps data analysts and data scientists produce reproducible, well-informed analysis results. R is one of the most popular data analysis environments, so it makes sense to equip your data analysis teams with powerful R techniques to make the most of their EDA skills.

At the time of writing this book, CRAN lists more than 13,000 available R packages, covering all kinds of tasks and domains. For our purposes, we will concentrate on a particular set of R packages that the R community regards highly for EDA. Some of the packages that we are going to cover may not be directly related to EDA, but they are relevant for other stages of dealing with the data, as indicated by the following diagram:

We will introduce these packages briefly in this chapter and go into more detail as the book progresses. The different stages are as follows:

  • Pre Modeling Stage: This stage involves manipulating the data frame through Data Visualization, Data Transformation, Missing Value Imputation, Outlier Detection, Feature Selection, and Dimension Reduction.
  • Modeling Stage: This is an intermediate stage that involves Continuous Regression, Ordinal Regression, Classification, Clustering, and Time Series with Survival.
  • Post Modeling Stage: This is the final stage, where interpreting the output is the main priority. It includes the implementation of various algorithms such as clustering, classification, and regression.
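To make the three stages more concrete, here is a minimal sketch in base R that walks through one small step from each. It is only an illustration under stated assumptions: the built-in airquality dataset, median imputation, the 1.5 * IQR outlier rule, and a linear model are all choices made for this example, not the packages or methods covered later in the book.

```r
# Assumption: the built-in airquality dataset stands in for real project data
data(airquality)
aq <- airquality

# Pre-modeling: count missing values per column
na_counts <- colSums(is.na(aq))

# Pre-modeling: simple median imputation (illustrative helper, not a package function)
impute_median <- function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
}
aq$Ozone   <- impute_median(aq$Ozone)
aq$Solar.R <- impute_median(aq$Solar.R)

# Pre-modeling: flag outliers in Ozone with the 1.5 * IQR rule
q   <- unname(quantile(aq$Ozone, c(0.25, 0.75)))
iqr <- q[2] - q[1]
is_outlier <- aq$Ozone < q[1] - 1.5 * iqr | aq$Ozone > q[2] + 1.5 * iqr

# Modeling: continuous regression of Ozone on temperature and wind speed
fit <- lm(Ozone ~ Temp + Wind, data = aq)

# Post-modeling: inspect the output to interpret the fit
print(na_counts)
print(sum(is_outlier))
print(coef(fit))
print(summary(fit)$r.squared)
```

In a real project each of these steps would usually be handled by a dedicated package and checked more carefully. Median imputation, for example, can distort relationships between variables when many values are missing.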