- RStudio for R Statistical Computing Cookbook
- Andrea Cirillo
- 2021-07-16 11:04:00
Introduction
Some studies estimate that data preparation activities account for 80 percent of the time invested in data science projects.
I know you will not be surprised to read this number. Data preparation is the phase in data science projects where you take your data from the chaotic world around you and fit it into precise structures and standards.
This is far from a simple task: it involves a great number of techniques that let you change the structure of your data and ensure you can work with it.
This chapter will show you recipes that should give you the ability to prepare the data you acquired in the previous chapter, no matter how it was structured when you brought it into R.
We will look at the two main activities performed during the data preparation phase:
- Data cleansing: This involves identification and treatment of outliers and missing values
- Data manipulation: Here, the main aim is to make the data structure fit some specific rule, which will let the user employ it for analysis
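To make the two activities above concrete, here is a minimal base R sketch. It is only an illustration under stated assumptions, not one of the chapter's recipes: the `readings` data frame, its `site` and `value` columns, and the choice of the 1.5 * IQR rule and median imputation are all hypothetical, and other treatments may suit your data better.

```r
# Hypothetical data: one missing value and one suspicious reading are
# planted on purpose for illustration.
readings <- data.frame(
  site  = c("A", "A", "A", "A", "B", "B", "B", "B"),
  value = c(10, 12, 11, NA, 13, 250, 12, 11)
)

# Data cleansing, part 1: locate missing values.
missing_rows <- which(is.na(readings$value))

# Data cleansing, part 2: flag outliers with the 1.5 * IQR rule
# (an assumed rule of thumb, one of several possible criteria).
q <- unname(quantile(readings$value, c(0.25, 0.75), na.rm = TRUE))
iqr <- q[2] - q[1]
is_outlier <- !is.na(readings$value) &
  (readings$value < q[1] - 1.5 * iqr | readings$value > q[2] + 1.5 * iqr)

# Treatment (one option among many): drop the flagged rows, then
# replace the remaining NA with the median of the observed values.
cleaned <- readings[!is_outlier, ]
cleaned$value[is.na(cleaned$value)] <- median(cleaned$value, na.rm = TRUE)

# Data manipulation: reshape the cleaned data into the structure an
# analysis might expect, here one row per site with its mean value.
mean_by_site <- aggregate(value ~ site, data = cleaned, FUN = mean)
print(mean_by_site)
```

Dropping outliers and imputing with the median are deliberately simple choices here; whether an extreme value is an error or a genuine observation is a judgment that depends on the data at hand.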