- Mastering Machine Learning with R
- Cory Lesmeister
- 2021-07-02 13:46:18
Preparing and Understanding Data
Research consistently shows that machine learning and data science practitioners spend most of their time manipulating data and preparing it for analysis. Indeed, many find it the most tedious and least enjoyable part of their work. Numerous companies offer solutions to the problem but, in my opinion, the results so far are mixed. Therefore, in this first chapter, I shall endeavor to provide a way of tackling the problem that will ease the burden of getting your data ready for machine learning. The methodology introduced in this chapter will serve as the foundation for data preparation and for understanding many of the subsequent chapters. I propose that once you become comfortable with this tried-and-true process, it may very well become your favorite part of machine learning, as it is for me.
The following are the topics that we'll cover in this chapter:
- Overview
- Reading the data
- Handling duplicate observations
- Descriptive statistics
- Exploring categorical variables
- Handling missing values
- Zero and near-zero variance features
- Treating the data
- Correlation and linearity