- Machine Learning with scikit-learn Quick Start Guide
- Kevin Jolly
- 2021-06-24 18:15:55
Dropping features that are redundant
The dataset introduced earlier contains a few columns that add nothing to the machine learning process:
- nameOrig: This column is a unique identifier that belongs to each customer. Since the identifier is different in every row of the dataset, the machine learning algorithm cannot discern any patterns from this feature.
- nameDest: This column is also a unique identifier that belongs to each customer and, as such, provides no value to the machine learning algorithm.
- isFlaggedFraud: This column flags a transaction as fraudulent if a person tries to transfer more than 200,000 in a single transaction. Since the isFraud column already marks fraudulent transactions, this feature is redundant.
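Before dropping columns like these, it can help to confirm that they really behave as described. The following is a minimal sketch on a small, made-up DataFrame (the values are hypothetical and only the column names come from the dataset); it measures how unique the identifier columns are and compares the two fraud flags:

```python
import pandas as pd

# Hypothetical toy data: the values are invented for illustration only.
df = pd.DataFrame({
    "nameOrig": ["C1", "C2", "C3", "C4"],
    "nameDest": ["M1", "M2", "C9", "C8"],
    "amount": [100.0, 250000.0, 50.0, 300000.0],
    "isFraud": [0, 1, 0, 1],
    "isFlaggedFraud": [0, 1, 0, 1],
})

# Fraction of distinct values per identifier column.
# A ratio of 1.0 means every row has its own value, so there is no pattern to learn.
unique_ratio = df[["nameOrig", "nameDest"]].nunique() / len(df)
print(unique_ratio)

# Cross-tabulate the two fraud flags to see how much they overlap.
overlap = pd.crosstab(df["isFraud"], df["isFlaggedFraud"])
print(overlap)
```

On the real dataset the ratios and the overlap table will differ from this toy example, so treat the check as a way to inspect the data rather than as proof that a column is redundant.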
We can drop these features from the dataset by using the following code:
# Drop the redundant features
df = df.drop(['nameOrig', 'nameDest', 'isFlaggedFraud'], axis=1)
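To try the drop in isolation, here is a self-contained sketch that runs on a tiny, made-up DataFrame (hypothetical values; only the column names come from the dataset). It uses the equivalent `columns=` keyword of `DataFrame.drop` and then checks that none of the redundant columns remain:

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataset.
df = pd.DataFrame({
    "nameOrig": ["C1", "C2", "C3"],
    "nameDest": ["M1", "M2", "C9"],
    "amount": [100.0, 250000.0, 50.0],
    "isFraud": [0, 1, 0],
    "isFlaggedFraud": [0, 1, 0],
})

redundant = ["nameOrig", "nameDest", "isFlaggedFraud"]

# drop returns a new DataFrame; columns=... is equivalent to passing axis=1.
df = df.drop(columns=redundant)

# Confirm that the redundant columns are gone and the rest are untouched.
remaining = list(df.columns)
leftover = [col for col in redundant if col in remaining]
print(remaining)
print(leftover)
```

Because `drop` raises a `KeyError` when a listed column is missing, running the same cell twice on the real DataFrame fails the second time; passing `errors="ignore"` is one way to make the step safe to repeat.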