- Machine Learning with scikit-learn Quick Start Guide
- Kevin Jolly
- 2021-06-24 18:15:55
Dropping features that are redundant
In the dataset seen previously, a few columns are redundant for the machine learning process:
- nameOrig: This column is a unique identifier that belongs to each customer. Because the identifier is different on every row of the dataset, the machine learning algorithm cannot discern any patterns from this feature.
- nameDest: This column is also a unique identifier that belongs to each customer and, as such, provides no value to the machine learning algorithm.
- isFlaggedFraud: This column flags a transaction as fraudulent if a person tries to transfer more than 200,000 in a single transaction. Since the isFraud column already flags fraudulent transactions, this feature is redundant.
We can drop these features from the dataset by using the following code:
# Drop the redundant features
df = df.drop(columns=['nameOrig', 'nameDest', 'isFlaggedFraud'])
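Before dropping identifier columns, it can help to confirm that they really are close to unique per row. The following is a minimal sketch of that check on a toy DataFrame. The column names come from the text above, but the row values and the unique_ratio, redundant, and remaining variables are made up for illustration and are not part of the original dataset or code:

```python
import pandas as pd

# Toy stand-in for the transactions data; the values are invented for illustration.
df = pd.DataFrame({
    "amount": [100.0, 250000.0, 80.5, 300000.0],
    "nameOrig": ["C1", "C2", "C3", "C4"],
    "nameDest": ["M1", "M2", "C9", "C8"],
    "isFlaggedFraud": [0, 1, 0, 1],
    "isFraud": [0, 1, 0, 0],
})

# Share of distinct values per column; 1.0 means every row has its own value.
unique_ratio = df.nunique() / len(df)
print(unique_ratio[["nameOrig", "nameDest"]])

# Drop the columns identified as redundant and inspect what is left.
redundant = ["nameOrig", "nameDest", "isFlaggedFraud"]
df = df.drop(columns=redundant)
remaining = list(df.columns)
print(remaining)
```

A ratio close to 1.0 for a text identifier column is a hint that it carries no repeatable pattern. A numeric column such as amount can also have a high ratio and still be informative, so the ratio alone should not decide what gets dropped.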