- Machine Learning with scikit-learn Quick Start Guide
- Kevin Jolly
- 2021-06-24 18:15:55
Preparing a dataset for machine learning with scikit-learn
The first step in implementing any machine learning algorithm with scikit-learn is data preparation. Scikit-learn imposes a set of constraints on implementation, which are discussed later in this section. The dataset that we will be using is based on mobile payments and is hosted on Kaggle, the world's most popular competitive machine learning website.
You can download the dataset from: https://www.kaggle.com/ntnu-testimon/paysim1.
Once downloaded, launch Jupyter Notebook by running the following command in Terminal (macOS/Linux) or Anaconda Prompt/PowerShell (Windows), and then create a new notebook:
jupyter notebook
The goal with this dataset is to predict whether a mobile transaction is fraudulent. To do this, we first need a brief understanding of the contents of the data, which we will explore using the pandas package in Python. You can install pandas by running the following command in Terminal (macOS/Linux) or PowerShell (Windows):
pip3 install pandas
On Windows machines that use Anaconda, pandas can instead be installed from an Anaconda Prompt with the following command:
conda install pandas
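Before moving on, it can help to confirm that the installation succeeded. The following is a minimal sketch of one way to check; the helper name `is_installed` is hypothetical and is not part of pandas or scikit-learn. It relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be found.

    Hypothetical helper for illustration only.
    """
    return importlib.util.find_spec(package_name) is not None


if is_installed("pandas"):
    import pandas as pd

    print("pandas version:", pd.__version__)
else:
    print("pandas was not found; rerun one of the install commands above")
```

If the second message appears, the install command was most likely run in a different Python environment from the one the notebook uses.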
We can now read the dataset into our Jupyter Notebook by using the following code (this assumes that the downloaded CSV file is in the notebook's working directory):
# Package imports
import pandas as pd

# Reading in the dataset
df = pd.read_csv('PS_20174392719_1491204439457_log.csv')

# Viewing the first 5 rows of the dataset
df.head()
This produces an output as illustrated in the following screenshot:
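To complement the tabular preview, a text-only overview of the data can be sketched as follows. This is a minimal example under stated assumptions. It uses a tiny hand-made DataFrame with made-up rows as a stand-in for the downloaded CSV. The column names `type`, `amount`, and `isFraud` are assumptions about the file's layout, and the helper `summarize` is hypothetical.

```python
import pandas as pd

# Stand-in data: made-up rows, NOT taken from the real dataset.
# The column names are an assumption about the layout of the Kaggle CSV.
sample = pd.DataFrame(
    {
        "type": ["PAYMENT", "TRANSFER", "CASH_OUT", "PAYMENT"],
        "amount": [100.0, 2500.0, 2500.0, 40.5],
        "isFraud": [0, 1, 1, 0],
    }
)


def summarize(df: pd.DataFrame, label_col: str) -> dict:
    """Collect basic facts about a DataFrame (hypothetical helper)."""
    return {
        "n_rows": df.shape[0],
        "n_cols": df.shape[1],
        # Total number of missing cells across all columns
        "missing_values": int(df.isnull().sum().sum()),
        # How many rows fall under each value of the label column
        "label_counts": df[label_col].value_counts().to_dict(),
    }


summary = summarize(sample, label_col="isFraud")
print(summary)
print(sample.dtypes)
```

With the real file, the same helper could be called on the `df` read earlier, for example `summarize(df, label_col="isFraud")`, provided that column exists under that name.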