- Ensemble Machine Learning Cookbook
- Dipayan Sarkar Vijayalakshmi Natarajan
- 2021-07-02 13:21:58
Getting ready
In Chapter 1, Get Closer to your Data, we manipulated and prepared the data from the HousePrices.csv file and dealt with the missing values. In this example, we're going to use the final dataset to demonstrate these sampling and resampling techniques.
You can get the prepared dataset from GitHub.
We'll start by importing the required libraries and setting the working directory. We'll then read the data and take a look at the dimensions of our dataset:
# import os for operating system dependent functionalities
import os
# import other required libraries
import pandas as pd
from sklearn.model_selection import train_test_split
# Set your working directory according to your requirement
os.chdir(".../Chapter 3/Resampling Methods")
os.getcwd()
Let's read our data. We'll prefix the DataFrame name with df_ so that it's easy to recognize as a DataFrame:
df_housingdata = pd.read_csv("Final_HousePrices.csv")
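To take a look at the dimensions, we can use the shape attribute of the DataFrame. The following is only a minimal sketch: since Final_HousePrices.csv isn't reproduced here, it reads a small in-memory stand-in instead, and the df_sample name, the column names, and the values are all illustrative assumptions rather than the actual dataset:

```python
import io
import pandas as pd

# Hypothetical stand-in for Final_HousePrices.csv;
# the column names and values below are illustrative only
csv_text = io.StringIO(
    "LotArea,OverallQual,SalePrice\n"
    "8450,7,208500\n"
    "9600,6,181500\n"
    "11250,7,223500\n"
)
df_sample = pd.read_csv(csv_text)

# shape returns a (number of rows, number of columns) tuple
n_rows, n_cols = df_sample.shape
print(n_rows, n_cols)
```

With the actual dataset loaded, the equivalent call would be df_housingdata.shape.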
In the next section, we'll look at how to use train_test_split() from sklearn.model_selection to split our data into random training and testing subsets.