- Hands-On Data Science with Anaconda
- Dr. Yuxing Yan, James Yan
- 2021-06-25 21:08:51
Generating Python datasets
To generate a Python dataset, we use the pandas DataFrame.to_pickle method. The dataset we plan to use is saved as adult.pkl, as shown in the following screenshot:

The related Python code is given here:
import pandas as pd

# Build the URL of the UCI Adult dataset
path="http://archive.ics.uci.edu/ml/machine-learning-databases/"
dataSet="adult/adult.data"
inFile=path+dataSet

# The raw file has no header row
x=pd.read_csv(inFile,header=None)
adult=pd.DataFrame(x,index=None)

# Give the 15 columns descriptive names
adult=adult.rename(columns={0:'age',1:'workclass',
    2:'fnlwgt',3:'education',4:'education-num',
    5:'marital-status',6:'occupation',7:'relationship',
    8:'race',9:'sex',10:'capital-gain',11:'capital-loss',
    12:'hours-per-week',13:'native-country',14:'class'})

# Save the DataFrame as a pickle file
adult.to_pickle("c:/temp/adult.pkl")
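To check that a pickled dataset can be loaded again, one option is to read it back with pd.read_pickle and compare it with the original. The following is a minimal sketch under stated assumptions: the two-row sample DataFrame, the temporary directory, and the file name adult_sample.pkl are hypothetical stand-ins for the full dataset and the c:/temp/adult.pkl path used above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical two-row sample standing in for the full adult dataset.
sample = pd.DataFrame(
    {"age": [39, 50], "workclass": ["State-gov", "Self-emp-not-inc"]}
)

# Write to a temporary directory instead of a hard-coded c:/temp path.
tmp_dir = tempfile.mkdtemp()
pkl_path = os.path.join(tmp_dir, "adult_sample.pkl")
sample.to_pickle(pkl_path)

# Read the pickle back and confirm the round trip preserved the data.
restored = pd.read_pickle(pkl_path)
round_trip_ok = restored.equals(sample)
print(round_trip_ok)
```

Pickle files should only be loaded from sources you trust, because unpickling can execute arbitrary code.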
To show the first several observations, we use the x.head() method, as shown in the following screenshot:

Note that the backup dataset is available at the author's website, downloadable at http://canisius.edu/~yany/data/adult.data.txt.
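When working from a local copy of the data file, the column names can also be supplied directly to pd.read_csv through its names parameter instead of renaming afterwards. The sketch below is an alternative under stated assumptions, not the authors' method: the two rows are illustrative records in the adult.data layout, held in an in-memory string that stands in for a downloaded file, and skipinitialspace=True is added to strip the blank that follows each comma.

```python
import io

import pandas as pd

# Column names taken from the rename() call in the listing above.
COLUMNS = ['age', 'workclass', 'fnlwgt', 'education', 'education-num',
           'marital-status', 'occupation', 'relationship', 'race', 'sex',
           'capital-gain', 'capital-loss', 'hours-per-week',
           'native-country', 'class']

# Two illustrative rows in the adult.data layout; in practice this would
# be a path to a local copy of the file (an assumption for this sketch).
raw = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
    "50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, "
    "Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K\n"
)

# header=None: the file has no header row; names assigns the columns;
# skipinitialspace drops the space after each comma.
adult_sample = pd.read_csv(io.StringIO(raw), header=None,
                           names=COLUMNS, skipinitialspace=True)

# Preview the first observation only.
print(adult_sample.head(1))
```

Calling head() without an argument returns the first five rows; passing an integer, as here, limits the preview to that many rows.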