Inputting data using Python

Similarly, we can use Python to retrieve the data, as shown in the code here:

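import pandas as pd

# Base URL of the UCI Machine Learning Repository and the Iris data file
path = "http://archive.ics.uci.edu/ml/machine-learning-databases/"
dataset = "iris/bezdekIris.data"
inFile = path + dataset

# The file has no header row, so read it with header=None and name the columns
data = pd.read_csv(inFile, header=None)
data.columns = ["sepalLength", "sepalWidth", "petalLength", "petalWidth", "Class"]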

After retrieving the data, we can call print(data.head(2)) to view the first two instances:

> print(data.head(2))
   sepalLength  sepalWidth  petalLength  petalWidth        Class
0          5.1         3.5          1.4         0.2  Iris-setosa
1          4.9         3.0          1.4         0.2  Iris-setosa
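
As a small variation on the code above, the column names can also be supplied while reading, through the names argument of pd.read_csv(). The following is only a sketch: so that it runs without a network connection, it reads the two rows shown above from an in-memory string (the variables sample and columns are made up for this illustration) instead of the UCI link.

```python
import io
import pandas as pd

# Two rows in the same format as the Iris data file (taken from the output above)
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
)
columns = ["sepalLength", "sepalWidth", "petalLength", "petalWidth", "Class"]

# header=None: the first line is data, not a header row
# names=...:   assign the column labels in the same call
data = pd.read_csv(sample, header=None, names=columns)

print(data.head(2))
print(data.dtypes)  # the four measurements are floats, Class is text
```

Replacing sample with inFile should give the same column layout for the full dataset.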

When typing pd.read_csv(, we can find the definitions of all input variables, shown in the following screenshot. Again, to save space, only the first several input variables are shown:
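
The same information can also be retrieved programmatically. The sketch below uses the standard inspect module to list the first several input variables of pd.read_csv() together with their default values; the cutoff of eight is an arbitrary choice for this illustration, and the exact list depends on the installed pandas version.

```python
import inspect
import pandas as pd

# Collect the input variables of pd.read_csv from its signature
params = inspect.signature(pd.read_csv).parameters

# Show only the first several input variables to save space
for name, param in list(params.items())[:8]:
    if param.default is inspect.Parameter.empty:
        default = "(required)"
    else:
        default = repr(param.default)
    print(name, "=", default)
```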

In case the dataset link changes in the future, a backup copy of the dataset is available on the author's website, as shown in the following Python code:

import pandas as pd

# Backup copy of the Iris dataset on the author's website
inFile = "http://canisius.edu/~yany/data/bezdekIris.data.txt"
d = pd.read_csv(inFile, header=None)
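
The primary link and the backup link can be combined so that the backup is used only when the primary one fails. The helper below, read_first_available(), is a hypothetical function written for this illustration and is not part of pandas; it simply tries each source in order and returns the first one that pd.read_csv() can read.

```python
import pandas as pd

def read_first_available(sources, **kwargs):
    """Return a DataFrame read from the first source that works.

    Hypothetical helper for illustration; extra keyword arguments are
    passed on to pd.read_csv().
    """
    last_error = None
    for source in sources:
        try:
            return pd.read_csv(source, **kwargs)
        except (OSError, pd.errors.ParserError, pd.errors.EmptyDataError) as error:
            # Unreachable link, missing file, or unreadable content: try the next one
            last_error = error
    raise OSError("none of the sources could be read") from last_error

# The two links used in this section: primary first, backup second
sources = [
    "http://archive.ics.uci.edu/ml/machine-learning-databases/iris/bezdekIris.data",
    "http://canisius.edu/~yany/data/bezdekIris.data.txt",
]
```

With these definitions, calling read_first_available(sources, header=None) would try the UCI link first and fall back to the author's website only if needed.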

The following table shows several functions included in the pandas package that we could use to retrieve data:

Table 3.4 Functions included in the Python pandas module for inputting data
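
Since the set of input functions depends on the installed version of pandas, one way to see what is available locally is to list the top-level names that start with read_. This is only a quick sketch, and the printed list may be longer or shorter than the table.

```python
import pandas as pd

# Top-level pandas names that start with "read_" and can be called
readers = sorted(
    name for name in dir(pd)
    if name.startswith("read_") and callable(getattr(pd, name))
)

for name in readers:
    print(name)
```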

To find out detailed information on each of the preceding functions, we use the help() function. For example, if we want to get more information about the read_sas() function, we issue the following commands:

import pandas as pd 
help(pd.read_sas) 

The corresponding output, the top part only, is shown here:
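
The top part of the same help text can also be printed directly, without paging through the full output. The sketch below assumes that the first ten lines are enough; that number is an arbitrary choice, and the wording of the text depends on the installed pandas version.

```python
import inspect
import pandas as pd

# inspect.getdoc returns the cleaned-up docstring of the function
doc = inspect.getdoc(pd.read_sas) or ""

# Keep only the first few lines, that is, the top part of the documentation
top_lines = doc.splitlines()[:10]
print("\n".join(top_lines))
```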
