
Inputting data using Python

Similarly, we can use Python to retrieve the data, as shown in the code here:

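import pandas as pd

# Build the URL of the iris data file in the UCI repository
path = "http://archive.ics.uci.edu/ml/machine-learning-databases/"
dataset = "iris/bezdekIris.data"
inFile = path + dataset
# The file has no header row, so the column names are assigned manually
data = pd.read_csv(inFile, header=None)
data.columns = ["sepalLength", "sepalWidth", "petalLength", "petalWidth", "Class"]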

After retrieving the data, we can call print(data.head(2)) to see the first two instances:

> print(data.head(2))
   sepalLength  sepalWidth  petalLength  petalWidth        Class
0          5.1         3.5          1.4         0.2  Iris-setosa
1          4.9         3.0          1.4         0.2  Iris-setosa
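
The column names can also be supplied in the same call through the names argument of read_csv(), which avoids the separate assignment to data.columns. The following is a minimal sketch rather than part of the original example: it reads the two instances shown above from an in-memory string (an assumption made so that the snippet runs without network access), and the names sampleText, columnNames, and sample are illustrative only.

```python
import io
import pandas as pd

# The two instances shown above, embedded as text so no download is needed
sampleText = "5.1,3.5,1.4,0.2,Iris-setosa\n4.9,3.0,1.4,0.2,Iris-setosa\n"
columnNames = ["sepalLength", "sepalWidth", "petalLength", "petalWidth", "Class"]

# names= assigns the column labels while reading;
# header=None states that the data has no header row
sample = pd.read_csv(io.StringIO(sampleText), header=None, names=columnNames)
print(sample.shape)
print(sample.dtypes)
```

Replacing io.StringIO(sampleText) with inFile would apply the same call to the full dataset.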

When typing pd.read_csv(, we can see the definitions of all input variables, as shown in the following screenshot. Again, to save space, only the first several input variables are shown:
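
The same list of input variables can also be printed programmatically. The following is a minimal sketch that assumes the standard-library inspect module can read the signature of pd.read_csv; the parameters and defaults it prints depend on the installed pandas version, and the cutoff of eight parameters is an arbitrary choice.

```python
import inspect
import pandas as pd

# Collect the parameter names of read_csv() in declaration order
signature = inspect.signature(pd.read_csv)
paramNames = list(signature.parameters)

# Print only the first several parameters with their default values
for name in paramNames[:8]:
    default = signature.parameters[name].default
    if default is inspect.Parameter.empty:
        print(name, "(required)")
    else:
        print(name, "=", repr(default))
```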

In case the dataset link changes in the future, a backup copy of the dataset is available on the author's website, as shown in the following Python code:

import pandas as pd
# Backup copy of the iris data on the author's website
inFile = "http://canisius.edu/~yany/data/bezdekIris.data.txt"
d = pd.read_csv(inFile, header=None)
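
The primary and backup locations can be combined so that the backup is used only when the first link fails. The following is a minimal sketch, not the author's code: readFirstAvailable is a hypothetical helper, and it assumes that an unreachable link or a missing file surfaces as an OSError (which covers the URL errors raised by the standard library). Parsing errors are deliberately not caught.

```python
import pandas as pd

def readFirstAvailable(sources, **kwargs):
    """Return a DataFrame from the first source that can be read."""
    lastError = None
    for source in sources:
        try:
            return pd.read_csv(source, **kwargs)
        except OSError as err:
            # Remember the failure and move on to the next location
            lastError = err
    raise OSError("none of the sources could be read") from lastError

# Usage with the two locations given in the text (needs network access):
# sources = [
#     "http://archive.ics.uci.edu/ml/machine-learning-databases/iris/bezdekIris.data",
#     "http://canisius.edu/~yany/data/bezdekIris.data.txt",
# ]
# data = readFirstAvailable(sources, header=None)
```

Listing the sources in order of preference keeps the calling code unchanged if a further mirror is added later.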

The following table shows several functions included in the pandas package that we could use to retrieve data:

Table 3.4 Functions included in the Python pandas module for inputting data
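
The reader functions available in the installed pandas version can also be listed directly. The following is a minimal sketch; it relies only on the naming convention that pandas input functions start with read_, and its output depends on the pandas version, so it is not a reproduction of Table 3.4.

```python
import pandas as pd

# Public names in the pandas namespace that start with "read_"
readers = sorted(name for name in dir(pd) if name.startswith("read_"))
for name in readers:
    print(name)
```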

To find detailed information on any of the preceding functions, we can use the help() function. For example, to get more information about the read_sas() function, we issue the following commands:

import pandas as pd 
help(pd.read_sas) 

The corresponding output, the top part only, is shown here:
