Inputting data using Python

Similarly, we can use Python to retrieve the data, as shown in the code here:

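import pandas as pd

# Build the URL of the iris data file in the UCI repository
path = "http://archive.ics.uci.edu/ml/machine-learning-databases/"
dataset = "iris/bezdekIris.data"
inFile = path + dataset

# The file has no header row, so read it with header=None and name the columns
data = pd.read_csv(inFile, header=None)
data.columns = ["sepalLength", "sepalWidth", "petalLength", "petalWidth", "Class"]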

After retrieving the data, we can call print(data.head(2)) to see the first two instances:

> print(data.head(2))
   sepalLength  sepalWidth  petalLength  petalWidth        Class
0          5.1         3.5          1.4         0.2  Iris-setosa
1          4.9         3.0          1.4         0.2  Iris-setosa
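As a variation on the code above, the column names can be passed to read_csv() through its names argument instead of being assigned afterwards. The sketch below is a minimal illustration; it assumes an in-memory copy of the two rows printed above (the variable sample is hypothetical) so that it runs without a network connection:

```python
import io
import pandas as pd

# Hypothetical in-memory sample: the first two rows shown above, in CSV form
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
)

columns = ["sepalLength", "sepalWidth", "petalLength", "petalWidth", "Class"]

# names= labels the columns while reading; header=None says there is no header row
data = pd.read_csv(sample, header=None, names=columns)
print(data.head(2))
```

Replacing sample with the inFile URL should give the same column labels for the full dataset.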

When we type pd.read_csv(, we can see the definitions of all its input variables, as shown in the following screenshot. Again, to save space, only the first several input variables are shown:
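The same list of input variables can also be retrieved programmatically. The following is a minimal sketch that uses the standard inspect module (an addition, not part of the original text) to print the first several parameter names of pd.read_csv; the exact list depends on the installed pandas version:

```python
import inspect
import pandas as pd

# Collect the parameter names of read_csv from its signature
params = list(inspect.signature(pd.read_csv).parameters)

# Show only the first several names to save space
first_params = params[:8]
print(first_params)
```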

In case the dataset link changes in the future, a backup copy of the dataset is available at the author's website, as shown in the following Python code:

import pandas as pd
inFile = "http://canisius.edu/~yany/data/bezdekIris.data.txt"
d = pd.read_csv(inFile, header=None)
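The two links can be combined so that the backup is used only when the primary one fails. This is a minimal sketch, not the author's code: the helper read_first_available is a hypothetical name, and it assumes that an unreachable source raises OSError (which covers missing files and urllib connection errors):

```python
import pandas as pd

def read_first_available(sources, **kwargs):
    """Return a DataFrame from the first source that pandas can open."""
    last_error = None
    for source in sources:
        try:
            return pd.read_csv(source, **kwargs)
        except OSError as err:  # e.g. missing file, HTTP or connection error
            last_error = err
    raise OSError("none of the sources could be read") from last_error

# The two locations mentioned in the text, in order of preference
primary = "http://archive.ics.uci.edu/ml/machine-learning-databases/iris/bezdekIris.data"
backup = "http://canisius.edu/~yany/data/bezdekIris.data.txt"
```

Calling read_first_available([primary, backup], header=None) would then try the UCI link first and the author's backup second.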

The following table shows several functions included in the pandas package that we could use to retrieve data:

Table 3.4 Functions included in the Python pandas module for inputting data
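The input functions available in a given installation can also be listed directly. The sketch below assumes only that pandas names its input functions with a read_ prefix; it prints every such function exposed at the top level of the package, and the result varies with the pandas version:

```python
import pandas as pd

# Top-level pandas functions whose names start with "read_"
readers = sorted(name for name in dir(pd) if name.startswith("read_"))

for name in readers:
    print(name)
```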

To find out detailed information on each of the preceding functions, we use the help() function. For example, if we want to get more information about the read_sas() function, we issue the following commands:

import pandas as pd 
help(pd.read_sas) 

The corresponding output, the top part only, is shown here:
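When only the top part of the help text is needed, it can be extracted in code. This is a minimal sketch that assumes the standard inspect module and an arbitrary cut-off of five lines (both choices are illustrative):

```python
import inspect
import pandas as pd

# getdoc() returns the cleaned-up docstring, which help() also displays
doc = inspect.getdoc(pd.read_sas) or ""

# Keep only the first few lines (the cut-off of 5 is arbitrary)
top_lines = doc.splitlines()[:5]
print("\n".join(top_lines))
```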
