Loading the dataset

We can again thank scikit-learn for easy access to the dataset. We first import all the necessary modules, as we did earlier:

In [1]: import numpy as np
... from sklearn import datasets
... from sklearn import metrics
... from sklearn import model_selection as modsel
... from sklearn import linear_model
... %matplotlib inline
... import matplotlib.pyplot as plt
... plt.style.use('ggplot')

Then, loading the dataset is a one-liner (note that load_boston was removed in scikit-learn 1.2, so this call requires an earlier version):

In [2]: boston = datasets.load_boston()

The structure of the boston object is identical to that of the iris object discussed earlier. The dataset description is stored in 'DESCR', the data in 'data', the feature names in 'feature_names', and the target values in 'target':

In [3]: dir(boston)
Out[3]: ['DESCR', 'data', 'feature_names', 'target']
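
To make the structure described above concrete, here is a minimal sketch of how such an object can be accessed. It assumes the object is a scikit-learn Bunch (a dictionary subclass that also exposes its keys as attributes). The toy object and its placeholder values are hypothetical stand-ins, not the real Boston data:

```python
import numpy as np
from sklearn.utils import Bunch

# Hypothetical stand-in with the same four keys as the boston object;
# the values are placeholders, not real housing data.
toy = Bunch(
    DESCR="placeholder description",
    data=np.zeros((4, 2)),
    feature_names=np.array(["f0", "f1"]),
    target=np.zeros(4),
)

# Bunch is a dict subclass, so key access and attribute access
# return the very same underlying object.
same_object = toy["data"] is toy.data

# The available fields can be listed like dictionary keys.
keys = sorted(toy.keys())
print(same_object, keys)
```

Under that assumption, boston.data and boston['data'] would be two ways of reaching the same array.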

The dataset contains a total of 506 data points, each of which has 13 features:

In [4]: boston.data.shape
Out[4]: (506, 13)

Each data point has a single target value, which is the housing price:

In [5]: boston.target.shape
Out[5]: (506,)
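
The two shapes above fit together as one row of features and one target value per data point. The following sketch checks that relationship using placeholder arrays of the reported shapes (the zeros are an assumption for illustration, not the real data):

```python
import numpy as np

# Placeholder arrays with the shapes reported above; values are not real data.
data = np.zeros((506, 13))
target = np.zeros(506)

# Rows are data points, columns are features.
n_samples, n_features = data.shape

# There should be exactly one target value per data point.
rows_match = target.shape == (n_samples,)
print(n_samples, n_features, rows_match)
```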