
Loading data

We can load the data used in this chapter with the following function. It is very similar to the function we used in Chapter 2, but it is adapted for this dataset.

import pandas as pd
from sklearn.preprocessing import StandardScaler

# TRAIN_DATA, VAL_DATA, and TEST_DATA are assumed to be CSV file paths defined elsewhere.


def load_data():
    """Loads train, val, and test datasets from disk"""
    train = pd.read_csv(TRAIN_DATA)
    val = pd.read_csv(VAL_DATA)
    test = pd.read_csv(TEST_DATA)

    # We will use a dict to keep all this data tidy.
    data = dict()
    # pop() removes the target column 'y', leaving only the features behind.
    data["train_y"] = train.pop('y')
    data["val_y"] = val.pop('y')
    data["test_y"] = test.pop('y')

    # We will use sklearn's StandardScaler to scale our data to 0 mean, unit variance.
    # The scaler is fit on the training set only, then applied to val and test.
    scaler = StandardScaler()
    train = scaler.fit_transform(train)
    val = scaler.transform(val)
    test = scaler.transform(test)

    data["train_X"] = train
    data["val_X"] = val
    data["test_X"] = test
    # It's a good idea to keep the scaler (or at least the mean/variance) so we can reuse or invert the scaling later.
    data["scaler"] = scaler
    return data
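
To illustrate why it helps to keep the fitted scaler around, here is a minimal sketch on a small synthetic DataFrame. The column names x1 and x2 and all values are made up for illustration and are not taken from this chapter's dataset. It fits a StandardScaler on the training features only, reuses the same statistics on unseen rows, and recovers the original units with inverse_transform.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy features standing in for the chapter's CSV files.
train = pd.DataFrame({"x1": [1.0, 2.0, 3.0, 4.0],
                      "x2": [10.0, 20.0, 30.0, 40.0]})
new_rows = pd.DataFrame({"x1": [2.5], "x2": [25.0]})

scaler = StandardScaler()
# Learn the per-column mean and variance from the training data only.
train_scaled = scaler.fit_transform(train)
# Reuse those training statistics on data the scaler has not seen.
new_scaled = scaler.transform(new_rows)

# Undo the scaling to get back to the original units.
restored = scaler.inverse_transform(train_scaled)

# Per-column statistics of the scaled training data.
train_means = train_scaled.mean(axis=0)
train_stds = train_scaled.std(axis=0)
```

Fitting on the training set alone keeps information from the validation and test sets out of the preprocessing step, which is the same choice load_data makes above.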