Book title: Machine Learning for OpenCV
Author: Michael Beyeler
Last updated: 2021-07-02 19:47:25

Splitting the data into training and test sets

We learned in the previous chapter that it is essential to keep training and test data separate. We can easily split the data using one of scikit-learn's many helper functions:

In [11]: X_train, X_test, y_train, y_test = model_selection.train_test_split(
...            data, target, test_size=0.1, random_state=42
...      )

Here we split the data into 90 percent training data and 10 percent test data by specifying test_size=0.1; random_state=42 fixes the seed of the shuffle so that the split is reproducible. Inspecting the shapes of the returned arrays, we see that we ended up with exactly 90 training data points and 10 test data points:

In [12]: X_train.shape, y_train.shape
Out[12]: ((90, 4), (90,))
In [13]: X_test.shape, y_test.shape
Out[13]: ((10, 4), (10,))
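
For readers who want to run the split outside the notebook session, the following is a minimal, self-contained sketch. It assumes that data and target hold 100 samples with 4 features each, as the shapes above suggest. Building them from two classes of scikit-learn's Iris dataset is an assumption made here for illustration, not something this section states.

```python
import numpy as np
from sklearn import datasets, model_selection

# Assumption: a 100-sample, 4-feature dataset, built here by keeping
# only two of the three Iris classes (50 samples per class).
iris = datasets.load_iris()
mask = iris.target != 2
data = iris.data[mask].astype(np.float32)
target = iris.target[mask].astype(np.float32)

# Hold out 10 percent of the samples for testing.
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    data, target, test_size=0.1, random_state=42
)
print(X_train.shape, y_train.shape)  # (90, 4) (90,)
print(X_test.shape, y_test.shape)    # (10, 4) (10,)

# Repeating the call with the same random_state yields the same split.
X_train2, X_test2, y_train2, y_test2 = model_selection.train_test_split(
    data, target, test_size=0.1, random_state=42
)
same_split = np.array_equal(X_test, X_test2) and np.array_equal(y_test, y_test2)
print(same_split)  # True
```

Calling train_test_split without random_state would still produce a 90/10 split, but the rows assigned to each set would generally differ from run to run.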
