- Machine Learning Algorithms
- Giuseppe Bonaccorso
- 2021-07-02 18:53:29
scikit-learn toy datasets
scikit-learn provides some built-in datasets that can be used for testing purposes. They're all available in the package sklearn.datasets and share a common structure: the data attribute contains the whole input set X, while target contains the labels for classification or the target values for regression. For example, considering the Boston house-pricing dataset (used for regression), we have:
>>> # load_boston was removed in scikit-learn 1.2, so this session needs an earlier release
>>> from sklearn.datasets import load_boston
>>> boston = load_boston()
>>> X = boston.data
>>> Y = boston.target
>>> X.shape
(506, 13)
>>> Y.shape
(506,)
In this case, we have 506 samples with 13 features and a single target value. In this book, we're going to use it for regression, and the MNIST-like handwritten digit dataset (load_digits(), a small set of 1,797 8×8-pixel images rather than the full MNIST) for classification tasks. scikit-learn also provides functions for creating dummy datasets from scratch: make_classification(), make_regression(), and make_blobs() (particularly useful for testing clustering algorithms). They're very easy to use and, in many cases, they're the best way to test a model without loading more complex datasets.
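As a minimal sketch of the functions mentioned above, the following snippet loads the digits dataset and builds one dummy dataset with each generator. The sample counts, feature counts, noise level, and random_state values are illustrative assumptions chosen only to keep the example small and reproducible; they don't come from the book.

```python
from sklearn.datasets import (
    load_digits,
    make_blobs,
    make_classification,
    make_regression,
)

# Built-in digits dataset: each 8x8 image is flattened into 64 features
digits = load_digits()
X_digits, Y_digits = digits.data, digits.target

# Dummy binary classification problem (all parameter values are assumptions)
X_clf, Y_clf = make_classification(
    n_samples=200, n_features=10, n_informative=5, n_classes=2, random_state=0
)

# Dummy regression problem with Gaussian noise added to the target
X_reg, Y_reg = make_regression(
    n_samples=200, n_features=5, noise=10.0, random_state=0
)

# Three isotropic Gaussian blobs in two dimensions, handy for clustering tests
X_blob, Y_blob = make_blobs(
    n_samples=300, n_features=2, centers=3, random_state=0
)

# Every dataset follows the same (n_samples, n_features) / (n_samples,) layout
for name, X, Y in [
    ("digits", X_digits, Y_digits),
    ("classification", X_clf, Y_clf),
    ("regression", X_reg, Y_reg),
    ("blobs", X_blob, Y_blob),
]:
    print(name, X.shape, Y.shape)
```

Fixing random_state makes the generated samples repeatable between runs, which is usually convenient when comparing models on the same dummy data.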