- Hands-On Machine Learning with ML.NET
- Jarred Capellman
- 2021-06-24 16:43:25
Obtaining a dataset
As you can imagine, one of the most important aspects of the model building process is obtaining a high-quality dataset. In supervised learning, as in the aforementioned example, the dataset is used to train the model on what the output should be, so the data must be labeled. In unsupervised learning, no labels are required. A common misconception when creating a dataset is that bigger is better. In many cases, this is far from the truth. Continuing the preceding example, what if every poll respondent answered every single question the same way? At that point, your dataset is composed of identical data points, and your model will not be able to properly predict any of the other candidates. A model that fits such narrow training data and then fails to generalize to new inputs is said to be overfit. Machine learning algorithms require a diverse but representative dataset to build a production-ready model.
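One quick way to spot the problem described above is to look at how the labels are distributed before any training happens. The following is a minimal sketch in plain Python, not code from the book: the function names, the 0.95 threshold, and the poll data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def label_distribution(labels):
    """Return each label's share of the dataset as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def is_degenerate(labels, max_share=0.95):
    """Flag a dataset in which one label accounts for more than max_share of rows.

    The 0.95 default is an arbitrary, illustrative threshold.
    """
    if not labels:
        return True  # an empty dataset cannot represent anything
    return max(label_distribution(labels).values()) > max_share


# Hypothetical poll results: every respondent picked the same candidate.
uniform_poll = ["Candidate A"] * 1000

# A much smaller sample that still covers several candidates.
varied_poll = ["Candidate A"] * 40 + ["Candidate B"] * 35 + ["Candidate C"] * 25

print(is_degenerate(uniform_poll))  # True
print(is_degenerate(varied_poll))   # False
```

The 1,000-row dataset is flagged while the 100-row one is not, which illustrates why size alone says little about quality. A check like this only covers label balance; it does not tell you whether the sample is representative of the population you want to predict.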
In Chapter 11, Training and Building Production Models, we will take a deep dive into the methodology of obtaining quality datasets, looking at helpful resources, ways to manage your datasets, and data transformation, commonly referred to as data wrangling.