Underfitting and overfitting
The purpose of a machine learning model is to approximate an unknown function that associates input elements with output ones (for a classifier, we call them classes). A training set, however, is normally a representation of a global distribution and cannot contain all possible elements; otherwise, the problem could be solved with a one-to-one association. In the same way, we don't know the analytic expression of the underlying function; therefore, when training, it's necessary to fit the model while keeping it free to generalize when an unknown input is presented. Unfortunately, this ideal condition is not always easy to achieve, and it's important to consider two different dangers:
- Underfitting: the model isn't able to capture the dynamics shown by the training set itself (probably because its capacity is too limited).
- Overfitting: the model has an excessive capacity and is no longer able to generalize beyond the dynamics provided by the training set. It can associate almost all the known samples with the corresponding output values, but when an unknown input is presented, the corresponding prediction error can be very high.
The following picture shows examples of interpolation with low capacity (underfitting), normal capacity (normal fitting), and excessive capacity (overfitting):

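To make the three regimes concrete, here is a minimal sketch, assuming scikit-learn and a synthetic noisy sine dataset (both are illustrative choices, not taken from the text): polynomial regressions of degree 1, 4, and 15 play the roles of low, normal, and excessive capacity.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic noisy samples drawn from a smooth underlying function (assumption)
rng = np.random.RandomState(1000)
X = rng.uniform(-3.0, 3.0, size=(60, 1))
Y = np.sin(X).ravel() + rng.normal(0.0, 0.2, size=60)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=1000)

# Degree 1 has too little capacity, degree 4 is adequate,
# and degree 15 has excessive capacity for 45 training points
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, Y_train)
    train_mse = mean_squared_error(Y_train, model.predict(X_train))
    test_mse = mean_squared_error(Y_test, model.predict(X_test))
    print('Degree {}: train MSE = {:.4f}, test MSE = {:.4f}'.format(
        degree, train_mse, test_mse))
```

Running a sketch like this, the training error keeps shrinking as the degree grows, while the test error typically reaches its minimum around the adequate capacity and then rises again.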
It's very important to avoid both underfitting and overfitting. Underfitting is easier to detect by looking at the prediction error, while overfitting may prove more difficult to discover, as it could initially be mistaken for a perfect fit.
Cross-validation and other techniques that we're going to discuss in the next chapters can easily show how our model works with test samples that are never seen during the training phase. That way, it's possible to assess the generalization ability in a broader context (remember that we're never working with all possible values, but always with a subset that should reflect the original distribution).
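As a quick preview, here is a minimal cross-validation sketch, assuming scikit-learn and the bundled Iris dataset; the classifier and the number of folds are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# A small benchmark dataset bundled with scikit-learn
X, Y = load_iris(return_X_y=True)

# 10-fold cross-validation: each fold is scored on samples
# never seen during that fold's training phase
lr = LogisticRegression(max_iter=1000)
scores = cross_val_score(lr, X, Y, cv=10)
print('CV accuracy: {:.3f} +/- {:.3f}'.format(scores.mean(), scores.std()))
```

Because every sample ends up in a held-out fold exactly once, the mean score is a much more honest estimate of generalization ability than the accuracy measured on the training set alone.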
However, a general rule of thumb is that some residual error is always necessary to guarantee a good generalization ability: a model that shows an accuracy of 99.999... percent on the training samples is almost surely overfitted and will likely be unable to predict correctly when never-seen input samples are provided.
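This symptom is easy to reproduce. The sketch below, assuming scikit-learn and a hypothetical noisy classification problem generated with make_classification, fits an unconstrained decision tree: it memorizes the training set almost perfectly while scoring noticeably worse on held-out samples.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical noisy binary classification problem (flip_y adds label noise)
X, Y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           flip_y=0.1, random_state=1000)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=1000)

# An unconstrained tree has enough capacity to memorize the training set
dt = DecisionTreeClassifier(random_state=1000)
dt.fit(X_train, Y_train)
print('Training accuracy: {:.3f}'.format(dt.score(X_train, Y_train)))
print('Test accuracy: {:.3f}'.format(dt.score(X_test, Y_test)))
```

The wide gap between the two scores, rather than the near-perfect training accuracy by itself, is the practical signal of overfitting.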