- Mastering Machine Learning with R(Second Edition)
- Cory Lesmeister
- 2021-07-09 18:23:59
Regularization in a nutshell
You may recall that our linear model follows the form $Y = B_0 + B_1 x_1 + \dots + B_n x_n + e$, and also that the best fit tries to minimize the RSS, which is the sum of the squared errors of the actual minus the estimate, or $e_1^2 + e_2^2 + \dots + e_n^2$.
With regularization, we apply what is known as a shrinkage penalty in conjunction with the minimization of the RSS. This penalty consists of a lambda (symbol λ), along with the normalization of the beta coefficients and weights. How these weights are normalized differs between the techniques, and we will discuss them accordingly. Quite simply, in our model, we are minimizing (RSS + λ(normalized coefficients)). We select λ, which is known as the tuning parameter, during the model building process. Please note that if lambda is equal to 0, the penalty term vanishes and our model is equivalent to OLS.
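To make the penalized objective concrete, here is a minimal sketch in plain Python (used here only for illustration, not the R code the book works with). It assumes a single centered predictor, no intercept, and a squared (L2) penalty on the coefficient, which is only one of the possible normalizations. The function names `ridge_slope` and `penalized_rss` and the toy data are hypothetical.

```python
# Sketch: one-predictor penalized regression without an intercept.
# Assumed objective: sum((y_i - b * x_i)^2) + lam * b^2
# The squared (L2) penalty is an assumption made for this illustration.

def ridge_slope(x, y, lam):
    """Return the slope b that minimizes RSS + lam * b**2.

    Setting the derivative to zero gives b = sum(x*y) / (sum(x*x) + lam).
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


def penalized_rss(x, y, b, lam):
    """Evaluate RSS + lam * b**2 for a given slope b."""
    rss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    return rss + lam * b ** 2


# Toy data, already centered so that no intercept is needed.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.0, 2.1, 3.9]

# lam = 0 reproduces the ordinary least squares slope;
# larger values shrink the slope toward zero.
for lam in (0.0, 1.0, 10.0, 100.0):
    b = ridge_slope(x, y, lam)
    obj = penalized_rss(x, y, b, lam)
    print(f"lambda={lam:6.1f}  slope={b:.4f}  objective={obj:.4f}")
```

With λ set to 0 the slope is the ordinary least squares estimate, and the estimate shrinks toward zero as λ grows, which is the behavior the tuning parameter controls.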
So what does this do for us and why does it work? First of all, regularization methods are very computationally efficient. In best subsets, we are searching $2^p$ models, where p is the number of predictors, and in large datasets this may not be feasible to attempt. In R, we are only fitting one model to each value of lambda, which is far more efficient. Another reason goes back to our bias-variance trade-off, which was discussed in the preface. In the linear model, where the relationship between the response and the predictors is close to linear, the least squares estimates will have low bias but may have high variance. This means that a small change in the training data can cause a large change in the least squares coefficient estimates (James, 2013). Regularization, through the proper selection of lambda and normalization, may help you improve the model fit by optimizing the bias-variance trade-off. Finally, regularization of coefficients works to solve multicollinearity problems.
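The efficiency argument can be seen with a quick count. The sketch below, again in plain Python for illustration, compares the number of candidate models in a best subsets search with the number of fits along a grid of lambda values. The grid size of 100 and the helper name `best_subsets_fits` are assumptions chosen for the example, not values taken from the text.

```python
# Sketch: how many model fits each approach needs.

def best_subsets_fits(p):
    """Each of the p predictors is either in or out: 2**p candidate models."""
    return 2 ** p


n_lambdas = 100  # hypothetical size of the lambda grid

for p in (10, 20, 40):
    subsets = best_subsets_fits(p)
    print(f"p={p:2d}  best subsets: {subsets:,d} models  "
          f"regularization: {n_lambdas} fits")
```

The best subsets count doubles with every added predictor, while the number of regularized fits depends only on how many lambda values are tried.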