- Effective Amazon Machine Learning
- Alexis Perrier
Regularization on linear models
The Stochastic Gradient Descent (SGD) algorithm finds the optimal weights $\{w_i\}$ of the model by minimizing the error between the true and the predicted values over the N training samples:

$$ E(w) = \sum_{i=1}^{N} (\hat{y}_i - y_i)^2 $$

where $\hat{y}_i$ are the predicted values and $y_i$ the real values to be predicted; we have N samples, and each sample has n dimensions.
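As a minimal sketch of this objective, assuming a linear model $\hat{y} = Xw$ (the function names, learning rate, and synthetic data below are illustrative, not from the book), the squared error and one stochastic gradient step can be written as:

```python
import numpy as np

def squared_error(w, X, y):
    """Total squared error between predictions X @ w and targets y."""
    residuals = X @ w - y
    return np.sum(residuals ** 2)

def sgd_step(w, x_i, y_i, lr=0.05):
    """One stochastic gradient step on a single sample (x_i, y_i)."""
    gradient = 2.0 * (x_i @ w - y_i) * x_i  # gradient of (x_i . w - y_i)^2
    return w - lr * gradient

# Tiny example: N=4 samples, n=2 dimensions, noiseless targets
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 2))
true_w = np.array([1.0, -2.0])
y = X @ true_w

w = np.zeros(2)
for _ in range(200):            # repeated passes over the samples
    for i in range(len(X)):
        w = sgd_step(w, X[i], y[i])

print(squared_error(w, X, y))   # small after convergence
```

SGD visits one sample at a time, so each step uses only that sample's contribution to the gradient; over many passes the weights approach the minimizer of the summed error.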
Regularization consists of adding a term to the previous equation and minimizing the regularized error:

$$ E_{reg}(w) = \sum_{i=1}^{N} (\hat{y}_i - y_i)^2 + \lambda R(w) $$

The parameter $\lambda$ quantifies the amount of regularization, while R(w) is the regularization term, which depends on the regression coefficients.
There are two types of weight constraints usually considered:
- L2 regularization as the sum of the squares of the coefficients:

$$ R(w) = \sum_{j=1}^{n} w_j^2 $$
- L1 regularization as the sum of the absolute values of the coefficients:

$$ R(w) = \sum_{j=1}^{n} |w_j| $$
The constraint on the coefficients introduced by the regularization term R(w) prevents the model from overfitting the training data: because the penalty grows with the magnitude of the coefficients, they can no longer grow unchecked to fit the predictors. Each type of regularization has its own characteristics and gives rise to different variations of the SGD algorithm, which we introduce next.
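To make the effect of each penalty concrete, here is a hedged sketch of how the SGD update changes: the L2 term adds $2\lambda w$ to the gradient, while the L1 term contributes a subgradient $\lambda\,\mathrm{sign}(w)$. The function name, data, and hyperparameter values are illustrative, not from the book:

```python
import numpy as np

def sgd_step_regularized(w, x_i, y_i, lr=0.01, lam=0.1, penalty="l2"):
    """One SGD step on sample (x_i, y_i) with an L1 or L2 regularization term."""
    gradient = 2.0 * (x_i @ w - y_i) * x_i   # gradient of the squared error
    if penalty == "l2":
        gradient += lam * 2.0 * w            # derivative of lam * sum(w_j^2)
    elif penalty == "l1":
        gradient += lam * np.sign(w)         # subgradient of lam * sum(|w_j|)
    return w - lr * gradient

# Fit noiseless synthetic data with each penalty and compare the coefficients.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
true_w = np.array([2.0, 0.0, -1.0])          # the second feature is irrelevant
y = X @ true_w

results = {}
for penalty in ("l2", "l1"):
    w = np.zeros(3)
    for _ in range(100):
        for i in range(len(X)):
            w = sgd_step_regularized(w, X[i], y[i], penalty=penalty)
    results[penalty] = w
    print(penalty, np.round(w, 3))
```

L2 shrinks all coefficients smoothly toward zero, while the constant-magnitude L1 subgradient drives the irrelevant coefficient toward exactly zero, the sparsity effect behind the Lasso variant.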