L2 penalty

The L2 penalty, also known as ridge regression, is similar in many ways to the L1 penalty, but instead of penalizing the sum of the absolute weights, the penalty is based on the sum of the squared weights. This means that larger absolute weights are penalized more heavily. In the context of neural networks, this is sometimes referred to as weight decay: if you examine the gradient of the regularized objective function, the penalty term contributes a multiplicative shrinkage of the weights at every update. As with the L1 penalty, biases or offsets are usually excluded from the penalty, although they could be included.
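To make the multiplicative shrinkage concrete, here is a minimal sketch of a single gradient-descent step with an L2 penalty. The variable names (`weights`, `grad_loss`, `lr`, `lam`) and the specific values are illustrative assumptions, not from the text.

```python
import numpy as np

# Minimal sketch: one gradient-descent step with an L2 penalty.
# All names and values here are illustrative assumptions.
rng = np.random.default_rng(0)
weights = rng.normal(size=5)    # current weight vector
grad_loss = rng.normal(size=5)  # gradient of the unpenalized loss w.r.t. weights
lr, lam = 0.1, 0.01             # learning rate and L2 penalty strength

# The gradient of the penalty term lam * sum(w**2) is 2 * lam * w, so the update
#   w <- w - lr * (grad_loss + 2 * lam * w)
# can be rewritten as a multiplicative shrinkage of the weights ("weight decay"):
weights = weights * (1 - 2 * lr * lam) - lr * grad_loss

# A bias/offset term would typically be updated with grad_loss alone,
# without the shrinkage factor.
```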

From the perspective of a linear regression problem, the L2 penalty modifies the objective function being minimized from $\sum_i (y_i - \hat{y}_i)^2$ to $\sum_i (y_i - \hat{y}_i)^2 + \lambda \sum_j \Theta_j^2$.
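As a sketch of how this penalized objective behaves, the following evaluates the ridge objective on synthetic data and solves for the weights using the standard closed form $(X^\top X + \lambda I)^{-1} X^\top y$. The data, variable names, and the value of $\lambda$ are assumptions for illustration; the bias is omitted for simplicity, consistent with it being excluded from the penalty.

```python
import numpy as np

# Hedged sketch of ridge regression for a linear model y ~ X @ theta.
# The synthetic data, names, and lam value are illustrative assumptions.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)
lam = 1.0

def ridge_objective(theta):
    # Squared-error term plus lambda times the sum of squared weights.
    residuals = y - X @ theta
    return np.sum(residuals**2) + lam * np.sum(theta**2)

# Closed-form ridge solution: theta = (X'X + lam*I)^-1 X'y.
theta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
print(theta_ridge, ridge_objective(theta_ridge))
```

Relative to the ordinary least-squares solution, increasing $\lambda$ pulls the fitted weights toward zero, with the largest weights shrunk the most.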
