
  • Deep Learning with R for Beginners
  • Mark Hodnett, Joshua F. Wiley, Yuxi (Hayden) Liu, Pablo Maldonado

L2 penalty

The L2 penalty, also known as ridge regression, is similar in many ways to the L1 penalty, but instead of adding a penalty based on the sum of the absolute weights, the penalty is based on the sum of the squared weights. This means that larger absolute weights are penalized disproportionately more. In the context of neural networks, this is sometimes referred to as weight decay: if you examine the gradient of the regularized objective function, the penalty contributes a term proportional to the weights, so at every update the weights are shrunk by a multiplicative factor. As with the L1 penalty, biases or offsets are usually excluded from the penalty, although they could be included.
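To make the weight-decay interpretation concrete, here is a minimal sketch in R of gradient descent on a squared-error loss with an L2 penalty. The simulated data, learning rate, and lambda value are illustrative assumptions, not taken from the book.

```r
# Minimal sketch: one-layer linear model trained by gradient descent
# with an L2 (weight decay) penalty on the weights only.
set.seed(42)
X <- matrix(rnorm(100 * 3), ncol = 3)       # 100 observations, 3 features
y <- X %*% c(1.5, -2, 0.5) + rnorm(100)     # true weights plus noise
w <- rep(0, 3)       # weights (penalized)
b <- 0               # bias/offset (excluded from the penalty)
lr <- 0.01           # learning rate (illustrative)
lambda <- 0.1        # L2 penalty strength (illustrative)

for (step in 1:500) {
  y_hat  <- X %*% w + b
  grad_w <- -2 * t(X) %*% (y - y_hat) / nrow(X) + 2 * lambda * w
  grad_b <- -2 * mean(y - y_hat)             # no penalty term for the bias
  # Equivalent view: w <- (1 - 2 * lr * lambda) * w - lr * data_gradient,
  # i.e. the penalty multiplicatively decays the weights at every update.
  w <- w - lr * grad_w
  b <- b - lr * grad_b
}
print(round(w, 3))   # shrunk toward zero relative to the unpenalized fit
```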

From the perspective of a linear regression problem, the L2 penalty is a modification of the objective function being minimized, from ∑ᵢ(yᵢ − ŷᵢ)² to ∑ᵢ(yᵢ − ŷᵢ)² + λ∑ⱼΘⱼ².
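As one way to fit this penalized objective in R, the glmnet package provides ridge regression when alpha = 0. Note that glmnet standardizes the predictors and scales the penalty by the number of observations, so its lambda is not numerically identical to the λ in the formula above; the value used here is purely illustrative.

```r
# Sketch: fitting a ridge regression (pure L2 penalty) with glmnet.
library(glmnet)
set.seed(42)
X <- matrix(rnorm(100 * 3), ncol = 3)
y <- X %*% c(1.5, -2, 0.5) + rnorm(100)

fit <- glmnet(X, y, alpha = 0, lambda = 0.1)  # alpha = 0 selects ridge
coef(fit)   # the intercept is reported but is not penalized
```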
