
Using momentum with gradient descent

Momentum speeds up gradient descent by taking larger steps in directions where the gradient keeps a consistent direction, while damping steps in directions where the gradient's direction fluctuates. In effect, it lets gradient descent build up velocity.

Momentum works by introducing a velocity term, an exponentially weighted moving average of past gradients, and using that term in place of the raw gradient in the update rule, as follows:
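
One common way to write this update is sketched below. The symbols are assumptions chosen for illustration: \theta for the parameters, \alpha for the learning rate, \beta for the momentum coefficient, v_t for the velocity at step t, and \nabla_\theta J(\theta) for the gradient of the cost function. Some libraries use a slightly different but closely related form without the (1 - \beta) factor.

```latex
\begin{aligned}
v_t    &= \beta \, v_{t-1} + (1 - \beta) \, \nabla_\theta J(\theta) \\
\theta &= \theta - \alpha \, v_t
\end{aligned}
```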

Most typically, the momentum coefficient (commonly denoted β) is set to 0.9, and this is usually not a hyperparameter that needs to be changed.
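
To make the idea concrete, here is a minimal sketch of gradient descent with momentum in plain Python. It assumes the moving-average form of the velocity update with a coefficient of 0.9. The function names (`momentum_step`, `grad_quadratic`, `minimize`), the learning rate, and the toy quadratic cost are all hypothetical choices made for illustration, not taken from any particular library.

```python
def momentum_step(theta, velocity, grad, lr=0.05, beta=0.9):
    """Apply one gradient descent step with momentum.

    velocity is an exponentially weighted moving average of gradients;
    the parameters move along the velocity rather than the raw gradient.
    """
    velocity = [beta * v + (1.0 - beta) * g for v, g in zip(velocity, grad)]
    theta = [t - lr * v for t, v in zip(theta, velocity)]
    return theta, velocity


def grad_quadratic(theta):
    """Gradient of the toy cost f(x, y) = 0.5 * (x**2 + 25 * y**2)."""
    x, y = theta
    return [x, 25.0 * y]


def minimize(steps=300, lr=0.05, beta=0.9):
    """Run momentum gradient descent on the toy cost from a fixed start."""
    theta = [10.0, 1.0]      # arbitrary starting point (assumption)
    velocity = [0.0, 0.0]    # velocity starts at zero
    for _ in range(steps):
        grad = grad_quadratic(theta)
        theta, velocity = momentum_step(theta, velocity, grad, lr, beta)
    return theta


if __name__ == "__main__":
    print(minimize())  # both coordinates should end up close to 0
```

The toy cost is much steeper along y than along x, which is the kind of surface where momentum helps: the velocity accumulates along the consistently signed x gradient and averages out oscillation along y.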
