
The gradient descent algorithm

The gradient descent algorithm is an optimization algorithm that finds the minimum of a function using first-order derivatives, that is, we differentiate the function with respect to its parameters to first order only. Here, the objective of the gradient descent algorithm is to minimize the cost function J(θ₀, θ₁) with respect to the parameters θ₀ and θ₁.

This approach repeats the following update steps for numerous iterations to minimize J(θ₀, θ₁):

repeat until convergence:
    θ₀ := θ₀ − α ∂J(θ₀, θ₁)/∂θ₀
    θ₁ := θ₁ − α ∂J(θ₀, θ₁)/∂θ₁

α used in the above equations refers to the learning rate. The learning rate is the speed at which the learning agent adapts to new knowledge. Thus, α, that is, the learning rate, is a hyperparameter that needs to be assigned as a scalar value or as a function of time. In this way, in every iteration, the values of θ₀ and θ₁ are updated as per the preceding formulas until the value of the cost function reaches an acceptable minimum.
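As a concrete illustration, here is a minimal sketch of these update rules in Python for a simple linear model h(x) = θ₀ + θ₁x with a mean squared error cost. The data, learning rate, and iteration count are all assumptions chosen for the example, not values from the text:

```python
import numpy as np

# Toy data generated from y = 1 + 2x, so the optimum is theta0=1, theta1=2
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

def cost(theta0, theta1):
    """Mean squared error cost J(theta0, theta1)."""
    residual = theta0 + theta1 * x - y
    return (residual ** 2).mean() / 2

alpha = 0.1                # learning rate (a hyperparameter we assign)
theta0, theta1 = 0.0, 0.0  # arbitrary starting point

for _ in range(1000):
    residual = theta0 + theta1 * x - y
    # First-order partial derivatives of J with respect to each parameter
    grad0 = residual.mean()
    grad1 = (residual * x).mean()
    # Move against the gradient, that is, down the slope
    theta0 -= alpha * grad0
    theta1 -= alpha * grad1

print(theta0, theta1, cost(theta0, theta1))  # approaches 1.0, 2.0, 0.0
```

Note that both parameters are updated from the same residuals before either is overwritten, matching the simultaneous updates in the formulas above.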

The gradient descent algorithm means moving down the slope. The slope of the curve is given by the derivative of the cost function with respect to the parameters. The gradient, that is, the slope, points in the direction of increasing cost if it's positive and decreasing cost if it's negative. Thus, we multiply the slope by a negative sign, since we have to move opposite to the direction of increasing slope and toward the direction of decreasing slope.

Using an optimal learning rate, α, the descent is controlled and we don't overshoot the local minimum. If the learning rate, α, is very small, then convergence takes more time, while if it's very high, gradient descent might overshoot, miss the minimum, and diverge.
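This behavior is easy to see on a toy one-dimensional cost. The sketch below assumes J(θ) = θ², whose gradient is 2θ; the three learning rates are illustrative values, not prescribed ones:

```python
def minimize(alpha, steps=20):
    """Run gradient descent on J(theta) = theta**2 from theta = 1.0."""
    theta = 1.0
    for _ in range(steps):
        theta -= alpha * 2 * theta  # gradient of theta**2 is 2*theta
    return theta

print(minimize(0.01))  # very small rate: still far from 0 after 20 steps
print(minimize(0.4))   # well-chosen rate: converges quickly toward 0
print(minimize(1.1))   # rate too high: each step overshoots and diverges
```

With α = 1.1, each update multiplies θ by (1 − 2α) = −1.2, so the iterates flip sign and grow in magnitude, which is exactly the overshoot-and-diverge behavior described above.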
