
Gradient descent

Gradient descent is an optimization technique that uses the gradients computed by backpropagation to update the weights and biases, with the goal of minimizing the loss. As shown in the following diagram, the cost (or loss) function is minimized by adjusting the weights in the direction opposite to the gradient, that is, down the slope of the function:
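To make the update rule concrete, the following is a minimal sketch in Python (an illustrative example, not code from this book). It fits a single weight w to one input/target pair (x, y) under a squared-error loss; learning_rate, x, and y are assumed values, and grad stands in for the derivative dL/dw that backpropagation would supply in a real network.

learning_rate = 0.1
x, y = 2.0, 4.0                         # one illustrative input/target pair
w = 0.0                                 # initial weight

for step in range(50):
    prediction = w * x                  # forward pass
    loss = (prediction - y) ** 2        # squared-error loss L(w)
    grad = 2 * (prediction - y) * x     # dL/dw, the gradient backpropagation provides
    w -= learning_rate * grad           # step down the slope of the cost function

print(w)                                # approaches y / x = 2.0, the loss-minimizing weight

Each pass computes the loss, takes its gradient with respect to w, and moves w a small step against that gradient, exactly the adjustment pictured in the diagram above.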

For a simple perceptron, this cost function is convex with respect to the weights, so gradient descent reliably reaches the global minimum. For deep neural networks, however, the cost function is defined over a very high-dimensional weight space and is non-convex. Because gradient descent has to traverse this surface along all of those dimensions, it may be difficult to arrive at the global minimum in an acceptable time. To mitigate this problem and train faster, neural networks normally employ stochastic gradient descent, which is explained next.
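Why non-convexity matters can be seen with a toy one-dimensional example (the function f and the starting points below are assumptions chosen for illustration, not taken from the text): plain gradient descent on a non-convex curve settles into whichever minimum lies downhill from its starting point, which may not be the global one.

# f(w) = w**4 - 3*w**2 + w has two minima; only one is the global minimum
def grad(w):
    return 4 * w**3 - 6 * w + 1        # derivative f'(w)

for w0 in (-2.0, 2.0):                 # two different initial weights
    w = w0
    for _ in range(200):
        w -= 0.01 * grad(w)            # plain gradient descent step
    print(w0, "->", round(w, 3))       # each start ends in a different minimum

Starting from -2.0, the updates reach the global minimum near w ≈ -1.30; starting from 2.0, they get trapped in the local minimum near w ≈ 1.13. In the millions of dimensions of a deep network, the same effect makes the path taken, and the time needed, highly dependent on where the descent begins.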
