
Gradient descent

Up until now, we have covered the different kinds of neurons, based on the activation functions they use. We have also covered how to quantify the inaccuracy in a neuron's output using cost functions. Now, we need a mechanism to take that inaccuracy and remedy it.

The mechanism through which the network learns to output values closer to the expected or desired output is called gradient descent. Gradient descent is a common approach in machine learning for finding the lowest possible cost.

To understand gradient descent, let's use the single-neuron equation we have been using so far:

output = (w × x) + b

Here, the following applies:

  • x is the input
  • w is the weight of the input
  • b is the bias of the neuron
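
As a minimal sketch, here is that equation in Python (the sigmoid activation is an assumption for illustration; any of the activation functions covered earlier could take its place):

    import math

    def neuron_output(x, w, b):
        # Weighted input plus bias -- the single-neuron equation above
        z = (w * x) + b
        # Squash z with a sigmoid activation (an assumed choice here)
        return 1 / (1 + math.exp(-z))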

Gradient descent can be described as follows:

Initially, the neuron starts with random values for w and b. From that point onward, it adjusts the values of w and b so as to lower the error or cost (the cross entropy).

Taking the derivative of the cross entropy (the cost function) tells us how to change w and b, step by step, in the direction of the lowest possible cost. In other words, gradient descent tries to find the best fit between the network's output and the expected output.
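
As a sketch of what that derivative buys us, consider a single sigmoid neuron paired with the cross-entropy cost (an assumed pairing, consistent with the cost function discussed earlier). The chain rule collapses to two simple expressions:

    import math

    def gradients(x, w, b, y):
        # Neuron output: a = sigmoid(w*x + b)
        a = 1 / (1 + math.exp(-((w * x) + b)))
        # For the cross-entropy cost C = -(y*log(a) + (1-y)*log(1-a)),
        # the derivatives simplify to:
        #   dC/dw = (a - y) * x
        #   dC/db = (a - y)
        return (a - y) * x, a - y

The further the output a is from the target y, the larger these gradients are, and therefore the larger the step taken.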

The weights are adjusted based on a parameter called the learning rate. The learning rate controls how large each adjustment to the neuron's weights is on the way to an output closer to the expected one.
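
Putting these pieces together, a minimal training loop might look like the following sketch (the learning rate of 0.5 and the step count are illustrative choices, not recommendations):

    import math
    import random

    def train(x, y, learning_rate=0.5, steps=1000):
        # Start with random values for w and b, as described above
        w, b = random.random(), random.random()
        for _ in range(steps):
            a = 1 / (1 + math.exp(-((w * x) + b)))  # neuron output
            dw, db = (a - y) * x, a - y             # cross-entropy gradients
            # Step each parameter against its gradient,
            # scaled by the learning rate
            w -= learning_rate * dw
            b -= learning_rate * db
        return w, b

Calling train(x=1.0, y=0.0), for instance, nudges the neuron's output for that input toward 0 over successive steps.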

Keep in mind that we have used only a single parameter here; this is just to make things easier to comprehend. In reality, there are thousands, or even millions, of parameters that are taken into consideration to lower the cost.
