
Gradient descent

Up until now, we have covered the different kinds of neurons, classified by the activation functions they use. We have also covered the ways to quantify inaccuracy in the output of a neuron using cost functions. Now, we need a mechanism to take that inaccuracy and remedy it.

The mechanism through which the network can learn to output values closer to the expected or desired output is called gradient descent. Gradient descent is a common approach in machine learning for finding the lowest possible cost.

To understand gradient descent, let's use the single neuron equation we have been using so far:

output = activation((x × w) + b)

Here, the following applies:

  • x is the input
  • w is the weight of the input
  • b is the bias of the neuron
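The single-neuron computation described above can be sketched in a few lines of Python. A sigmoid activation is assumed here for illustration; the function name `neuron` is ours, not from the text:

```python
import math

def neuron(x, w, b):
    """A single neuron: weighted input plus bias, passed through a sigmoid."""
    z = (x * w) + b
    return 1.0 / (1.0 + math.exp(-z))

# With w = 2.0 and b = -1.0, an input of 0.5 gives z = 0,
# so the sigmoid output is exactly 0.5.
print(neuron(0.5, 2.0, -1.0))
```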

Gradient descent can be pictured as a ball rolling down the slope of the cost curve toward its lowest point.

Initially, the neuron starts by assigning random values to w and b. From that point onward, the neuron needs to adjust the values of w and b so that it lowers the cost (here, the cross-entropy).

Taking the derivative of the cost function (the cross-entropy) with respect to w and b tells us how to change them, step by step, in the direction of the lowest possible cost. In other words, gradient descent tries to find the best fit between the network's output and the expected output.

The weights are adjusted in steps whose size is controlled by a parameter called the learning rate. The learning rate determines how much the weight of the neuron is changed on each step to bring the output closer to the expected output.
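A single update step can be sketched as follows. The function name `update` and the default learning rate of 0.1 are illustrative choices, not from the text; the key point is that each parameter moves against its gradient, scaled by the learning rate:

```python
def update(w, b, grad_w, grad_b, learning_rate=0.1):
    """One gradient descent step for a single neuron's parameters.

    grad_w and grad_b are the derivatives of the cost with respect
    to w and b; moving against them lowers the cost.
    """
    w = w - learning_rate * grad_w
    b = b - learning_rate * grad_b
    return w, b
```

A smaller learning rate takes more steps to reach the minimum; too large a rate can overshoot it.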

Keep in mind that here, we have used only a single parameter; this is only to make things easier to comprehend. In reality, thousands or even millions of parameters are taken into consideration to lower the cost.
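Putting the pieces together, a minimal training loop for a single sigmoid neuron with cross-entropy cost might look like the sketch below. The function names, toy dataset, epoch count, and learning rate are all our own assumptions; the sketch uses the standard simplification that, for a sigmoid neuron with cross-entropy cost, the derivative of the cost with respect to the weighted input reduces to (output − target):

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=1000, learning_rate=0.5, seed=0):
    """Fit a single sigmoid neuron to (x, y) pairs, y in {0, 1},
    by gradient descent on the cross-entropy cost."""
    rng = random.Random(seed)
    # Start from random values for w and b, as described in the text.
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            a = sigmoid(w * x + b)
            # For sigmoid + cross-entropy, dCost/dz simplifies to (a - y).
            grad_w += (a - y) * x
            grad_b += (a - y)
        # Average the gradients and step against them.
        w -= learning_rate * grad_w / len(data)
        b -= learning_rate * grad_b / len(data)
    return w, b

# Toy data: the label is 1 when x is positive.
data = [(-2, 0), (-1, 0), (1, 1), (2, 1)]
w, b = train(data)
```

After training, the neuron's output for a positive x should be above 0.5 and for a negative x below 0.5, showing that the cost has been driven down from the random starting point.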
