
The hyperbolic tangent activation function

The output, y, of a hyperbolic tangent activation function (tanh) as a function of its total input, x, is given as follows:

y = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}
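As a quick numerical check, here is a minimal sketch (assuming a Python environment with NumPy installed; the helper name tanh_direct is ours) that evaluates this formula directly and compares it against NumPy's built-in np.tanh:

```python
import numpy as np

def tanh_direct(x):
    # Direct evaluation of tanh(x) = (e^x - e^-x) / (e^x + e^-x)
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

x = np.linspace(-3.0, 3.0, 7)
print(tanh_direct(x))
print(np.allclose(tanh_direct(x), np.tanh(x)))  # True
```

In practice, np.tanh is preferable to the direct formula, since exp(x) can overflow for inputs of large magnitude.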
The tanh activation function outputs values in the range [-1, 1], as you can see in the following graph:

Figure 1.7: Tanh activation function

One thing to note is that both the sigmoid and the tanh activation functions are approximately linear within a small range of the input, beyond which the output saturates. In the saturation zone, the gradients of the activation functions (with respect to the input) are very small or close to zero, which makes them prone to the vanishing gradient problem. As you will see later on, neural networks learn through backpropagation, where the gradient at a layer depends on the gradients of the activation units in the succeeding layers, up to the final output layer. Therefore, if the activation units are operating in the saturation region, much less of the error is backpropagated to the early layers of the neural network. Neural networks learn their weights and biases by minimizing the prediction error, and they rely on these gradients to do so. This means that, if the gradients are small or vanish to zero, the neural network will fail to learn the weights properly.
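To make the saturation effect concrete, the following minimal sketch (again plain NumPy; the printed values in the comments are approximate) evaluates the tanh gradient, which has the closed form d/dx tanh(x) = 1 - tanh(x)^2, at inputs inside and outside the linear zone:

```python
import numpy as np

def tanh_grad(x):
    # Closed-form derivative of tanh: 1 - tanh(x)^2
    return 1.0 - np.tanh(x) ** 2

for x in [0.0, 1.0, 3.0, 5.0]:
    print(f"x = {x:3.1f}  gradient = {tanh_grad(x):.6f}")
# x = 0.0  gradient ~ 1.000000  (linear zone: gradient at its maximum)
# x = 1.0  gradient ~ 0.419974
# x = 3.0  gradient ~ 0.009866
# x = 5.0  gradient ~ 0.000182  (saturation zone: gradient nearly zero)
```

Because backpropagation multiplies such gradients layer by layer, a chain of saturated units shrinks the error signal roughly geometrically on its way back to the early layers, which is exactly the vanishing gradient problem described above.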
