
ReLU

ReLU has become more popular in recent years; we can find it, or one of its variants, in almost every modern architecture. It has a simple mathematical formulation:

f(x)=max(0,x)

In simple words, ReLU squashes any negative input to zero and leaves positive numbers as they are. We can visualize the ReLU function as follows:

Image source: http://datareview.info/article/eto-nuzhno-znat-klyuchevyie-rekomendatsii-po-glubokomu-obucheniyu-chast-2/
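As a minimal sketch of this behavior (assuming PyTorch, which this book uses elsewhere; the tensor values are just illustrative), the same thresholding can be applied directly to a tensor:

import torch
import torch.nn as nn

# ReLU thresholds at zero: f(x) = max(0, x)
x = torch.tensor([-2.0, -0.5, 0.0, 1.5, 3.0])

relu = nn.ReLU()
print(relu(x))               # negatives become 0, positives pass through unchanged

# Equivalent to the max(0, x) definition applied element-wise
print(torch.clamp(x, min=0))

Both calls produce the same result; the nn.ReLU module form is what we typically drop into a network definition, while the clamp form shows that nothing more than a threshold is being computed.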

Some of the pros and cons of using ReLU are as follows:

  • It helps the optimizer find the right set of weights sooner; more technically, it makes stochastic gradient descent converge faster.
  • It is computationally inexpensive, as we are only thresholding at zero rather than computing exponentials as we did for the sigmoid and tanh functions.
  • ReLU has one disadvantage: when a large gradient passes through it during backpropagation, the unit can get stuck outputting zero and stop responding to its inputs; such units are called dead neurons (see the sketch after this list). This can be controlled by carefully choosing the learning rate. We will discuss how to choose learning rates when we discuss the different ways to adjust the learning rate in Chapter 4, Fundamentals of Machine Learning.
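To make the dead-neuron issue concrete, here is a small sketch (again assuming PyTorch; the layer size and the large negative bias are only there to simulate a unit that a big update has pushed into the negative regime) showing that such a unit outputs zero for every input and therefore receives zero gradient:

import torch
import torch.nn as nn

# Simulate a ReLU unit whose pre-activation has been pushed far negative
layer = nn.Linear(3, 1)
with torch.no_grad():
    layer.bias.fill_(-100.0)   # illustrative value: pre-activation is now always negative

x = torch.randn(5, 3)
out = torch.relu(layer(x))     # all zeros: the unit is "dead"
out.sum().backward()

print(out)                     # tensor of zeros
print(layer.weight.grad)       # all-zero gradients, so the weights never update again

Because the gradient through a zeroed ReLU is zero, no learning signal reaches the weights, which is why such units stay dead; keeping the learning rate moderate reduces the chance of a single large update knocking a unit into this regime.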