  • Deep Learning with PyTorch
  • Vishnu Subramanian

ReLU

ReLU has become more popular in recent years; we can find it, or one of its variants, in almost any modern architecture. It has a simple mathematical formulation:

f(x) = max(0, x)

In simple words, ReLU squashes any negative input to zero and leaves positive inputs unchanged. We can visualize the ReLU function as follows:

[Figure: The ReLU activation function. Image source: http://datareview.info/article/eto-nuzhno-znat-klyuchevyie-rekomendatsii-po-glubokomu-obucheniyu-chast-2/]
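To make this concrete, here is a minimal PyTorch sketch (the tensor values are purely illustrative) showing that ReLU is nothing more than an element-wise threshold at zero:

```python
import torch
import torch.nn.functional as F

# A handful of sample activations, both negative and positive
x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

# F.relu applies f(x) = max(0, x) element-wise
print(F.relu(x))            # tensor([0.0000, 0.0000, 0.0000, 0.5000, 2.0000])

# The same thresholding written explicitly with clamp
print(torch.clamp(x, min=0))  # identical result
```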

Some of the pros and cons of using ReLU are as follows:

  • It helps the optimizer find the right set of weights sooner; more technically, it makes stochastic gradient descent converge faster.
  • It is computationally inexpensive, as we are only thresholding at zero rather than computing exponentials, as we did for the sigmoid and tanh functions.
  • ReLU has one disadvantage: when a large gradient passes through it during backpropagation, the affected neurons can become permanently non-responsive; these are called dead neurons, and they can be controlled by carefully choosing the learning rate (a common mitigation is also sketched after this list). We will discuss how to choose learning rates when we cover the different ways of adjusting the learning rate in Chapter 4, Fundamentals of Machine Learning.
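Beyond tuning the learning rate, one widely used mitigation for dead neurons, shown here only as an illustrative alternative and not as the book's prescribed fix, is a leaky variant of ReLU that keeps a small slope for negative inputs, so some gradient always flows:

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

# Standard ReLU: negative inputs, and their gradients, are zeroed out
relu = nn.ReLU()
print(relu(x))    # tensor([0.0000, 0.0000, 0.0000, 0.5000, 2.0000])

# LeakyReLU: f(x) = x for x >= 0, negative_slope * x otherwise, so a
# small gradient still flows for negative inputs and neurons cannot
# become completely unresponsive
leaky = nn.LeakyReLU(negative_slope=0.01)
print(leaky(x))   # tensor([-0.0200, -0.0050, 0.0000, 0.5000, 2.0000])
```

The negative_slope value of 0.01 is PyTorch's default and is merely a conventional choice; the trade-off is that the activation is no longer exactly zero for negative inputs.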