ReLU
ReLU has become popular in recent years; we can find either it or one of its variants in almost any modern architecture. It has a simple mathematical formulation:
f(x)=max(0,x)
In simple words, ReLU squashes any negative input to zero and leaves positive values as they are. We can visualize the ReLU function as follows:

Figure: The ReLU activation function. Image source: http://datareview.info/article/eto-nuzhno-znat-klyuchevyie-rekomendatsii-po-glubokomu-obucheniyu-chast-2/
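The following is a minimal PyTorch sketch (the tensor values are purely illustrative) showing that ReLU is just an element-wise max(0, x):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 1.5, 3.0])

# Negative inputs are squashed to zero; positive inputs pass through unchanged.
print(F.relu(x))              # tensor([0.0000, 0.0000, 0.0000, 1.5000, 3.0000])
print(torch.clamp(x, min=0))  # the same element-wise max(0, x)
```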
Some of the pros and cons of using ReLU are as follows:
- It helps the optimizer find the right set of weights sooner. More technically, it makes the convergence of stochastic gradient descent faster.
- It is computationally inexpensive, as we are just thresholding and not computing anything like the exponentials we did for the sigmoid and tanh functions.
- ReLU has one disadvantage: when a large gradient passes through it during backward propagation, the affected neurons often become unresponsive; these are called dead neurons, and they can be controlled by carefully choosing the learning rate (see the sketch after this list). We will discuss how to choose learning rates when we discuss the different ways of adjusting the learning rate in Chapter 4, Fundamentals of Machine Learning.
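To make the dead-neuron point concrete, here is a small sketch (with illustrative values) showing that the gradient flowing back through ReLU is zero whenever the pre-activation is negative, so such a unit stops receiving weight updates:

```python
import torch
import torch.nn.functional as F

# Two pre-activations: one negative (as a dead neuron would see), one positive.
z_neg = torch.tensor(-3.0, requires_grad=True)
z_pos = torch.tensor(2.0, requires_grad=True)

F.relu(z_neg).backward()
F.relu(z_pos).backward()

# For a negative input the gradient through ReLU is zero, so the weights feeding
# that neuron receive no update signal; for a positive input the gradient is 1.
print(z_neg.grad)  # tensor(0.)
print(z_pos.grad)  # tensor(1.)
```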