- Deep Learning Essentials
- Wei Di Anurag Bhardwaj Jianing Wei
- 2021-06-30 19:17:52
ReLU
The Rectified Linear Unit (ReLU) has become quite popular in recent years. Its mathematical formula is as follows:
f(x) = max(0, x)
Compared to sigmoid and tanh, its computation is much simpler and more efficient. It has also been reported to speed up convergence considerably (for example, by a factor of 6 in the work of Krizhevsky and his co-authors, ImageNet Classification with Deep Convolutional Neural Networks, 2012), possibly because of its linear, non-saturating form. Also, unlike the tanh or sigmoid functions, which involve the expensive exponential operation, ReLU can be implemented by simply thresholding the activation at zero. For these reasons, ReLU is now the default choice of activation in many deep learning models. Another important advantage of ReLU is that it mitigates the vanishing gradient problem: its gradient is 1 for every positive input, so it does not shrink as the activation grows.
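The thresholding and the non-saturating gradient described above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the book; the function names and the sample inputs are chosen here for the example, and the gradient at exactly zero is set to 0 by convention.

```python
import math


def relu(x):
    """ReLU: threshold the input at zero."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise (0 assumed at x == 0)."""
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    """Logistic sigmoid, which needs an exponential."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid: s * (1 - s), which shrinks for large |x|."""
    s = sigmoid(x)
    return s * (1.0 - s)


# Compare the two gradients on a few illustrative inputs.
for x in (-10.0, -1.0, 0.5, 10.0):
    print(x, relu(x), relu_grad(x), sigmoid_grad(x))
```

For large positive inputs the sigmoid gradient becomes very small, while the ReLU gradient stays at 1, which is the non-saturating behavior the text refers to.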
Its limitation is that its output is not in the probability space, so it is used in the hidden layers rather than in the output layer. For classification problems, one therefore applies the softmax function on the last layer to compute the class probabilities; for a regression problem, one simply uses a linear function. Another problem with ReLU is that it can produce dead neurons. For example, if a large gradient flows through a ReLU neuron, the weights may be updated in such a way that the neuron never activates again on any future data point; its output and its gradient are then zero, so it stops learning.
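As a rough sketch of the classification case, the snippet below passes ReLU hidden activations through a linear output layer and then applies softmax to obtain class probabilities. The hidden values and the weight matrix are hypothetical numbers made up for the illustration, and subtracting the maximum score is a common numerical-stability step that the text does not mention.

```python
import math


def softmax(scores):
    """Map real-valued scores to probabilities that sum to 1."""
    m = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# ReLU applied to some illustrative pre-activations of a hidden layer.
hidden = [max(0.0, h) for h in (-1.2, 0.7, 3.1)]

# Hypothetical output-layer weights: one row per class.
weights = [[0.5, -0.3, 0.8],
           [0.1, 0.9, -0.4]]

# Linear scores (logits) for each class, then softmax on the last layer.
logits = [sum(w * h for w, h in zip(row, hidden)) for row in weights]
probs = softmax(logits)
print(probs)
```

The ReLU outputs themselves are unbounded above, whereas the softmax outputs lie between 0 and 1 and sum to 1, which is why softmax rather than ReLU sits on the last layer.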
To address the problem of dying neurons, a variant called Leaky ReLU was introduced. It gives negative inputs a small, non-zero slope, so that some gradient still flows and the updates are kept alive.
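The difference can be illustrated with a single neuron whose pre-activation is negative for every input. This is a minimal sketch: the slope alpha = 0.01 is a commonly used value assumed here rather than one given in the text, and the weight, bias, and inputs are hypothetical.

```python
def relu_grad(x):
    """Gradient of ReLU: zero for all non-positive inputs."""
    return 1.0 if x > 0 else 0.0


def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: identity for positive inputs, small slope alpha otherwise."""
    return x if x > 0 else alpha * x


def leaky_relu_grad(x, alpha=0.01):
    """Gradient of Leaky ReLU: never exactly zero (alpha is an assumed value)."""
    return 1.0 if x > 0 else alpha


# Hypothetical neuron state after an overly large update:
# the bias is so negative that every input gives a negative pre-activation.
w, b = 1.0, -10.0
inputs = [0.2, 1.0, 3.0]
pre_activations = [w * x + b for x in inputs]

relu_grads = [relu_grad(z) for z in pre_activations]
leaky_grads = [leaky_relu_grad(z) for z in pre_activations]

print(relu_grads)   # all zero: no update can reach w or b
print(leaky_grads)  # small but non-zero: the neuron can still recover
```

With plain ReLU the gradient is zero on every data point, so the neuron stays dead; with the leaky variant a small gradient remains on each of them.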