- Deep Learning Essentials
- Wei Di Anurag Bhardwaj Jianing Wei
Choosing the right activation function
In most cases, you should consider ReLU first. Keep in mind, however, that ReLU should only be applied to hidden layers. If your model suffers from dead neurons, consider adjusting your learning rate, or try Leaky ReLU or maxout.
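As an illustration (not code from this book), here is a minimal NumPy sketch of ReLU and Leaky ReLU; the negative-side slope of 0.01 is an assumed, commonly used default:

```python
import numpy as np

def relu(x):
    # ReLU: passes positive values through unchanged, zeroes out negatives
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: small non-zero slope (alpha) for negative inputs,
    # so neurons with negative pre-activations still receive some gradient
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(relu(x))        # [0. 0. 0. 1. 3.]
print(leaky_relu(x))  # [-0.02 -0.005 0. 1. 3.]
```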
It is not recommended to use either sigmoid or tanh, as they suffer from the vanishing gradient problem and also converge very slowly. Take sigmoid as an example. Its derivative is at most 0.25 everywhere (peaking at 0.25 when the input is zero), so multiplying these terms during backpropagation makes the gradients progressively smaller. For ReLU, by contrast, the derivative is one at every point above zero, which results in a more stable flow of gradients through the network.
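To make the derivative argument concrete, the following sketch (my own illustration, assuming NumPy) evaluates both derivatives numerically. The sigmoid derivative σ(x)(1 − σ(x)) never exceeds 0.25, so a chain of such factors shrinks the gradient, whereas the ReLU derivative is exactly one for any positive input:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)), never larger than 0.25
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # ReLU'(x) = 1 for x > 0, 0 otherwise (the value at 0 is taken as 0 here)
    return (x > 0).astype(float)

x = np.linspace(-5, 5, 11)
print(sigmoid_grad(x).max())  # 0.25, the global maximum, reached at x = 0
print(relu_grad(x))           # 1.0 wherever x > 0

# A chain of 10 sigmoid-derivative factors (each <= 0.25) shrinks the
# backpropagated signal to at most 0.25**10 (about 1e-6), while a chain of
# ReLU derivatives equal to 1 leaves it unchanged.
print(0.25 ** 10)
```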
Now that you have gained a basic knowledge of the key components of neural networks, let's move on to understanding how networks learn from data.