- Deep Learning Essentials
- Wei Di Anurag Bhardwaj Jianing Wei
Choosing the right activation function
In most cases, ReLU should be the first activation function you consider, but keep in mind that it should only be applied to hidden layers. If your model suffers from dead neurons, try adjusting the learning rate, or switch to Leaky ReLU or Maxout.
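As a quick illustration (a minimal NumPy sketch, not code from this book), the only difference between ReLU and Leaky ReLU is how negative inputs are handled; the small negative slope is what allows a "dead" unit to keep receiving gradient and recover:

```python
import numpy as np

def relu(x):
    # ReLU: zero for negative inputs, identity for positive inputs
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: a small slope alpha for negative inputs keeps the
    # gradient non-zero, so the unit can recover instead of "dying"
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))        # [0.  0.  0.  0.5 2. ]
print(leaky_relu(x))  # [-0.02  -0.005  0.  0.5  2. ]
```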
It is not recommended to use either sigmoid or tanh, as they suffer from the vanishing gradient problem and also converge very slowly. Take sigmoid, for example: its derivative is at most 0.25 everywhere (and usually much smaller), so each layer the gradient passes through during backpropagation shrinks it further. For ReLU, by contrast, the derivative is exactly one at every point above zero, so gradients pass through unchanged, resulting in a more stable network.
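To make the vanishing-gradient point concrete, here is a small sketch (illustrative only; treating each layer as a single multiplicative factor is a simplification) comparing the per-layer factor contributed by sigmoid, whose derivative never exceeds 0.25, with ReLU, whose derivative is exactly 1 for positive inputs:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)      # peaks at 0.25 when x = 0

# Best case for sigmoid: every unit sits at x = 0, so each of 10 layers
# multiplies the backpropagated gradient by 0.25 at most.
layers = 10
print(0.25 ** layers)         # ~9.5e-07 -- the gradient has all but vanished

# ReLU's derivative is 1 for any positive input, so the same product stays 1.
print(1.0 ** layers)          # 1.0

print(sigmoid_grad(0.0))      # 0.25
print(sigmoid_grad(4.0))      # ~0.018 -- even smaller away from zero
```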
Now that you have gained a basic knowledge of the key components of neural networks, let's move on to understanding how networks learn from data.