- Deep Learning Quick Reference
- Mike Bernico
Stochastic and minibatch gradient descents
The algorithm described in the previous section assumes a forward and a corresponding backward pass over the entire dataset; as such, it's called batch gradient descent.
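As a rough illustration of this idea, the NumPy sketch below performs batch gradient descent on a simple linear model; the data, weights, learning rate, and squared-error loss are assumptions chosen for demonstration, not code from the book.

```python
import numpy as np

# Illustrative setup (assumptions for demonstration only)
X = np.random.randn(1000, 3)           # 1,000 examples, 3 features
y = X @ np.array([2.0, -1.0, 0.5])     # synthetic targets
w = np.zeros(3)                         # model weights
lr = 0.01                               # learning rate

# Batch gradient descent: every weight update uses the ENTIRE dataset
for epoch in range(100):
    error = X @ w - y                   # errors for all examples
    gradient = X.T @ error / len(X)     # gradient averaged over the full dataset
    w -= lr * gradient                  # one update per full forward/backward pass
```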
Another possible way to do gradient descent is to use a single data point at a time, updating the network weights as we go; this is known as stochastic gradient descent. The noise in these single-example updates can help speed up convergence around saddle points, where the network might otherwise stop converging. Of course, the error estimate from a single point may not be a very good approximation of the error over the entire dataset.
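A minimal sketch of this per-example update, on the same kind of illustrative linear model (again, the names and values are assumptions for demonstration, not code from the book):

```python
import numpy as np

# Illustrative setup (assumptions for demonstration only)
X = np.random.randn(1000, 3)
y = X @ np.array([2.0, -1.0, 0.5])
w = np.zeros(3)
lr = 0.01

# Stochastic gradient descent: the weights change after every single example
for epoch in range(100):
    for i in np.random.permutation(len(X)):   # visit examples in random order
        error_i = X[i] @ w - y[i]             # error on one data point
        w -= lr * error_i * X[i]              # noisy single-example gradient step
```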
The best solution to this problem is minibatch gradient descent, in which we take a random subset of the data, called a minibatch, to compute the error and update the network weights. This is almost always the best option. It has the additional benefit of naturally splitting a very large dataset into chunks that are more easily managed in the memory of a single machine, or even across machines.
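A minimal sketch of the minibatch variant under the same illustrative setup (all names and values here are assumptions for demonstration, not code from the book). Note that a batch size of 1 recovers the stochastic update above, while a batch size equal to the dataset size recovers batch gradient descent.

```python
import numpy as np

# Illustrative setup (assumptions for demonstration only)
X = np.random.randn(1000, 3)
y = X @ np.array([2.0, -1.0, 0.5])
w = np.zeros(3)
lr = 0.01
batch_size = 32                                     # size of each random minibatch

# Minibatch gradient descent: each update uses a random subset of the data
for epoch in range(100):
    order = np.random.permutation(len(X))           # reshuffle every epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]       # indices of one minibatch
        error = X[idx] @ w - y[idx]
        gradient = X[idx].T @ error / len(idx)      # gradient averaged over the minibatch
        w -= lr * gradient
```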