- Intelligent Projects Using Python
- Santanu Pattanayak
The optimizer and initial learning rate
Training uses the Adam optimizer (adaptive moment estimation), which implements an enhanced version of stochastic gradient descent. Adam accounts for the curvature of the cost function and, at the same time, uses momentum to ensure steady progress toward a good local minimum. For the problem at hand, since we are using transfer learning and want to retain as many of the previously learned features from the pre-trained network as possible, we use a small initial learning rate of 0.00001. This ensures that the network doesn't lose the useful features learned by the pre-trained network, and that it fine-tunes less aggressively toward an optimal point based on the new data for the problem at hand. The Adam optimizer can be defined as follows:
from keras import optimizers

adam = optimizers.Adam(lr=0.00001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
The beta_1 parameter controls the exponential decay rate of the first-moment (momentum) estimate, that is, how much the current gradient contributes to the running momentum, whereas the beta_2 parameter controls the decay rate of the second-moment estimate, the running average of the squared gradient, which is used to normalize the gradient and helps to tackle the curvature in the cost function.
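Once defined, the optimizer is passed to the model's compile step before training. The following is a minimal usage sketch that reuses the adam object defined above; the stand-in model, the binary_crossentropy loss, and the accuracy metric are assumptions for illustration only, not the transfer-learning network built in this chapter:

from keras.models import Sequential
from keras.layers import Dense

# Hypothetical stand-in classification head, used only to show how the
# optimizer is wired in; replace with the actual pre-trained network.
model = Sequential([Dense(1, activation='sigmoid', input_shape=(2048,))])

# Attach the Adam optimizer defined above; training then proceeds with
# model.fit() as usual. Loss and metric here are illustrative assumptions.
model.compile(optimizer=adam, loss='binary_crossentropy', metrics=['accuracy'])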