- Neural Networks with Keras Cookbook
- V Kishore Ayyadevara
Speeding up the training process using batch normalization
In the previous section on scaling the input dataset, we learned that optimization is slow when the input data is not scaled (that is, when it is not between zero and one).
The hidden layer value could be high in the following scenarios:
- Input data values are high
- Weight values are high
- The product of the weights and inputs is high
Any of these scenarios can result in a large output value on the hidden layer.
Note that, from the perspective of the output layer, the hidden layer acts as its input layer. Hence, the phenomenon of high input values resulting in slow optimization holds true when the hidden layer values are large as well, as the short sketch below illustrates.
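As a quick illustration of the scenarios listed above, the following sketch (with hypothetical values, not taken from the recipe's dataset) computes a hidden-layer pre-activation for scaled and unscaled inputs; note how the unscaled version is orders of magnitude larger:

```python
# A minimal sketch, assuming illustrative inputs and weights, of how
# large input values inflate the hidden-layer pre-activation.
import numpy as np

x_scaled = np.array([0.1, 0.5, 0.9])   # inputs scaled to the 0-1 range
x_unscaled = x_scaled * 255             # e.g., raw pixel intensities
w = np.array([2.0, -3.0, 4.0])          # deliberately large weights

print(np.dot(w, x_scaled))    # modest hidden-layer value
print(np.dot(w, x_unscaled))  # the same value inflated by a factor of 255
```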
Batch normalization comes to the rescue in this scenario. We have already learned that, when input values are high, we scale them to reduce their magnitude. We have also learned that scaling can be performed in a different way: by subtracting the mean of the input and dividing by its standard deviation. Batch normalization performs this form of scaling.
Typically, all values are scaled using the following formulas, where $\mu_B$ and $\sigma_B^2$ are the mean and variance of the current batch and $\epsilon$ is a small constant added for numerical stability:

$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i$$

$$\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2$$

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$$

$$y_i = \gamma \hat{x}_i + \beta$$
Notice that γ and β are learned during training, along with the original parameters of the network.
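In Keras, this scaling is applied by inserting a BatchNormalization layer between the hidden layer and its activation. The following is a minimal sketch; the layer sizes, input dimension, and compile settings are illustrative placeholders rather than the recipe's exact model:

```python
# A minimal sketch of adding batch normalization in Keras.
# Layer sizes and input_dim are illustrative assumptions.
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Activation

model = Sequential()
model.add(Dense(64, input_dim=784))    # hidden-layer pre-activations
model.add(BatchNormalization())        # normalize with batch mean/variance,
                                       # then scale and shift with gamma/beta
model.add(Activation('relu'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```

During training, the layer normalizes using the statistics of the current batch; at inference time, it uses a moving average of those statistics, so predictions do not depend on how the test data is batched.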