
Optimize the neural network 

We started with random weights, used them to predict our targets, and calculated the loss for our algorithm. The gradients are computed by calling the backward function on the final loss variable. This entire process repeats for one epoch, that is, over the entire set of examples. In most real-world cases, we perform the optimization step once per iteration, where each iteration operates on a small subset (a batch) of the full dataset. Once the loss is calculated, we adjust the parameter values using the computed gradients so that the loss decreases, which is implemented in the following function:

def optimize(learning_rate):
    # Step each parameter a small amount in the opposite direction of its gradient
    w.data -= learning_rate * w.grad.data
    b.data -= learning_rate * b.grad.data

The learning rate is a hyperparameter that lets us adjust the variables by only a small fraction of the gradients, where the gradients indicate the direction in which each variable (w and b) needs to be adjusted to reduce the loss.
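To see how this update fits into the full training cycle, here is a minimal, self-contained sketch for a single linear model; the toy data, tensor shapes, learning rate, and number of epochs are assumptions made purely for illustration. Note that the gradients are zeroed after every update so they do not accumulate across iterations:

import torch

# Toy data and randomly initialized parameters (shapes assumed for illustration)
x = torch.randn(20, 1)
y_true = 3 * x + 2
w = torch.randn(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

def optimize(learning_rate):
    w.data -= learning_rate * w.grad.data
    b.data -= learning_rate * b.grad.data

for epoch in range(500):
    y_pred = x * w + b                       # forward pass with current weights
    loss = ((y_pred - y_true) ** 2).mean()   # mean squared error
    loss.backward()                          # compute gradients of loss w.r.t. w and b
    optimize(learning_rate=0.01)             # adjust w and b using the gradients
    w.grad.data.zero_()                      # reset gradients before the next iteration
    b.grad.data.zero_()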

Different optimizers, such as Adam, RMSprop, and SGD, are already implemented in the torch.optim package. We will use these optimizers in later chapters to reduce the loss and improve accuracy.
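As a brief preview of that API, the following sketch (again using placeholder data and an assumed learning rate) delegates the manual update above to torch.optim.SGD; the optimizer takes over the role of our optimize function, and optimizer.zero_grad() replaces zeroing the gradients by hand:

import torch
from torch import optim

x = torch.randn(20, 1)
y_true = 3 * x + 2
w = torch.randn(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

optimizer = optim.SGD([w, b], lr=0.01)       # the optimizer manages w and b for us

for epoch in range(500):
    optimizer.zero_grad()                    # clear gradients from the previous iteration
    y_pred = x * w + b
    loss = ((y_pred - y_true) ** 2).mean()
    loss.backward()                          # compute gradients
    optimizer.step()                         # apply the SGD update to w and b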
