
Optimize the neural network 

We started with random weights to predict our targets and calculated the loss for our algorithm. We compute the gradients by calling the backward function on the final loss variable. This entire process repeats for one epoch, that is, for a full pass over the entire set of examples. In most real-world examples, we perform the optimization step once per iteration, on a small subset (mini-batch) of the total set. Once the loss is calculated, we adjust the values using the calculated gradients so that the loss reduces, which is implemented in the following function:

def optimize(learning_rate):
    # Move each parameter a small step against its gradient to reduce the loss
    w.data -= learning_rate * w.grad.data
    b.data -= learning_rate * b.grad.data

The learning rate is a hyperparameter that lets us adjust the values in the variables by a small fraction of the gradients, where the gradients denote the direction in which each variable (w and b) needs to be adjusted.
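To see how this update fits into training, the following sketch wires the forward pass, the loss, backward, and optimize together for a few epochs. It is not taken verbatim from the chapter: simple_network, loss_fn, x, and y stand in for the pieces built in the earlier sections and are assumptions here:

# Minimal sketch of a possible training loop; simple_network, loss_fn,
# x, and y are assumed to be defined earlier and are not part of this section.
learning_rate = 1e-4
for epoch in range(500):
    y_pred = simple_network(x)        # forward pass using the current w and b
    loss = loss_fn(y, y_pred)         # how far the predictions are from the targets
    if w.grad is not None:
        w.grad.data.zero_()           # clear gradients accumulated from the previous step
        b.grad.data.zero_()
    loss.backward()                   # compute gradients of the loss w.r.t. w and b
    optimize(learning_rate)           # adjust w and b in the direction that reduces the loss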

Different optimizers, such as Adam, RMSprop, and SGD, are already implemented for use in the torch.optim package. We will make use of these optimizers in later chapters to reduce the loss or improve the accuracy.
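As a rough illustration of what using torch.optim looks like (the toy model and data below are made up for the example, not code from the chapter), the manual optimize function can be replaced by an SGD optimizer that performs the same parameter update through zero_grad, backward, and step:

import torch

# Illustrative sketch only: a toy linear-regression setup, not code from the chapter.
x = torch.randn(100, 1)
y = 3 * x + 2 + 0.1 * torch.randn(100, 1)

w = torch.randn(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
optimizer = torch.optim.SGD([w, b], lr=1e-4)  # same update rule as optimize()

for epoch in range(500):
    y_pred = x * w + b                        # forward pass
    loss = ((y_pred - y) ** 2).mean()         # mean squared error loss
    optimizer.zero_grad()                     # clear gradients from the previous step
    loss.backward()                           # compute new gradients
    optimizer.step()                          # update w and b using the gradients

Swapping torch.optim.SGD for torch.optim.Adam or torch.optim.RMSprop only changes the line that creates the optimizer; the zero_grad, backward, and step pattern stays the same.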
