
Optimize the neural network 

We started with random weights, used them to predict our targets, and calculated the loss for our algorithm. We then computed the gradients by calling the backward function on the final loss variable. This entire process repeats for one epoch, that is, for the entire set of examples. In most real-world cases, we perform the optimization step once per iteration on a mini-batch, a small subset of the full dataset. Once the loss is calculated, we adjust the values using the calculated gradients so that the loss is reduced, which is implemented in the following function:

def optimize(learning_rate):
    # Take a small step against the gradient to reduce the loss
    w.data -= learning_rate * w.grad.data
    b.data -= learning_rate * b.grad.data

The learning rate is a hyperparameter that scales how far we move the variables along their gradients; the gradients indicate the direction in which each variable (w and b) needs to be adjusted.
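
Putting these pieces together, one training run is a loop that repeatedly makes predictions, computes the loss, calls backward, and then calls optimize. The following is a minimal sketch under assumed names: the data x, y and the parameters w, b are simple illustrative tensors created here, not the ones produced by the functions used elsewhere in this chapter:

import torch

x = torch.randn(17, 1)                      # illustrative inputs
y = 3 * x + 2                               # illustrative targets
w = torch.randn(1, requires_grad=True)      # randomly initialized weight
b = torch.randn(1, requires_grad=True)      # randomly initialized bias

def optimize(learning_rate):
    # Take a small step against the gradient to reduce the loss
    w.data -= learning_rate * w.grad.data
    b.data -= learning_rate * b.grad.data

for epoch in range(500):
    if w.grad is not None:                  # gradients accumulate, so clear them first
        w.grad.data.zero_()
        b.grad.data.zero_()
    y_pred = x * w + b                      # forward pass
    loss = ((y_pred - y) ** 2).mean()       # mean squared error
    loss.backward()                         # compute gradients for w and b
    optimize(learning_rate=1e-2)            # gradient-descent step

Note that gradients accumulate across calls to backward, which is why they are cleared at the start of each iteration.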

Different optimizers, such as Adam, RMSprop, and SGD, are already implemented in the torch.optim package. We will make use of these optimizers in later chapters to reduce the loss and improve accuracy.
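
As a taste of what that looks like, here is a minimal, illustrative sketch that performs the same gradient-descent update with torch.optim.SGD instead of the hand-written optimize function; the data and parameters are placeholder tensors, not the ones used in this chapter:

import torch

x = torch.randn(17, 1)
y = 3 * x + 2
w = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)
optimizer = torch.optim.SGD([w, b], lr=1e-2)    # the optimizer manages both parameters

for epoch in range(500):
    optimizer.zero_grad()                       # clear accumulated gradients
    loss = ((x * w + b - y) ** 2).mean()        # forward pass and loss
    loss.backward()                             # compute gradients
    optimizer.step()                            # apply the SGD update to w and b

The optimizer takes over the bookkeeping of zeroing gradients and applying the update rule, which becomes especially convenient once a model has many parameters.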
