NN building blocks

In the torch.nn package, you'll find many predefined classes that provide the basic building blocks. All of them are designed with practice in mind (for example, they support minibatches, have sane default values, and their weights are properly initialized). All modules follow the callable convention, which means that an instance of any class can act as a function when applied to its arguments. For example, the Linear class implements a feed-forward layer with an optional bias:

>>> import torch
>>> import torch.nn as nn
>>> l = nn.Linear(2, 5)
>>> v = torch.FloatTensor([1, 2])
>>> l(v)
tensor([ 0.1975,  0.1639,  1.1130, -0.2376, -0.7873])

Here, we created a randomly initialized feed-forward layer with two inputs and five outputs and applied it to our float tensor. All classes in the torch.nn package inherit from the nn.Module base class, which you can use to implement your own higher-level NN blocks. We'll see how you can do this in the next section, but, for now, let's look at the useful methods that all nn.Module subclasses provide. They are as follows:

  • parameters(): This returns an iterator over all of the module's parameters, that is, the tensors that require gradient computation (the module weights and biases)
  • zero_grad(): This resets the gradients of all parameters (to zero in older PyTorch releases; recent releases set them to None by default)
  • to(device): This moves all module parameters to a given device (CPU or GPU)
  • state_dict(): This returns a dictionary with all module parameters, keyed by name, and is useful for model serialization
  • load_state_dict(): This initializes the module from a state dictionary
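The methods listed above can be tried on a single Linear layer. The following is only a minimal sketch: the variable names (layer, clone, batch) and the fixed random seed are illustrative assumptions, not part of the original example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed only so that repeated runs match

layer = nn.Linear(2, 5)

# parameters() yields the weight matrix and the bias vector
shapes = [tuple(p.shape) for p in layer.parameters()]
print(shapes)  # [(5, 2), (5,)]

# Apply the layer to a minibatch of three samples and backpropagate
batch = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
out = layer(batch)
out.sum().backward()
grads_present = all(p.grad is not None for p in layer.parameters())

# zero_grad() clears the accumulated gradients; depending on the
# PyTorch version they become zero tensors or None
layer.zero_grad()
grads_cleared = all(
    p.grad is None or bool((p.grad == 0).all())
    for p in layer.parameters()
)

# state_dict() and load_state_dict() copy the weights into another module
clone = nn.Linear(2, 5)
clone.load_state_dict(layer.state_dict())
same_output = torch.equal(layer(batch), clone(batch))

# to(device) moves the parameters; fall back to the CPU without a GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
layer.to(device)
```

After the state dictionary is loaded, clone produces exactly the same output as layer for the same input, which is the property that makes state_dict() suitable for saving and restoring models.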

The whole list of available classes can be found in the documentation at http://pytorch.org/docs.

Now we should mention one very convenient class that allows you to combine other layers into a pipeline: Sequential. The best way to demonstrate Sequential is through an example:

>>> s = nn.Sequential(
...     nn.Linear(2, 5),
...     nn.ReLU(),
...     nn.Linear(5, 20),
...     nn.ReLU(),
...     nn.Linear(20, 10),
...     nn.Dropout(p=0.3),
...     nn.Softmax(dim=1))
>>> s
Sequential (
  (0): Linear (2 -> 5)
  (1): ReLU ()
  (2): Linear (5 -> 20)
  (3): ReLU ()
  (4): Linear (20 -> 10)
  (5): Dropout (p = 0.3)
  (6): Softmax ()
)

Here, we defined a three-layer NN with ReLU nonlinearities, dropout, and a softmax on the output that is applied along dimension 1 (dimension 0 holds the batch samples). Let's push something through it:

>>> s(torch.FloatTensor([[1,2]]))
tensor([[ 0.1410,  0.1380,  0.0591,  0.1091,  0.1395,  0.0635,  0.0607,
          0.1033,  0.1397,  0.0460]])

So, our minibatch of one example successfully traversed the network!
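The same network can also process a minibatch with more than one example. The following is a hedged sketch rather than part of the original session: the name net, the three input rows, and the call to eval() (which switches Dropout off so that repeated calls agree) are assumptions made for illustration.

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(2, 5),
    nn.ReLU(),
    nn.Linear(5, 20),
    nn.ReLU(),
    nn.Linear(20, 10),
    nn.Dropout(p=0.3),
    nn.Softmax(dim=1),
)

# eval() puts the Dropout layer into inference mode, so it passes
# its input through unchanged and the output becomes deterministic
net.eval()

# Three examples with two features each: shape (3, 2)
batch = torch.tensor([[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]])

with torch.no_grad():  # no gradients are needed for a forward-only check
    probs = net(batch)
    probs_again = net(batch)

print(probs.shape)  # torch.Size([3, 10])

# Softmax along dim=1 normalizes every row independently
row_sums = probs.sum(dim=1)
```

Each row of probs is a probability distribution over the 10 outputs, so every entry of row_sums is 1 up to floating-point error, regardless of the random weight initialization.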
