
NN building blocks

The torch.nn package provides many predefined classes that serve as basic building blocks. All of them are designed with practice in mind: they support minibatches, have sane default values, and initialize their weights properly. All modules follow the callable convention, which means that an instance of any class can act as a function when applied to its arguments. For example, the Linear class implements a feed-forward layer with optional bias:

>>> import torch
>>> import torch.nn as nn
>>> l = nn.Linear(2, 5)
>>> v = torch.FloatTensor([1, 2])
>>> l(v)
tensor([ 0.1975,  0.1639,  1.1130, -0.2376, -0.7873])

Here, we created a randomly initialized feed-forward layer with two inputs and five outputs and applied it to our float tensor. Because the weights are random, the exact output values will differ from run to run. All classes in the torch.nn package inherit from the nn.Module base class, which you can use to implement your own higher-level NN blocks. We'll see how you can do this in the next section, but, for now, let's look at the useful methods that all nn.Module children provide. They are as follows:

  • parameters(): This returns an iterator over all variables that require gradient computation (that is, the module weights)
  • zero_grad(): This resets the gradients of all parameters (to zero or, in recent PyTorch versions, to None by default)
  • to(device): This moves all module parameters to the given device (CPU or GPU)
  • state_dict(): This returns a dictionary with all module parameters and is useful for model serialization
  • load_state_dict(): This initializes the module from a state dictionary
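
The methods listed above can be tried on a single layer. The following is a minimal sketch rather than a recommended workflow: the variable names (layer, clone, device) are illustrative, and the backward pass on a tensor of ones is there only so that gradients exist before they are cleared.

```python
import torch
import torch.nn as nn

layer = nn.Linear(2, 5)

# parameters(): iterate over the trainable tensors (weight and bias here)
for p in layer.parameters():
    print(tuple(p.shape), p.requires_grad)

# Run a backward pass so that gradients exist, then clear them
layer(torch.ones(1, 2)).sum().backward()
grad_present = layer.weight.grad is not None
layer.zero_grad()
# Depending on the PyTorch version, gradients are now zero tensors or None
grad_cleared = layer.weight.grad is None or bool((layer.weight.grad == 0).all())

# to(device): move the parameters to the GPU if one is available, else stay on CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
layer.to(device)

# state_dict() / load_state_dict(): copy the weights into a second layer
clone = nn.Linear(2, 5)
clone.load_state_dict(layer.state_dict())
same_weights = torch.equal(clone.weight, layer.weight.cpu())
print(grad_present, grad_cleared, same_weights)
```

The same state_dict() dictionary is what you would typically pass to torch.save() when serializing a model.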

The whole list of available classes can be found in the documentation at http://pytorch.org/docs.

Now we should mention one very convenient class that allows you to combine other layers into a pipeline: Sequential. The best way to demonstrate Sequential is through an example:

>>> s = nn.Sequential(
...     nn.Linear(2, 5),
...     nn.ReLU(),
...     nn.Linear(5, 20),
...     nn.ReLU(),
...     nn.Linear(20, 10),
...     nn.Dropout(p=0.3),
...     nn.Softmax(dim=1))
>>> s
Sequential (
  (0): Linear (2 -> 5)
  (1): ReLU ()
  (2): Linear (5 -> 20)
  (3): ReLU ()
  (4): Linear (20 -> 10)
  (5): Dropout (p = 0.3)
  (6): Softmax ()
)

Here, we defined a three-layer NN with ReLU nonlinearities, dropout, and softmax on the output. The softmax is applied along dimension 1 (dimension 0 is the batch dimension). Let's push something through it:

>>> s(torch.FloatTensor([[1,2]]))
tensor([[ 0.1410,  0.1380,  0.0591,  0.1091,  0.1395,  0.0635,  0.0607,
          0.1033,  0.1397,  0.0460]])

So, our minibatch of one example successfully traversed the network!
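
To check what the softmax along dimension 1 means in practice, the sketch below pushes a minibatch of two samples through the same architecture and verifies that every output row sums to one. Two things here are assumptions beyond the text above: the call to eval(), an nn.Module method that switches dropout off so that repeated calls give identical results, and the fixed random seed, which only makes the run reproducible.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the random initial weights are reproducible

s = nn.Sequential(
    nn.Linear(2, 5), nn.ReLU(),
    nn.Linear(5, 20), nn.ReLU(),
    nn.Linear(20, 10),
    nn.Dropout(p=0.3),
    nn.Softmax(dim=1),
)

# A minibatch of two samples with two features each, shape (2, 2)
batch = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

s.eval()  # evaluation mode: dropout passes its input through unchanged
with torch.no_grad():  # no gradients are needed for a plain forward pass
    out_a = s(batch)
    out_b = s(batch)

row_sums = out_a.sum(dim=1)                # softmax over dim=1: each row sums to 1
deterministic = torch.equal(out_a, out_b)  # identical outputs with dropout disabled
print(tuple(out_a.shape), row_sums, deterministic)
```

In the default training mode, dropout randomly zeroes some inputs to the softmax, so two calls on the same batch will generally produce different probabilities.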
