
Building the architecture 

Each neural network consists of three sets of layers—input, hidden, and output. There is always one input and one output layer. If the neural network is deep, it has multiple hidden layers:
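In code, such a network is just a chain of weighted sums and activations. The following is a minimal NumPy sketch of a forward pass through one input, one hidden, and one output layer; the layer sizes and weight names are illustrative assumptions, not code from this book:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer sizes: 4 input features, 8 hidden units, 3 output units
input_size, hidden_size, output_size = 4, 8, 3
W_ih = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input -> hidden
W_ho = rng.normal(scale=0.1, size=(output_size, hidden_size))  # hidden -> output

x = rng.normal(size=input_size)   # one input example
hidden = np.tanh(W_ih @ x)        # hidden layer activation
output = W_ho @ hidden            # output layer scores
print(output.shape)               # (3,)
```

A deep network simply stacks more hidden layers between the input and output transformations.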

The difference between an RNN and a standard feedforward network comes from its cyclical hidden states, shown in the following diagram. Through them, data propagates from one time step to another, making each step dependent on the previous one:
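Before unfolding the diagram, it helps to see the cycle as code: the same cell runs at every time step, and the hidden state it produces is fed back in at the next step. This is a sketch assuming a plain tanh cell with biases omitted; the weight names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 8

W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden

inputs = rng.normal(size=(5, input_size))  # a sequence of five time steps
h = np.zeros(hidden_size)                  # initial hidden (memory) state

for x_t in inputs:
    # Feeding h back into the cell is the cyclical connection in the diagram:
    # each step depends on the state left behind by the previous one.
    h = np.tanh(W_xh @ x_t + W_hh @ h)
```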

A common practice is to unfold the preceding diagram to make it easier to follow. After rotating the illustration vertically and adding some notation and labels based on the example we picked earlier (generating a new chapter from The Hunger Games books), we end up with the following diagram:

This is an unfolded RNN with one hidden layer. The identical-looking sets of (input + hidden RNN unit + output) are actually the different time steps (or cycles) in the RNN. For example, the combination of x_{t-1} + RNN + y_{t-1} illustrates what is happening at time step t-1. At each time step, the following operations take place (see the code sketch after this list):

  1. The network encodes the word at the current time step (for example, t-1) using any of the word embedding techniques and produces a vector x_{t-1} (depending on the specific time step, the produced vector can also be x_t, x_{t+1}, and so on)
  2. Then, x_{t-1}, the encoded version of the input word I at time step t-1, is plugged into the RNN cell (located in the hidden layer). After several equations (not displayed here, but happening inside the RNN cell), the cell produces an output y_{t-1} and a memory state h_{t-1}. The memory state is the result of the input x_{t-1} and the previous value of that memory state, h_{t-2}. For the initial time step, one can assume that the previous memory state is a zero vector
  3. Producing the actual word (volunteer) at time step t-1 happens after decoding the output y_{t-1} using a text corpus specified at the beginning of the training
  4. Finally, the network moves multiple time steps forward until it reaches the final step, where it predicts the next word in the sequence
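The following NumPy sketch ties these four steps together for a single time step. It assumes a plain tanh RNN cell and uses a toy vocabulary with one-hot vectors in place of a trained word embedding; all names and sizes here are assumptions for illustration, not the book's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 3's text corpus: a toy vocabulary used both to encode input words
# and to decode output vectors back into words.
vocab = ["i", "volunteer", "as", "tribute"]
word_to_idx = {w: i for i, w in enumerate(vocab)}

vocab_size, hidden_size = len(vocab), 8
W_xh = rng.normal(scale=0.1, size=(hidden_size, vocab_size))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden
W_hy = rng.normal(scale=0.1, size=(vocab_size, hidden_size))   # hidden -> output

def one_hot(word):
    """Step 1: encode a word as a vector (one-hot stands in for an embedding)."""
    v = np.zeros(vocab_size)
    v[word_to_idx[word]] = 1.0
    return v

def rnn_step(x, h_prev):
    """Step 2: the cell combines the input with the previous memory state."""
    h = np.tanh(W_xh @ x + W_hh @ h_prev)   # new memory state
    y = W_hy @ h                            # output vector for this step
    return y, h

def decode(y):
    """Step 3: map the output vector back to a word from the corpus."""
    return vocab[int(np.argmax(y))]

h = np.zeros(hidden_size)   # the initial memory state is a zero vector
x = one_hot("i")            # the input word at time step t-1
y, h = rnn_step(x, h)
print(decode(y))            # the (untrained) prediction for this step
```

Step 4 is this same cell applied repeatedly: a loop carries h forward from one time step to the next until the final word is predicted.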

You can see how each one of the memory states {…, h_{t-1}, h_t, h_{t+1}, …} holds information about all the previous inputs. This makes RNNs very special and really good at predicting the next unit in a sequence.
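To see why, expand the memory state's recurrence one step (written here with the common vanilla-RNN update, where W_{xh} and W_{hh} are assumed to be the input and recurrent weight matrices):

```latex
h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1})
    = \tanh\big(W_{xh} x_t + W_{hh} \tanh(W_{xh} x_{t-1} + W_{hh} h_{t-2})\big)
```

Substituting repeatedly shows that h_t depends on x_t, x_{t-1}, and every earlier input, all the way back to the initial zero state. Let's now see what mathematical equations sit behind the preceding operations.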

Text corpus—an array of all words in the example vocabulary.