
Building the architecture 

Each neural network consists of three sets of layers—input, hidden, and output. There is always one input and one output layer. If the neural network is deep, it has multiple hidden layers:
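To make the three sets of layers concrete, here is a minimal sketch of a single forward pass through one input, one hidden, and one output layer, written in plain NumPy. The sizes (4 inputs, 8 hidden units, 3 outputs) and the tanh/softmax activations are illustrative assumptions, not values from the book's example:

    import numpy as np

    rng = np.random.default_rng(0)
    W_ih = rng.normal(size=(8, 4))   # input -> hidden weights
    W_ho = rng.normal(size=(3, 8))   # hidden -> output weights

    def forward(x):
        hidden = np.tanh(W_ih @ x)                    # hidden layer activation
        logits = W_ho @ hidden                        # output layer
        return np.exp(logits) / np.exp(logits).sum()  # softmax probabilities

    print(forward(rng.normal(size=4)))   # probabilities over the 3 outputs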

The difference between an RNN and a standard feedforward network lies in the cyclical hidden states. As seen in the following diagram, recurrent neural networks use cyclical hidden states; this way, data propagates from one time step to another, making each of these steps dependent on the previous one:
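In code, this cyclical dependency amounts to feeding the hidden state back into the cell at every step. The following is a minimal sketch assuming the common formulation s_t = tanh(U x_t + W s_{t-1}); the names U, W, and s and all the sizes are placeholder assumptions, and the actual equations are covered later in this chapter:

    import numpy as np

    rng = np.random.default_rng(1)
    U = rng.normal(size=(5, 3))   # input-to-hidden weights
    W = rng.normal(size=(5, 5))   # hidden-to-hidden weights (the cycle)

    s = np.zeros(5)                     # initial memory state
    for x in rng.normal(size=(4, 3)):   # four time steps of input vectors
        s = np.tanh(U @ x + W @ s)      # each step depends on the previous s
        print(s)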

A common practice is to unfold the preceding diagram for a clearer understanding. After rotating the illustration vertically and adding some notation and labels, using the example we picked earlier (generating a new chapter based on The Hunger Games books), we end up with the following diagram:

This is an unfolded RNN with one hidden layer. The identical-looking sets of (input + hidden RNN unit + output) are actually the different time steps (or cycles) in the RNN. For example, the combination of x_{t-1} + RNN + o_{t-1} illustrates what is happening at time step t-1. At each time step, the following operations take place (a code sketch of the full loop appears after the list):

  1. The network encodes the word at the current time step (for example, t-1) using any of the word embedding techniques and produces a vector x_{t-1} (the produced vector can be x_t or x_{t+1}, depending on the specific time step)
  2. Then, x_{t-1}, the encoded version of the input word I at time step t-1, is plugged into the RNN cell (located in the hidden layer). After several equations (not displayed here, but happening inside the RNN cell), the cell produces an output o_{t-1} and a memory state s_{t-1}. The memory state is the result of the input x_{t-1} and the previous value of that memory state, s_{t-2}. For the initial time step, one can assume that s_{t-2} is a zero vector
  3. Producing the actual word (volunteer) at time step t-1 happens after decoding the output o_{t-1} using a text corpus specified at the beginning of the training
  4. Finally, the network moves multiple time steps forward until reaching the final step, where it predicts the last word in the sequence
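
Here is a hedged, end-to-end sketch of steps 1 to 4 over a toy vocabulary. The vocabulary itself is a guess based on the chapter's example sentence, and the embedding table E and the weight matrices U, W, and V are random placeholders (a trained network would have learned them), so the predicted words are meaningless; the point is only the flow of encode, update the memory state, and decode:

    import numpy as np

    vocab = ["i", "volunteer", "as", "tribute"]   # toy text corpus
    word_to_idx = {w: i for i, w in enumerate(vocab)}

    rng = np.random.default_rng(2)
    E = rng.normal(size=(len(vocab), 4))   # word-embedding lookup table
    U = rng.normal(size=(6, 4))            # input -> hidden
    W = rng.normal(size=(6, 6))            # hidden -> hidden (memory)
    V = rng.normal(size=(len(vocab), 6))   # hidden -> output

    def step(word, s_prev):
        x = E[word_to_idx[word]]           # 1. encode the word as a vector x
        s = np.tanh(U @ x + W @ s_prev)    # 2. new memory state from x and s_prev...
        o = V @ s                          #    ...and the cell output o
        probs = np.exp(o) / np.exp(o).sum()
        return vocab[int(probs.argmax())], s   # 3. decode o back into a word

    s = np.zeros(6)                        # initial memory state is a zero vector
    word = "i"
    for _ in range(3):                     # 4. move several time steps forward
        word, s = step(word, s)
        print(word)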

You can see how each one of the memory states {…, s_{t-1}, s_t, s_{t+1}, …} holds information about all the previous inputs. This makes RNNs very special and really good at predicting the next unit in a sequence. Let's now see what mathematical equations sit behind the preceding operations.

Text corpus—an array of all words in the example vocabulary.