- Intelligent Projects Using Python
- Santanu Pattanayak
- 2021-07-02 14:10:44
Long short-term memory (LSTM) cells
The vanishing gradient problem is mitigated, to a great extent, by a modified version of RNNs built from long short-term memory (LSTM) cells.

[Figure: architectural diagram of a long short-term memory cell]

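LSTM introduces the cell state, $C_t$, in addition to the memory state, $h_t$, that you already saw when learning about RNNs. The cell state is regulated by three gates: the forget gate, the update gate, and the output gate. The forget gate determines how much information to retain from the previous cell state, $C_{t-1}$. In the standard formulation, where $x_t$ is the input at step $t$, $\sigma$ is the logistic sigmoid, and $W_f$, $U_f$, and $b_f$ are the gate's input weights, recurrent weights, and bias, its output is expressed as follows:

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$$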

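The output of the update gate, $i_t$, computed with its own parameters $W_i$, $U_i$, and $b_i$, is expressed as follows:

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$$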

The potential new candidate cell state, $\tilde{C}_t$, is expressed as follows:

$$\tilde{C}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$$

Based on the previous cell state and the current candidate cell state, the updated cell state is given by the following, where $\odot$ denotes element-wise multiplication:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

Not all of the information in the cell state is passed on to the next step; the output gate determines how much of the cell state is released. The output of the output gate, $o_t$, is given by the following:

$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$$

Based on the current cell state and the output gate, the updated memory state passed on to the next step is given by the following:

$$h_t = o_t \odot \tanh(C_t)$$
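The gate computations described in this section can be sketched as a single forward step of an LSTM cell. This is a minimal illustration rather than the book's implementation: the function names (`lstm_step`, `init_params`, `affine`), the parameter naming (`W_*` for input weights, `U_*` for recurrent weights, `b_*` for biases), and the use of bias terms are assumptions that follow the standard LSTM formulation. It uses plain Python lists so that it runs without third-party packages; in practice, a framework layer would be used instead.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid, which squashes a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, x, U, h, b):
    """Return W x + U h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * x_j for w, x_j in zip(W_row, x))
        + sum(u * h_j for u, h_j in zip(U_row, h))
        + b_k
        for W_row, U_row, b_k in zip(W, U, b)
    ]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step: returns the new memory state, cell state, and gates."""
    # Forget gate: how much of the previous cell state to retain.
    f_t = [sigmoid(z) for z in affine(p["W_f"], x_t, p["U_f"], h_prev, p["b_f"])]
    # Update gate: how much of the candidate cell state to admit.
    i_t = [sigmoid(z) for z in affine(p["W_i"], x_t, p["U_i"], h_prev, p["b_i"])]
    # Candidate cell state.
    c_tilde = [math.tanh(z) for z in affine(p["W_c"], x_t, p["U_c"], h_prev, p["b_c"])]
    # Updated cell state: element-wise blend of old state and candidate.
    c_t = [f * c + i * ct for f, c, i, ct in zip(f_t, c_prev, i_t, c_tilde)]
    # Output gate: how much of the cell state to release.
    o_t = [sigmoid(z) for z in affine(p["W_o"], x_t, p["U_o"], h_prev, p["b_o"])]
    # Updated memory state passed on to the next step.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t, {"f": f_t, "i": i_t, "o": o_t}


def init_params(input_dim, hidden_dim, seed=0):
    """Small random weights and zero biases for the three gates and the candidate."""
    rng = random.Random(seed)
    p = {}
    for g in ("f", "i", "c", "o"):
        p[f"W_{g}"] = [[rng.gauss(0, 0.1) for _ in range(input_dim)] for _ in range(hidden_dim)]
        p[f"U_{g}"] = [[rng.gauss(0, 0.1) for _ in range(hidden_dim)] for _ in range(hidden_dim)]
        p[f"b_{g}"] = [0.0] * hidden_dim
    return p


# Run a toy three-step sequence through the cell (3 input features, 4 hidden units).
params = init_params(input_dim=3, hidden_dim=4)
h, c = [0.0] * 4, [0.0] * 4
inputs = [[0.5, -1.0, 0.25], [1.0, 0.0, -0.5], [-0.3, 0.8, 0.1]]
for x in inputs:
    h, c, gates = lstm_step(x, h, c, params)
print("memory state:", h)
print("cell state:  ", c)
```

Note that the cell state is updated only through element-wise scaling and addition, while the memory state is a gated, squashed view of it. That additive path is what the gradient argument later in this section relies on.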

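Now comes the big question: how does LSTM avoid the vanishing gradient problem? The equivalent of the RNN gradient term $\frac{\partial h_t}{\partial h_k}$ in LSTM is given by $\frac{\partial C_t}{\partial C_k}$, which can be expressed in product form as follows:

$$\frac{\partial C_t}{\partial C_k} = \prod_{g=k+1}^{t} \frac{\partial C_g}{\partial C_{g-1}}$$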

Now, the recurrence in the cell state units is given by the following:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

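From this, treating the gate activations as constants with respect to $C_{t-1}$, we get the following for each cell state unit:

$$\frac{\partial C_t}{\partial C_{t-1}} = f_t$$

As a result, the gradient expression, $\frac{\partial C_t}{\partial C_k}$, becomes the following:

$$\frac{\partial C_t}{\partial C_k} = \prod_{g=k+1}^{t} f_g$$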

As you can see, if the forget gate activations stay near one, the gradient flows through the cell state almost unattenuated, and the LSTM largely avoids the vanishing gradient problem.
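To get a feel for the numbers, the following sketch computes the cell state gradient for a single unit as a product of forget gate activations over a number of time steps. It assumes, for illustration only, that the forget gate holds the same value at every step and that the gates are treated as constants; the function name `cell_state_gradient` is hypothetical.

```python
import math


def cell_state_gradient(forget_gates):
    """Product of per-step forget gate activations for one cell state unit."""
    return math.prod(forget_gates)


steps = 100
for f in (0.5, 0.9, 0.99, 1.0):
    grad = cell_state_gradient([f] * steps)
    print(f"forget gate = {f:<4}  gradient after {steps} steps = {grad:.3e}")
```

With the forget gate at 0.5, the gradient collapses to a negligible value within 100 steps, whereas at 0.99 a substantial fraction of it survives, and at exactly 1.0 it passes through unchanged.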
Most of the text-processing applications that we will look at in this book will use the LSTM version of RNNs.