- Python Deep Learning
- Ivan Vasilev, Daniel Slater, Gianmario Spacagna, Peter Roelants, Valentino Zocca
A brief history of contemporary deep learning
In addition to the aforementioned models, the first edition of this book included networks such as Restricted Boltzmann Machines (RBMs) and Deep Belief Networks (DBNs). They were popularized by Geoffrey Hinton, a Canadian scientist and one of the most prominent deep learning researchers. Back in 1986, he was also one of the inventors of backpropagation. RBMs are a special type of generative neural network, where the neurons are organized into two layers, namely, visible and hidden. Unlike feed-forward networks, the data in an RBM can flow in both directions – from visible to hidden units, and vice versa. In 2002, Prof. Hinton introduced contrastive divergence, an unsupervised algorithm for training RBMs. And in 2006, he introduced Deep Belief Networks, which are deep neural networks formed by stacking multiple RBMs. Thanks to their novel training algorithm, it was possible to create DBNs with more hidden layers than ever before. To understand why this mattered, we should explain why it was so difficult to train deep neural networks prior to that. In the past, the activation function of choice was the logistic sigmoid, shown in the following chart:

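For reference, the sigmoid in the chart is defined as σ(x) = 1/(1 + e^(-x)). The following is a minimal NumPy sketch of it (illustrative only, not the book's code):

```python
# Illustrative sketch of the logistic sigmoid (not the book's code).
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: squashes any real input into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-x))

# The function saturates quickly away from 0:
for x in (-6.0, -2.0, 0.0, 2.0, 6.0):
    print(f"sigmoid({x:+.1f}) = {sigmoid(x):.4f}")
```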
We now know that, to train a neural network, we need to compute the derivative of the activation function (along with all the other derivatives). The sigmoid derivative has a significant value only in a narrow interval centered around 0, and it converges towards 0 everywhere else. In networks with many layers, it's highly likely that the derivative will converge to 0 when propagated back to the first layers of the network. Effectively, this means we cannot update the weights in these layers. This is the famous vanishing gradients problem, which (along with a few other issues) prevented the training of deep networks. By stacking pre-trained RBMs, DBNs were able to alleviate (but not solve) this problem.
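To make the effect concrete, here is a small numerical sketch (the NumPy setup and the chain of identical sigmoid layers are illustrative assumptions, not the book's code) of how the gradient shrinks as it is propagated backwards through many sigmoid layers:

```python
# Illustrative sketch of the vanishing gradient effect (assumed setup:
# a chain of sigmoid activations, ignoring the weight matrices).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)   # peaks at 0.25 when x = 0, tends to 0 elsewhere

# By the chain rule, the gradient reaching the first layer is (roughly) a
# product of per-layer derivatives, so even the best case of 0.25 shrinks fast:
grad = 1.0
for layer in range(1, 11):
    grad *= sigmoid_derivative(0.0)    # 0.25, the maximum possible value
    print(f"after {layer:2d} layers: {grad:.2e}")
```

Even in the best case, where every pre-activation is exactly 0, the gradient is multiplied by 0.25 per layer, so after 10 layers it is already smaller than 10⁻⁶.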
But training a DBN is not easy. Let's look at the following steps:
- First, we have to train each RBM with contrastive divergence, and gradually stack them on top of each other. This phase is called pre-training.
- In effect, pre-training serves as a sophisticated weight initialization algorithm for the next phase, called fine-tuning. With fine-tuning, we transform the DBN into a regular multi-layer perceptron and continue training it with supervised backpropagation, in the same way we saw in Chapter 2, Neural Networks (a minimal sketch of both phases follows this list).
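The following is a minimal NumPy sketch of the two phases (the RBM class, layer sizes, and CD-1 update are illustrative assumptions, not the book's code; the supervised fine-tuning step is only indicated):

```python
# A minimal sketch of the two-phase DBN procedure described above.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Restricted Boltzmann Machine trained with one step of contrastive divergence (CD-1)."""
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = rng.normal(0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible bias
        self.b_h = np.zeros(n_hidden)    # hidden bias
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0):
        # Positive phase: visible -> hidden
        h0 = self.hidden_probs(v0)
        # Negative phase: sample hidden units, reconstruct visible, re-infer hidden
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        # Approximate gradient and parameter update
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / v0.shape[0]
        self.b_v += self.lr * (v0 - v1).mean(axis=0)
        self.b_h += self.lr * (h0 - h1).mean(axis=0)

# Phase 1: pre-training -- train each RBM on the output of the previous one.
layer_sizes = [784, 256, 64]           # assumed layer sizes
data = rng.random((100, 784))          # stand-in for real training data
rbms, inputs = [], data
for n_vis, n_hid in zip(layer_sizes[:-1], layer_sizes[1:]):
    rbm = RBM(n_vis, n_hid)
    for _ in range(10):                # a few CD-1 passes per layer
        rbm.cd1_step(inputs)
    rbms.append(rbm)
    inputs = rbm.hidden_probs(inputs)  # feed activations to the next RBM

# Phase 2: fine-tuning -- the stacked weights would now initialize a regular
# MLP, which is then trained with supervised backpropagation (omitted here).
```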
However, thanks to some algorithmic advances, it's now possible to train deep networks using plain old backpropagation, thus effectively eliminating the pre-training phase. We will discuss these improvements in the coming sections, but for now, let's just say that they rendered DBNs and RBMs obsolete. DBNs and RBMs are, without a doubt, interesting from a research perspective, but are rarely used in practice anymore. Because of this, we will omit them from this edition.