- Python Deep Learning
- Ivan Vasilev Daniel Slater Gianmario Spacagna Peter Roelants Valentino Zocca
A brief history of contemporary deep learning
In addition to the aforementioned models, the first edition of this book included networks such as Restricted Boltzmann Machines (RBMs) and Deep Belief Networks (DBNs). They were popularized by Geoffrey Hinton, a Canadian scientist and one of the most prominent deep learning researchers. Back in 1986, he was also one of the inventors of backpropagation. RBMs are a special type of generative neural network, where the neurons are organized into two layers: visible and hidden. Unlike feed-forward networks, the data in an RBM can flow in both directions – from visible to hidden units, and vice versa. In 2002, Prof. Hinton introduced contrastive divergence, an unsupervised algorithm for training RBMs. And in 2006, he introduced DBNs, which are deep neural networks formed by stacking multiple RBMs. Thanks to their novel training algorithm, it became possible to create DBNs with more hidden layers than had previously been feasible. To understand why this mattered, we should explain why it was so difficult to train deep neural networks before then. At the time, the activation function of choice was the logistic sigmoid, σ(x) = 1 / (1 + exp(-x)), shown in the following chart:

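The original chart isn't reproduced here, but the sigmoid is easy to plot. The following snippet is a quick sketch (not the book's figure) that draws the function with NumPy and matplotlib:

```python
# Plot the logistic sigmoid over a symmetric range around 0.
# Illustrative sketch, not the book's original chart.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-10, 10, 200)
sigmoid = 1.0 / (1.0 + np.exp(-x))  # sigma(x) = 1 / (1 + e^-x)

plt.plot(x, sigmoid)
plt.xlabel('x')
plt.ylabel('sigmoid(x)')
plt.title('Logistic sigmoid activation function')
plt.show()
```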
We now know that, to train a neural network, we need to compute the derivative of the activation function (along with all the other derivatives). The sigmoid derivative only takes significant values in a narrow interval centered around 0 and converges towards 0 everywhere else. In networks with many layers, it's highly likely that the gradient will shrink towards 0 by the time it's propagated back to the first layers of the network. Effectively, this means we cannot update the weights in those layers. This is the famous vanishing gradients problem, which (along with a few other issues) prevented the training of deep networks. By stacking pre-trained RBMs, DBNs were able to alleviate (but not solve) this problem.
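To see how small those derivatives get, here is a short numeric sketch (not from the book). It ignores the weight terms in the chain rule and only multiplies the activation derivatives, which already shrink geometrically with depth:

```python
# Illustrate why the sigmoid derivative leads to vanishing gradients.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25 when x = 0

print(sigmoid_prime(0.0))   # 0.25 -- the best case
print(sigmoid_prime(5.0))   # ~0.0066 -- almost zero away from the center

# Even if every layer hits the 0.25 peak, the backpropagated factor
# after n layers is at most 0.25 ** n:
for n in (5, 10, 20):
    print(n, 0.25 ** n)     # 20 layers -> ~9.1e-13, effectively no gradient
```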
But training a DBN is not easy. Let's look at the following steps:
- First, we have to train each RBM with contrastive divergence and gradually stack them on top of each other. This phase is called pre-training.
- In effect, pre-training serves as a sophisticated weight initialization algorithm for the next phase, called fine-tuning. With fine-tuning, we transform the DBN into a regular multi-layer perceptron and continue training it using supervised backpropagation, in the same way we saw in Chapter 2, Neural Networks. A rough sketch of this two-phase recipe follows the list.
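As a rough illustration of the two phases, here is a minimal sketch using scikit-learn. It relies on BernoulliRBM, which is trained with persistent contrastive divergence (a variant of the CD algorithm mentioned above), and it approximates fine-tuning by fitting a logistic regression on top of the pre-trained features instead of backpropagating through the whole stack. The names, layer sizes, and toy data are arbitrary:

```python
# A minimal, illustrative DBN-style pipeline: greedy layer-wise pre-training
# of two RBMs, followed by a supervised classifier on top of their features.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.RandomState(0)
X = rng.rand(500, 64)           # toy inputs scaled to [0, 1] (e.g. flattened 8x8 images)
y = rng.randint(0, 10, 500)     # toy class labels

dbn_like = Pipeline([
    # Pre-training: each RBM is fit on the output of the previous one.
    ('rbm1', BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=10, random_state=0)),
    ('rbm2', BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=10, random_state=0)),
    # "Fine-tuning" stand-in: a supervised layer trained on the RBM features.
    ('clf', LogisticRegression(max_iter=1000)),
])

dbn_like.fit(X, y)
print(dbn_like.score(X, y))     # training accuracy on the toy data
```

In a full DBN, the RBM weights would also be updated during the supervised phase; in this simplified sketch they stay frozen after pre-training.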
However, thanks to some algorithmic advances, it's now possible to train deep networks using plain old backpropagation, thus effectively eliminating the pre-training phase. We will discuss these improvements in the coming sections, but for now, let's just say that they rendered DBNs and RBMs obsolete. DBNs and RBMs are, without a doubt, interesting from a research perspective, but are rarely used in practice anymore. Because of this, we will omit them from this edition.