官术网_书友最值得收藏!

  • Python Deep Learning
  • Ivan Vasilev Daniel Slater Gianmario Spacagna Peter Roelants Valentino Zocca
  • 521字
  • 2021-07-02 14:31:07

Introduction to deep learning

In 2012, Alex Krizhevsky, Ilya Sutskever, and Geoff Hinton published a milestone paper titled ImageNet Classification with Deep Convolutional Neural Networks https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf. The paper describes their use of neural networks to win the ImageNet competition of the same year, which we mentioned in Chapter 2Neural Networks. At the end of their paper, they wrote the following:

"It is notable that our network's performance degrades if a single convolutional layer is removed. For example, removing any of the middle layers results in a loss of about 2% for the top-1 performance of the network. So the depth really is important for achieving our results."

They clearly mention the importance of the number of hidden layers present in deep networks. Krizheysky, Sutskever, and Hilton talk about convolutional layers, which we will not discuss until Chapter 4, Computer Vision With Convolutional Networks, but the basic question remains: what do those hidden layers do?

A typical English saying is a picture is worth a thousand words. Let's use this approach to understand what deep learning is. We'll use images from the highly-cited paper Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations (https://ai.stanford.edu/~ang/papers/icml09-ConvolutionalDeepBeliefNetworks.pdf). In Proceedings of the International Conference on Machine Learning (ICML) (2009) by H. Lee, R. Grosse, R. Ranganath, and A. Ng, the authors train a neural network with pictures of different categories of either objects or animals. In the following screenshot, we can see how the different layers of the network learn different characteristics of the input data. In the first layer, the network learns to detect some small basic features such as lines and edges, which are common for all images in all categories:

The first layer weights (top) and the second layer weights (bottom) after training

But in the next layers, which we can see in the preceding screenshot, it combines those lines and edges to compose more complex features that are specific for each category. In the first row of the bottom-left image, we can see how the network can detect different features of human faces such as eyes, noses, and mouths. In the case of cars, these would be wheels, doors, and so on, as seen in the second image from the left in the following image. These features are abstract, that is, the network has learned the generic shape of a feature (such as a mouth or a nose) and can detect this feature in the input data, despite the variations it might have:

Columns 1-4 represent the second layer (top) and third layer (bottom) weights learned for a specific object category (class). Column 5 represents the weights learned for a mixture of four object categories (faces, cars, airplanes, and motobikes)

In the second row of the preceding image, we can see how, in the deeper layers, the network combines these features in even more complex ones, such as faces and whole cars. A strength of deep neural networks is that they can learn these high-level abstract representations by themselves, deducting them from the training data.

主站蜘蛛池模板: 康马县| 团风县| 布拖县| 固阳县| 寿阳县| 漳平市| 博客| 汉沽区| 屏南县| 辽源市| 内丘县| 都江堰市| 龙里县| 金山区| 东港市| 屯门区| 枣庄市| 达拉特旗| 雅江县| 大连市| 南充市| 尼勒克县| 岳阳市| 连云港市| 渭源县| 穆棱市| 赫章县| 左云县| 湖南省| 新宾| 盈江县| 建昌县| 霍州市| 松溪县| 保康县| 上林县| 江阴市| 文登市| 平度市| 龙泉市| 象山县|