
Convolutional neural networks

CNNs, or ConvNets, are quite similar to regular neural networks. They are still made up of neurons whose weights are learned from data. Each neuron receives some inputs and performs a dot product, the networks still have a loss function on the last fully connected layer, and they can still use a nonlinearity function. All of the tips and techniques that we learned in the last chapter remain valid for CNNs.

As we saw in the previous chapter, a regular neural network receives its input as a single vector and passes it through a series of hidden layers. Every hidden layer consists of a set of neurons, where each neuron is fully connected to all the neurons in the previous layer. Within a single layer, the neurons are completely independent and do not share any connections. The last fully connected layer, also called the output layer, contains the class scores in the case of an image classification problem.

Generally, a simple ConvNet has three main layer types: the convolution layer, the pooling layer, and the fully connected layer. We can see a simple neural network in the following image:

A regular three-layer neural network
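To make the figure concrete, here is a minimal sketch of the forward pass of such a fully connected network in NumPy. The layer sizes (a flattened 28x28 image in, 10 class scores out) are illustrative assumptions, not values from the figure:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Nonlinearity applied elementwise after each hidden layer
    return np.maximum(0.0, x)

def forward(x, params):
    # Each layer is fully connected: every neuron sees every
    # neuron (or input component) of the previous layer.
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = relu(x @ W1 + b1)   # first hidden layer
    h2 = relu(h1 @ W2 + b2)  # second hidden layer
    return h2 @ W3 + b3      # output layer: raw class scores

# Illustrative layer sizes: 784 inputs -> 128 -> 64 -> 10 classes
sizes = [784, 128, 64, 10]
params = [(rng.standard_normal((m, n)) * 0.01, np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

x = rng.standard_normal((1, 784))  # one image, flattened to a vector
scores = forward(x, params)
print(scores.shape)  # (1, 10): one score per class
```

Note that the image must first be flattened into a vector; this loss of spatial structure is exactly what the next section addresses.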

So, what changes? Since a CNN mostly takes images as input, we can encode a few image-specific properties into the architecture, which greatly reduces the number of parameters.

In the case of real-world image data, CNNs perform better than Multi-Layer Perceptrons (MLPs). There are two reasons for this:

  • In the last chapter, we saw that in order to feed an image to an MLP, we convert the input matrix into a simple numeric vector with no spatial structure. The MLP has no knowledge that these numbers were spatially arranged. CNNs were built for this very reason: to exploit the spatial structure of multidimensional data. Unlike MLPs, CNNs take advantage of the fact that image pixels that are close to each other are more strongly related than pixels that are further apart:
    CNN = Input layer + hidden layer + fully connected layer
  • CNNs differ from MLPs in the types of hidden layers that can be included in the model. A ConvNet arranges its neurons in three dimensions: width, height, and depth. Each layer transforms its 3D input volume into a 3D output volume of neurons using activation functions. For example, in the following figure, the red input layer holds the image. Thus its width and height are the dimensions of the image, and the depth is three since there are Red, Green, and Blue channels:
ConvNets are deep neural networks that share their parameters across space.
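The two ideas above, 3D volumes and parameter sharing across space, can be sketched with a single convolution filter in NumPy. The input shape (32x32x3) and filter size (5x5) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Input volume: height x width x depth (depth = 3 for R, G, B)
image = rng.standard_normal((32, 32, 3))

# One 5x5 filter spanning the full input depth. The SAME weights
# are applied at every spatial position: this is parameter sharing.
kernel = rng.standard_normal((5, 5, 3))
bias = 0.0

def conv2d_single(img, k, b):
    # Naive convolution, stride 1, no padding
    kh, kw, _ = k.shape
    oh = img.shape[0] - kh + 1
    ow = img.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Dot product between the filter and a local patch
            out[i, j] = np.sum(img[i:i+kh, j:j+kw, :] * k) + b
    return out

activation_map = conv2d_single(image, kernel, bias)
print(activation_map.shape)  # (28, 28): one 2D slice of the output volume

# The filter has 5*5*3 + 1 = 76 parameters regardless of image size;
# a fully connected neuron looking at the whole 32x32x3 input would
# need 32*32*3 + 1 = 3073 weights.
print(kernel.size + 1)  # 76
```

Stacking several such filters gives the depth dimension of the output volume: each filter produces one 2D activation map, and the maps are stacked along depth.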