
Convolutional neural networks

Sight is hands-down our most-used sense. You are using it right now! Of course, it was something researchers attempted to mimic with neural networks early on, except that nothing really worked well until the concept of convolution was applied and used to classify images. Convolution is the idea of detecting, grouping, and isolating common features in an image. For instance, if you cover up three-quarters of a picture of a familiar object and show it to someone, they will almost certainly still recognize it from just the partial features. Convolution works the same way, by breaking an image down and isolating its features for later recognition.
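
To make the idea concrete, here is a minimal sketch of a single 2D convolution written directly in NumPy. The image and kernel values are made up purely for illustration; the Keras layers we use shortly do all of this work for us:

import numpy as np

# A tiny 5 x 5 "image": a bright vertical bar on a dark background
image = np.array([
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
], dtype=np.float32)

# A 3 x 3 vertical-edge kernel; convolution slides this over the image
# (deep-learning "convolution" is technically cross-correlation: no kernel flip)
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=np.float32)

# Valid (no padding) convolution of a 5 x 5 image with a 3 x 3 kernel gives 3 x 3
out = np.zeros((3, 3), dtype=np.float32)
for i in range(3):
    for j in range(3):
        out[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)

print(out)  # strong positive/negative responses mark the bar's vertical edges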

Convolution works by dissecting an image into its feature parts, which makes it easier to train a network. Let's jump into a code sample that extends from where we left off in the previous chapter but that now introduces convolution. Open up the Chapter_2_1.py listing and follow these steps:

  1. Take a look at the first few lines, which handle the imports:
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K
  2. In this example, we import new layer types: Conv2D, MaxPooling2D, and UpSampling2D.
  3. Then we set the Input and build up the encoded and decoded network sections using the following code:
input_img = Input(shape=(28, 28, 1)) # adapt this if using `channels_first` image data format

x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x) # no 'same' padding here: 16 x 16 shrinks to 14 x 14, which up-samples to the final 28 x 28
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
  4. The first thing to note is that we are now preserving the dimensions of the image, in this case 28 x 28 pixels with 1 layer or channel. This example uses a grayscale image, so there is only a single color channel. This is vastly different from before, when we just unraveled the image into a single 784-dimension vector.

The second thing to note is the use of the Conv2D, or two-dimensional convolutional, layer and the MaxPooling2D and UpSampling2D layers that follow it. Pooling layers condense, or down-sample, features, while up-sampling layers conversely expand them again. Note how we use pooling or down-sampling layers after each convolution while the image is being encoded, and up-sampling layers while it is being decoded; the sketch below traces this.
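
If you are curious how these layers change the tensor dimensions, here is a minimal sketch that traces a single image through one convolution, pooling, and up-sampling stage. The layer sizes are copied from the model above; the shapes in the comments are what Keras reports:

from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D

img = Input(shape=(28, 28, 1))  # one 28 x 28 grayscale image
c = Conv2D(16, (3, 3), activation='relu', padding='same')(img)
print(c.shape)  # (None, 28, 28, 16) - 'same' padding preserves width and height
p = MaxPooling2D((2, 2), padding='same')(c)
print(p.shape)  # (None, 14, 14, 16) - pooling halves each spatial dimension
u = UpSampling2D((2, 2))(p)
print(u.shape)  # (None, 28, 28, 16) - up-sampling doubles them back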

  5. Next, we build and train the model with the following block of code:
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

from tensorflow.keras.datasets import mnist
import numpy as np

(x_train, _), (x_test, _) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))

from tensorflow.keras.callbacks import TensorBoard

autoencoder.fit(x_train, x_train,
                epochs=50,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test, x_test),
                callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])

decoded_imgs = autoencoder.predict(x_test)
  6. The training of the model in the preceding code mirrors what we did at the end of the previous chapter, but note how the training and testing sets are prepared now. We no longer flatten the image; instead, we preserve its spatial properties as input into the convolutional layer, as the sketch that follows shows.
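As a quick check, offered here as a sketch rather than part of the original listing, you can build a second Model that stops at the encoded tensor and confirm that the bottleneck keeps a spatial layout rather than a flat vector:

encoder = Model(input_img, encoded)  # reuses the encoder layers defined in step 3
encoded_imgs = encoder.predict(x_test)
print(x_test.shape)        # (10000, 28, 28, 1) - spatial input preserved
print(encoded_imgs.shape)  # (10000, 4, 4, 8)  - compressed, but still spatial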
  7. Finally, we output the results with the following code:
import matplotlib.pyplot as plt

n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
    # original image on the top row (subplot indices are 1-based)
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    # reconstructed image on the bottom row
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
  8. Run the code, as you have before, and you'll immediately notice that it is about 100 times slower to train. This may or may not require you to wait, depending on your machine; if it does, go get a beverage or three and perhaps a meal.
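
If the wait becomes painful, one practical check, offered as a suggestion rather than part of the original listing, is whether TensorFlow can actually see a GPU; it silently falls back to the CPU when it cannot:

import tensorflow as tf

# An empty list here means TensorFlow is running on the CPU only, which
# largely explains the long training times for convolutional models.
print(tf.config.list_physical_devices('GPU'))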

Training our simple sample now takes considerably longer, which may be quite noticeable on older hardware. In the next section, we look at how we can start to monitor training sessions in greater detail.
