Python Deep Learning
Ivan Vasilev, Daniel Slater, Gianmario Spacagna, Peter Roelants, Valentino Zocca
Using Keras to classify handwritten digits
In this section, we'll use Keras to classify the images of the MNIST dataset, which consists of 70,000 examples of handwritten digits written by different people. The first 60,000 are typically used for training and the remaining 10,000 for testing.

One of the advantages of Keras is that it can import this dataset for you: there's no need to download it from the web explicitly, as Keras fetches it automatically the first time it's needed:
- Our first step will be to download the datasets using Keras:
from keras.datasets import mnist
- Then, we need to import a few classes to use a feed-forward network:
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.utils import np_utils
- Next, we'll load the training and testing data. (X_train, Y_train) are the training images and labels, and (X_test, Y_test) are the test images and labels:
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
- We need to modify the data to be able to use it. X_train contains 60,000 28 x 28 pixel images, and X_test contains 10,000. To feed them to the network as inputs, we want to reshape each sample as a 784-pixel long array, rather than a (28,28) two-dimensional matrix. We can accomplish this with these two lines:
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
- The labels indicate the value of the digit depicted in the images. We want to convert each of them into a 10-entry one-hot encoded vector, comprised of zeroes and a single 1 in the entry corresponding to the digit. For example, 4 is mapped to [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]. Correspondingly, our network will have 10 output neurons:
classes = 10
Y_train = np_utils.to_categorical(Y_train, classes)
Y_test = np_utils.to_categorical(Y_test, classes)
- Before calling our main function, we need to set the size of the input layer (the size of the MNIST images), the number of hidden neurons, the number of epochs to train the network, and the mini-batch size:
input_size = 784
batch_size = 100
hidden_neurons = 100
epochs = 100
- We are ready to define our network. In this case, we'll use the Sequential model, where the output of each layer serves as the input to the next. In Keras, Dense denotes a fully-connected layer. We'll use a network with one hidden layer, sigmoid activation, and softmax output:
model = Sequential([
Dense(hidden_neurons, input_dim=input_size),
Activation('sigmoid'),
Dense(classes),
Activation('softmax')
])
- Keras provides a simple way to specify the cost function (the loss) and its optimization method; in this case, we'll use cross-entropy and stochastic gradient descent. We'll use the default values for the learning rate, momentum, and so on:
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='sgd')
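If you want to control those values explicitly, you can pass an optimizer instance instead of the 'sgd' string. The following is a minimal sketch, assuming the classic Keras optimizer API, where the learning rate argument is called lr (newer versions use learning_rate); the values shown are simply the library defaults spelled out:
from keras.optimizers import SGD

# equivalent to optimizer='sgd', but with the defaults written out explicitly
sgd = SGD(lr=0.01, momentum=0.0)
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer=sgd)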
Softmax and cross-entropy
In the Logistic regression section of Chapter 2, Neural Networks, we learned how to apply regression to binary classification (two classes) problems. The softmax function is a generalization of this concept for multiple classes. Let's look at the following formula:
F(x_i) = e^(x_i) / Σ_j e^(x_j)
Here, i, j = 0, 1, 2, ... n and x_i represents each of the n arbitrary real values, corresponding to n mutually exclusive classes. The softmax "squashes" the input values into the (0, 1) interval, similar to the logistic function. But it has the additional property that the sum of all the squashed outputs adds up to 1. We can interpret the softmax outputs as a normalized probability distribution over the classes. It then makes sense to use a loss function that compares the difference between the estimated class probabilities and the actual class distribution (this difference is known as cross-entropy). As we mentioned in step 5 of this section, the actual distribution is usually a one-hot-encoded vector, where the real class has a probability of 1, and all others have a probability of 0. The loss function that does this is called cross-entropy loss:
H(p, q) = -Σ_i p_i(x) log(q_i(x))
Here, q_i(x) is the estimated probability that the output belongs to class i (out of n total classes) and p_i(x) is the actual probability. When we use one-hot-encoded target values for p_i(x), only the target class has a non-zero value (1) and all others are zeros. In this case, the cross-entropy loss will only capture the error on the target class and will discard all other errors. For the sake of simplicity, we'll assume that we apply the formula over a single training sample.
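To make the two formulas concrete, here is a small NumPy sketch (not part of the book's code; the values are arbitrary) that computes the softmax of a vector of scores and the cross-entropy loss against a one-hot target:
import numpy as np

def softmax(x):
    # subtracting the maximum improves numerical stability without changing the result
    e = np.exp(x - np.max(x))
    return e / e.sum()

scores = np.array([1.0, 2.0, 0.5])  # arbitrary network outputs for 3 classes
q = softmax(scores)                 # estimated probabilities; they sum to 1
p = np.array([0.0, 1.0, 0.0])       # one-hot target: the real class is class 1

# cross-entropy: -sum(p_i * log(q_i)); only the target class contributes
loss = -np.sum(p * np.log(q))
print(q, loss)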
- We are ready to train the network. In Keras, we can do this in a simple way, with the fit method:
model.fit(X_train, Y_train, batch_size=batch_size, epochs=epochs, verbose=1)
- All that's left to do is to add code to evaluate the network accuracy on the test data:
score = model.evaluate(X_test, Y_test, verbose=1)
print('Test accuracy:', score[1])
And that's it. The test accuracy will be about 96%, which is not a great result, but this example runs in less than 30 seconds on a CPU. We can make some simple improvements, such as using a larger number of hidden neurons or a higher number of epochs. We'll leave those experiments to you, to familiarize yourself with the code (a possible starting point is sketched at the end of this section).
- To see what the network has learned, we can visualize the weights of the hidden layer. The following code allows us to obtain them:
weights = model.layers[0].get_weights()  # a list: [weight matrix of shape (784, hidden_neurons), bias vector]
- To visualize them, we'll reshape the weight vector of each hidden neuron into a 28 x 28 two-dimensional array, matching the layout of the input images:
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy

fig = plt.figure()
# transpose the (784, hidden_neurons) weight matrix so that each row
# holds the 784 incoming weights of one hidden neuron
w = weights[0].T
for neuron in range(hidden_neurons):
    ax = fig.add_subplot(10, 10, neuron + 1)
    ax.axis("off")
    ax.imshow(numpy.reshape(w[neuron], (28, 28)), cmap=cm.Greys_r)
plt.savefig("neuron_images.png", dpi=300)
plt.show()
- And we can see the result in the following image:

For simplicity, we've aggregated the images of all the neurons into a single composite figure. Clearly, since the initial images are very small and don't have a lot of detail (they are just digits), the features learned by the hidden neurons are not all that interesting. But it's already clear that each neuron is learning a different shape.
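Finally, here is one possible version of the experiments mentioned earlier; it's only a sketch, and the specific value (a wider hidden layer of 200 neurons) is arbitrary rather than taken from the book:
# a hypothetical variation: a wider hidden layer, everything else unchanged
hidden_neurons = 200
model = Sequential([
    Dense(hidden_neurons, input_dim=input_size),
    Activation('sigmoid'),
    Dense(classes),
    Activation('softmax')
])
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='sgd')
model.fit(X_train, Y_train, batch_size=batch_size, epochs=epochs, verbose=1)
print('Test accuracy:', model.evaluate(X_test, Y_test, verbose=1)[1])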