There's more...

In the previous section, we built a regression formula (Y = a*x + b) where we wrote a function to identify the optimal values of a and b. In this section, we will build a simple neural network with a hidden layer that connects the input to the output on the same toy dataset that we worked on in the previous section.

We define the model as follows (the code file is available as Neural_networks_multiple_layers.ipynb on GitHub):

  • The input is connected to a hidden layer that has three units
  • The hidden layer is connected to the output layer, which has one unit

Let us go ahead and code up the strategy discussed above, as follows:

  1. Define the dataset and import the relevant packages:
from copy import deepcopy
import numpy as np

x = [[1],[2],[3],[4]]
y = [[2],[4],[6],[8]]

We use deepcopy so that, when we later modify a copy of the weights, the values of the original variable remain unchanged.
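A minimal illustration of the difference between aliasing and deep copying (not part of the original notebook; the variable names are arbitrary):

```python
from copy import deepcopy

a = [[1], [2]]
shallow = a            # both names point to the same nested lists
deep = deepcopy(a)     # fully independent copy

shallow[0][0] = 99
print(a[0][0])         # 99 -- the original changed through the alias
print(deep[0][0])      # 1  -- the deep copy is unaffected
```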

  2. Initialize the weight and bias values randomly. The hidden layer has three units. Hence, there are a total of three weight values and three bias values, one corresponding to each of the hidden units.

Additionally, the final layer has one unit that is connected to the three units of the hidden layer. Hence, a total of three weights and one bias dictate the value of the output layer.

The randomly-initialized weights are as follows:

w = [np.array([[-0.82203424, -0.9185806, 0.03494298]]),
     np.array([0., 0., 0.]),
     np.array([[1.0692896], [0.62761235], [-0.5426246]]),
     np.array([0.])]
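The hard-coded values above are simply one possible random draw. A sketch of how weights of the right shapes could be generated (the use of np.random.randn and the seed are assumptions; the original text does not specify how the values were sampled):

```python
import numpy as np

np.random.seed(0)  # for reproducibility
w = [
    np.random.randn(1, 3),   # input -> hidden weights (1 input, 3 hidden units)
    np.zeros(3),             # hidden-layer biases, initialized to zero
    np.random.randn(3, 1),   # hidden -> output weights (3 hidden units, 1 output)
    np.zeros(1),             # output bias
]
```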
  3. Implement the feed-forward network, where the hidden layer has a ReLU activation:
def feed_forward(inputs, outputs, weights):
    pre_hidden = np.dot(inputs, weights[0]) + weights[1]
    hidden = np.where(pre_hidden < 0, 0, pre_hidden)  # ReLU activation
    out = np.dot(hidden, weights[2]) + weights[3]
    squared_error = np.square(out - outputs)
    return squared_error
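As a quick sanity check, the forward pass can be run on the toy dataset (a self-contained sketch that repeats the definitions from this section; only the shape and sign of the result are asserted, not its values):

```python
import numpy as np

x = [[1], [2], [3], [4]]
y = [[2], [4], [6], [8]]

w = [np.array([[-0.82203424, -0.9185806, 0.03494298]]),
     np.array([0., 0., 0.]),
     np.array([[1.0692896], [0.62761235], [-0.5426246]]),
     np.array([0.])]

def feed_forward(inputs, outputs, weights):
    pre_hidden = np.dot(inputs, weights[0]) + weights[1]
    hidden = np.where(pre_hidden < 0, 0, pre_hidden)  # ReLU activation
    out = np.dot(hidden, weights[2]) + weights[3]
    return np.square(out - outputs)

squared_error = feed_forward(x, y, w)
print(squared_error.shape)  # one squared error per sample: (4, 1)
```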
  4. Define the back-propagation function, similar to what we did in the previous section. The only difference is that we now have to update the weights across multiple layers.

In the following code, we are calculating the original loss at the start of an epoch:

def update_weights(inputs, outputs, weights, epochs):
    for epoch in range(epochs):
        org_loss = feed_forward(inputs, outputs, weights)

In the following code, we are copying the weights into two sets of weight variables so that they can be reused in later code:

        wts_tmp = deepcopy(weights)
        wts_tmp2 = deepcopy(weights)

In the following code, we are updating each weight value by a small amount and then calculating the loss value corresponding to the updated weight value (while every other weight is kept unchanged). Additionally, we are ensuring that the weight update happens across all weights and also across all layers in a network.

The change in the squared loss (del_loss) is attributed to the change in the weight value. We repeat the preceding step for all the weights that exist in the network:

        for i, layer in enumerate(reversed(weights)):
            for index, weight in np.ndenumerate(layer):
                wts_tmp[-(i+1)][index] += 0.0001  # perturb one weight
                loss = feed_forward(inputs, outputs, wts_tmp)
                del_loss = np.sum(org_loss - loss)/(0.0001*len(inputs))

The weight value is then updated in proportion to the change in loss, scaled down by the learning rate parameter: a greater decrease in loss results in a larger weight update, while a smaller decrease results in a smaller update:

                wts_tmp2[-(i+1)][index] += del_loss*0.01
                # reset the perturbed copy before moving to the next weight
                wts_tmp = deepcopy(weights)
Given that the weight values are updated one at a time in order to estimate their impact on the loss value, there is potential to parallelize the process of weight updates. Hence, GPUs come in handy in such scenarios: they have many more cores than a CPU, so more weight updates can be computed on a GPU than on a CPU in a given amount of time.

Finally, we return the updated weights:

                    
        weights = deepcopy(wts_tmp2)
    return weights
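The perturb-and-remeasure trick inside the loop is a finite-difference estimate of the gradient. A minimal sketch verifying the idea on a one-parameter model (the model y_hat = a*x and all names here are illustrative, not from the original code):

```python
import numpy as np

x = np.array([[1.], [2.], [3.], [4.]])
y = np.array([[2.], [4.], [6.], [8.]])

def loss(a):
    # mean squared error of the one-parameter model y_hat = a * x
    return np.mean(np.square(a * x - y))

a = 0.5
eps = 0.0001
# finite-difference estimate, mirroring the perturbation trick above
num_grad = (loss(a + eps) - loss(a)) / eps
# analytic gradient of the MSE: 2 * mean(x * (a*x - y))
ana_grad = 2 * np.mean(x * (a * x - y))
print(num_grad, ana_grad)  # the two estimates agree closely
```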
  5. Run the function for the desired number of epochs to update the weights:
update_weights(x,y,w,1)

The preceding code returns the updated weight values.
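To see the network actually learn, the functions from this section can be run for more epochs and the total loss compared before and after training (a self-contained sketch; the choice of 100 epochs is arbitrary):

```python
import numpy as np
from copy import deepcopy

x = [[1], [2], [3], [4]]
y = [[2], [4], [6], [8]]

w = [np.array([[-0.82203424, -0.9185806, 0.03494298]]),
     np.array([0., 0., 0.]),
     np.array([[1.0692896], [0.62761235], [-0.5426246]]),
     np.array([0.])]

def feed_forward(inputs, outputs, weights):
    pre_hidden = np.dot(inputs, weights[0]) + weights[1]
    hidden = np.where(pre_hidden < 0, 0, pre_hidden)  # ReLU activation
    out = np.dot(hidden, weights[2]) + weights[3]
    return np.square(out - outputs)

def update_weights(inputs, outputs, weights, epochs):
    for epoch in range(epochs):
        org_loss = feed_forward(inputs, outputs, weights)
        wts_tmp = deepcopy(weights)
        wts_tmp2 = deepcopy(weights)
        for i, layer in enumerate(reversed(weights)):
            for index, weight in np.ndenumerate(layer):
                wts_tmp[-(i+1)][index] += 0.0001  # perturb one weight
                loss = feed_forward(inputs, outputs, wts_tmp)
                del_loss = np.sum(org_loss - loss) / (0.0001 * len(inputs))
                wts_tmp2[-(i+1)][index] += del_loss * 0.01
                wts_tmp = deepcopy(weights)  # reset before the next weight
        weights = deepcopy(wts_tmp2)
    return weights

loss_before = np.sum(feed_forward(x, y, w))
w_new = update_weights(x, y, w, 100)
loss_after = np.sum(feed_forward(x, y, w_new))
print(loss_before, loss_after)  # the loss should drop substantially
```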

In the preceding steps, we learned how to build a neural network from scratch in Python. In the next section, we will learn about building a neural network in Keras.