
How to do it...

In this section, we will build the back-propagation algorithm by hand so that we clearly understand how weights are calculated in a neural network. In this specific case, we will build a simple neural network with no hidden layer, which means we are effectively fitting a linear regression equation. The code file is available as Neural_network_working_details.ipynb on GitHub.

  1. Initialize the dataset as follows:
x = [[1],[2],[3],[4]]
y = [[2],[4],[6],[8]]
  2. Initialize the weight and bias values randomly (there is only one weight and one bias value because we are trying to identify the optimal values of a and b in the equation y = a*x + b):
w = [[[1.477867]], [0.]]
  3. Define the feed-forward network and calculate the squared error loss value:
import numpy as np
def feed_forward(inputs, outputs, weights):
    # Prediction: inputs multiplied by the weight, plus the bias
    out = np.dot(inputs, weights[0]) + weights[1]
    # Squared error of each prediction against its actual value
    squared_error = np.square(out - outputs)
    return squared_error

In the preceding code, we performed a matrix multiplication of the input with the randomly-initialized weight value and added the randomly-initialized bias value to the result. This gives the predicted value for each input.

Once the predictions are calculated, we calculate the squared error, that is, the square of the difference between the predicted and actual values.
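
As a quick check of the feed-forward step, the following sketch evaluates the loss for the dataset and initial weights from the steps above. It stores the weights as NumPy arrays rather than nested lists, and it introduces the names per_sample_loss and mean_loss purely for illustration; neither choice comes from the original code file.

```python
import numpy as np

def feed_forward(inputs, outputs, weights):
    # Prediction: inputs multiplied by the weight, plus the bias
    out = np.dot(inputs, weights[0]) + weights[1]
    # Squared error of each prediction against its actual value
    return np.square(out - outputs)

x = np.array([[1], [2], [3], [4]], dtype=float)
y = np.array([[2], [4], [6], [8]], dtype=float)
# Assumption: weights held as NumPy arrays (weight matrix, then bias)
w = [np.array([[1.477867]]), np.array([0.])]

per_sample_loss = feed_forward(x, y, w)   # one squared error per data point
mean_loss = per_sample_loss.mean()        # average over the four samples
print(per_sample_loss.ravel())
print(mean_loss)
```

Because the initial weight (about 1.48) is below the true slope of 2, every prediction undershoots, and the squared error grows with the size of the input.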

  4. Increase each weight and bias value by a very small amount (0.0001), one at a time, and calculate the squared error loss value after each change.

If the squared error loss value decreases as the weight increases, the weight value should be increased. The amount by which the weight should be increased is proportional to how much the loss decreases for that small change in the weight.

Additionally, do not change the weight by the full amount of the loss decrease. Instead, scale the change down by a factor called the learning rate. This ensures that the loss decreases more smoothly (there's more on how the learning rate impacts the model accuracy in the next chapter).
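
The idea of nudging a weight and scaling the resulting change in loss by the learning rate can be sketched for the single weight a as follows. This is a minimal illustration, not the code from the recipe: the helper mean_squared_error and the names eps, numeric_grad, analytic_grad, and a_new are assumptions made for this example, and the analytic gradient is included only to show that the nudge-based estimate is close to the true derivative.

```python
import numpy as np

x = np.array([[1.], [2.], [3.], [4.]])
y = np.array([[2.], [4.], [6.], [8.]])

def mean_squared_error(a, b):
    # Hypothetical helper: mean squared error of y = a*x + b on the dataset
    return np.mean(np.square(a * x + b - y))

a, b = 1.477867, 0.0
eps = 0.0001           # the small nudge applied to the weight
learning_rate = 0.01   # scales down the size of the update

# Change in loss per unit change in the weight (finite-difference estimate)
numeric_grad = (mean_squared_error(a + eps, b) - mean_squared_error(a, b)) / eps
# Exact derivative of the mean squared error with respect to a, for comparison
analytic_grad = np.mean(2 * (a * x + b - y) * x)

# The loss falls as a increases (negative gradient), so a moves upward
a_new = a - learning_rate * numeric_grad
print(numeric_grad, analytic_grad, a_new)
```

With a smaller eps the two gradient values agree more closely, and with a smaller learning rate the weight moves in smaller steps.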

In the following code, we create a function named update_weights, which performs the back-propagation process to update the weights that were initialized earlier. The function repeats the update epochs times, where epochs is a parameter passed to update_weights:

from copy import deepcopy
def update_weights(inputs, outputs, weights, epochs):
    # Work on NumPy copies so that adding 0.0001 is element-wise arithmetic
    weights = [np.array(w, dtype=float) for w in weights]
    for epoch in range(epochs):
  1. Pass the input through the feed-forward network to calculate the loss with the current set of weights:
        org_loss = feed_forward(inputs, outputs, weights)
  2. Make deep copies of the list of weights. The copies are modified in the following steps, and a deep copy ensures that those changes do not leak back into the original weights:
        wts_tmp = deepcopy(weights)
        wts_tmp2 = deepcopy(weights)
  3. Loop through all the weight values, one at a time, and change each by a small value (0.0001):
        for i in range(len(weights)):
            wts_tmp[-(i+1)] += 0.0001
  4. Calculate the updated feed-forward loss when the weight is changed by a small amount. Calculate the change in loss due to the small change in weight. Divide the change in loss by the number of inputs, as we want the mean squared error across all the input samples we have:
            loss = feed_forward(inputs, outputs, wts_tmp)
            delta_loss = np.sum(org_loss - loss)/(0.0001*len(inputs))
Updating the weight by a small value and then calculating its impact on the loss value approximates the derivative of the loss with respect to that weight.
  5. Update the weights by the change in loss that they are causing. Update the weights slowly by multiplying the change in loss by a very small number (0.01), which is the learning rate parameter (more about the learning rate parameter in the next chapter). Then reset the temporary copy so that the next weight is nudged from the original values:
            wts_tmp2[-(i+1)] += delta_loss*0.01
            wts_tmp = deepcopy(weights)
  6. After all the weights have been updated, carry them into the next epoch and, once all the epochs are done, return the updated weights and bias value:
        weights = deepcopy(wts_tmp2)
    return weights
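
To see the function at work, the following sketch repeats the two functions in compact form so that it runs on its own, then trains on the dataset defined earlier. The choice of 1,000 epochs, the use of NumPy arrays for the weights, and the names trained, initial_loss, and final_loss are assumptions made for this illustration.

```python
from copy import deepcopy
import numpy as np

def feed_forward(inputs, outputs, weights):
    out = np.dot(inputs, weights[0]) + weights[1]
    return np.square(out - outputs)

def update_weights(inputs, outputs, weights, epochs):
    weights = [np.array(w, dtype=float) for w in weights]
    for epoch in range(epochs):
        org_loss = feed_forward(inputs, outputs, weights)
        wts_tmp = deepcopy(weights)
        wts_tmp2 = deepcopy(weights)
        for i in range(len(weights)):
            # Nudge one weight, measure the change in mean loss
            wts_tmp[-(i+1)] += 0.0001
            loss = feed_forward(inputs, outputs, wts_tmp)
            delta_loss = np.sum(org_loss - loss) / (0.0001 * len(inputs))
            # Move the weight in the direction that lowers the loss
            wts_tmp2[-(i+1)] += delta_loss * 0.01
            wts_tmp = deepcopy(weights)
        weights = deepcopy(wts_tmp2)
    return weights

x = np.array([[1], [2], [3], [4]], dtype=float)
y = np.array([[2], [4], [6], [8]], dtype=float)
w = [np.array([[1.477867]]), np.array([0.])]

initial_loss = feed_forward(x, y, w).mean()
trained = update_weights(x, y, w, epochs=1000)   # epoch count is an assumption
final_loss = feed_forward(x, y, trained).mean()
print(trained, initial_loss, final_loss)
```

Since the data follows y = 2*x exactly, the weight should move toward 2 and the bias should stay close to 0 as the number of epochs grows, with the loss shrinking accordingly.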

One of the other parameters in a neural network is the batch size considered in calculating the loss values. 

In the preceding scenario, we considered all the data points in order to calculate the loss value. However, in practice, when we have thousands (or in some cases, millions) of data points, the incremental contribution of each additional data point to the loss calculation follows the law of diminishing returns. Hence, we use a batch size that is much smaller than the total number of data points we have.

The typical batch size considered in building a model is anywhere between 32 and 1,024.
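
One way to picture batching is to shuffle the data once per epoch and walk through it in slices, calculating the loss and updating the weights on each slice rather than on the full dataset. The sketch below shows only the slicing. The helper iterate_minibatches and the 100-point synthetic dataset are hypothetical, introduced just for this example.

```python
import numpy as np

def iterate_minibatches(inputs, outputs, batch_size, rng):
    # Hypothetical helper: yield shuffled (inputs, outputs) slices of batch_size rows
    order = rng.permutation(len(inputs))
    for start in range(0, len(inputs), batch_size):
        idx = order[start:start + batch_size]
        yield inputs[idx], outputs[idx]

rng = np.random.default_rng(0)
# Synthetic data for illustration only: 100 points on the line y = 2*x
x = np.arange(1, 101, dtype=float).reshape(-1, 1)
y = 2 * x

batches = list(iterate_minibatches(x, y, batch_size=32, rng=rng))
batch_sizes = [len(batch_x) for batch_x, _ in batches]
print(batch_sizes)
```

The last batch is smaller whenever the dataset size is not a multiple of the batch size. Each (inputs, outputs) pair can be passed to a weight-update function in place of the full dataset.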
