
Putting it all together with an example

As we already mentioned, multi-layer neural networks can classify linearly inseparable classes. In fact, the Universal Approximation Theorem states that any continuous function on compact subsets of R^n can be approximated by a neural network with at least one hidden layer. The formal proof of this theorem is too complex to explain here, but we'll attempt to give an intuitive explanation using some basic mathematics. We'll implement a neural network that approximates the boxcar function (shown on the right in the following diagram), which is a simple type of step function. Since a series of step functions can approximate any continuous function on a compact subset of R, this will give us an idea of why the Universal Approximation Theorem holds:

The diagram on the left depicts continuous function approximation with a series of step functions, while the diagram on the right illustrates a single boxcar step function
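To make this idea more concrete, here is a short sketch (not part of the book's example; the number of boxcars n_steps and the steepness w are illustrative choices) that approximates sin(x) on [0, 2π] with a sum of boxcar functions, each built as the difference of two steep logistic sigmoids:

import numpy as np
import matplotlib.pyplot as plt

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def boxcar(x, start, end, height, w=100):
    # a boxcar is the difference of two steep sigmoids:
    # it rises to `height` near `start` and falls back to 0 near `end`
    return height * (sigmoid(w * (x - start)) - sigmoid(w * (x - end)))

x = np.linspace(0, 2 * np.pi, 1000)
target = np.sin(x)

n_steps = 20  # more boxcars give a better approximation
edges = np.linspace(0, 2 * np.pi, n_steps + 1)

approximation = np.zeros_like(x)
for start, end in zip(edges[:-1], edges[1:]):
    # use the target value at the midpoint of each interval as the boxcar height
    approximation += boxcar(x, start, end, np.sin((start + end) / 2))

plt.plot(x, target, label='sin(x)')
plt.plot(x, approximation, label='sum of boxcars')
plt.legend()
plt.show()

Increasing n_steps makes the staircase hug the curve more closely, which is the intuition behind the Universal Approximation Theorem.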

To do this, we'll use the logistic sigmoid activation function. As we know, the logistic sigmoid is defined as σ(a) = 1 / (1 + e^(-a)), where a = wx + b:

  1. Let's assume that we have only one input neuron, x = x1.
  2. In the following diagrams, we can see that by making w very large, the sigmoid becomes close to a step function. On the other hand, b simply translates the function along the x axis, and the translation is equal to t = -b/w (see the short sketch after the following figure):
On the left, we have a standard sigmoid with a weight of 1 and a bias of 0; in the middle, we have a sigmoid with a weight of 10; and on the right, we have a sigmoid with a weight of 10 and a bias of 50
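The following short sketch (not part of the book's code) reproduces this effect by plotting the three sigmoid variants from the preceding figure; note how the version with w = 10 and b = 50 is translated to t = -b/w = -5:

import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x, w, b):
    # logistic sigmoid of a = w*x + b
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

x = np.linspace(-10, 10, 1000)

# weight 1, bias 0: the standard sigmoid
plt.plot(x, sigmoid(x, w=1, b=0), label='w=1, b=0')
# weight 10, bias 0: a much steeper, almost step-like sigmoid
plt.plot(x, sigmoid(x, w=10, b=0), label='w=10, b=0')
# weight 10, bias 50: the same steep sigmoid translated to t = -50/10 = -5
plt.plot(x, sigmoid(x, w=10, b=50), label='w=10, b=50')

plt.legend()
plt.show()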

With this in mind, let's define the architecture of our network. It will have a single input neuron, one hidden layer with two neurons, and a single output neuron:

Both hidden neurons use the logistic sigmoid activation. The weights and biases of the network are organized in such a way as to take advantage of the sigmoid properties we described previously. The top neuron will initiate the first transition, t1 (0 to 1), and then, at a later point along the x axis, the second neuron will initiate the opposite transition, t2 (1 to 0). The following code implements this example:

# The user can modify the values of the weight w
# as well as bias_value_1 and bias_value_2 to observe
# how this plots to different step functions

import matplotlib.pyplot as plt
import numpy

weight_value = 1000

# modify to change where the step function starts
bias_value_1 = 5000

# modify to change where the step function ends
bias_value_2 = -5000

# set the axis limits of the plot
plt.axis([-10, 10, -1, 10])

print("The step function starts at {0} and ends at {1}"
      .format(-bias_value_1 / weight_value,
              -bias_value_2 / weight_value))

inputs = numpy.arange(-10, 10, 0.01)
outputs = list()

# iterate over a range of inputs
for x in inputs:
    y1 = 1.0 / (1.0 + numpy.exp(-weight_value * x - bias_value_1))
    y2 = 1.0 / (1.0 + numpy.exp(-weight_value * x - bias_value_2))

    # modify to change the height of the step function
    w = 7

    # network output
    y = y1 * w - y2 * w

    outputs.append(y)

plt.plot(inputs, outputs, lw=2, color='black')
plt.show()

We set large values for weight_value, bias_value_1, and bias_value_2. In this way, the expressions numpy.exp(-weight_value * x - bias_value_1) and numpy.exp(-weight_value * x - bias_value_2) switch between 0 and very large values (effectively infinity) over a very short input interval. In turn, y1 and y2 switch between 1 and 0, which gives the logistic sigmoid a stepwise (as opposed to gradual) shape, as explained previously. Because the numpy.exp expressions overflow to infinity, the code will produce an overflow encountered in exp warning, but this is expected and harmless.
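To see how abrupt the transition is, the following quick check (not part of the book's code) evaluates y1 at a few points around its expected transition at x = -bias_value_1 / weight_value = -5; numpy.errstate is used here only to silence the expected overflow warning:

import numpy

weight_value = 1000
bias_value_1 = 5000

with numpy.errstate(over='ignore'):  # exp overflows far from the transition point
    for x in (-10.0, -5.01, -4.99, 0.0):
        y1 = 1.0 / (1.0 + numpy.exp(-weight_value * x - bias_value_1))
        print("x = {0}: y1 = {1:.6f}".format(x, y1))

# prints approximately:
# x = -10.0: y1 = 0.000000
# x = -5.01: y1 = 0.000045
# x = -4.99: y1 = 0.999955
# x = 0.0: y1 = 1.000000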

This code, when executed, produces the following result:

