
Putting it all together with an example

As we already mentioned, multi-layer neural networks can classify classes that are not linearly separable. In fact, the Universal Approximation Theorem states that any continuous function on a compact subset of ℝⁿ can be approximated by a neural network with at least one hidden layer. The formal proof of the theorem is too complex to explain here, but we'll attempt to give an intuitive explanation using some basic mathematics. We'll implement a neural network that approximates the boxcar function, shown on the right in the following diagram, which is a simple type of step function. Since a series of step functions can approximate any continuous function on a compact subset of ℝ, this will give us an idea of why the Universal Approximation Theorem holds:

The diagram on the left depicts continuous function approximation with a series of step functions, while the diagram on the right illustrates a single boxcar step function
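To make the left-hand diagram concrete, here is a minimal sketch (not part of the book's example; the target function sin(x), the interval, and the number of steps are illustrative choices) that approximates a continuous function on a compact interval with a series of constant steps:

import matplotlib.pyplot as plt
import numpy

# approximate a continuous function (here sin) on the compact interval [-pi, pi]
# with a series of step functions: one constant piece per sub-interval
inputs = numpy.arange(-numpy.pi, numpy.pi, 0.01)
edges = numpy.linspace(-numpy.pi, numpy.pi, 21)

approximation = numpy.zeros_like(inputs)
for start, end in zip(edges[:-1], edges[1:]):
    # each step equals the function value at the midpoint of its sub-interval
    step_height = numpy.sin((start + end) / 2.0)
    approximation += numpy.where((inputs >= start) & (inputs < end), step_height, 0.0)

plt.plot(inputs, numpy.sin(inputs), lw=1, color='gray')
plt.plot(inputs, approximation, lw=2, color='black')
plt.show()

Each flat piece of this staircase is a boxcar of a certain height, and using more, narrower sub-intervals makes the staircase hug the target function arbitrarily closely. The rest of the section shows how a small network with sigmoid activations produces exactly one such boxcar.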

To do this, we'll use the logistic sigmoid activation function. As we know, the logistic sigmoid is defined as σ(a) = 1 / (1 + e^(-a)), where a = w1x1 + w2x2 + ... + wnxn + b is the weighted sum of the neuron's inputs:

  1. Let's assume that we have only one input neuron, x = x1, so that the weighted input is simply a = wx + b
  2. In the following diagrams, we can see that by making w very large, the sigmoid becomes close to a step function, while b simply translates the function along the x axis by t = -b/w (a short numerical check of this follows the diagram):
On the left, we have a standard sigmoid with a weight of 1 and a bias of 0; in the middle, we have a sigmoid with a weight of 10; and on the right, we have a sigmoid with a weight of 10 and a bias of 50
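As a quick numerical check (a minimal sketch, not part of the book's code; the sigmoid helper and the weight and bias values, taken from the right-hand panel, are illustrative), we can evaluate the sigmoid just to the left of, at, and just to the right of t = -b/w to see how sharp the transition already is for w = 10:

import numpy

def sigmoid(x, w, b):
    # logistic sigmoid of the weighted input a = w * x + b
    return 1.0 / (1.0 + numpy.exp(-(w * x + b)))

w, b = 10.0, 50.0
t = -b / w  # the translation point, here -5

for x in (t - 1.0, t, t + 1.0):
    print("x = {0:5.1f} -> sigmoid = {1:.5f}".format(x, sigmoid(x, w, b)))

The output jumps from roughly 0 to roughly 1 within two units of x. With the much larger weight used in the code that follows, the transition becomes even sharper and is visually indistinguishable from a step.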

With this in mind, let's define the architecture of our network. It will have a single input neuron, one hidden layer with two neurons, and a single output neuron:

Both hidden neurons use the logistic sigmoid activation. The weights and biases of the network are chosen to take advantage of the sigmoid properties we described previously: the top neuron initiates the first transition t1 (from 0 to 1), and, at a larger value of x, the second neuron initiates the opposite transition t2 (from 1 back to 0). The following code implements this example:

# The user can modify the values of the weight w
# as well as bias_value_1 and bias_value_2 to observe
# how this plots to different step functions

import matplotlib.pyplot as plt
import numpy

weight_value = 1000

# modify to change where the step function starts
bias_value_1 = 5000

# modify to change where the step function ends
bias_value_2 = -5000

# set the axis limits of the plot
plt.axis([-10, 10, -1, 10])

print("The step function starts at {0} and ends at {1}"
      .format(-bias_value_1 / weight_value,
              -bias_value_2 / weight_value))

inputs = numpy.arange(-10, 10, 0.01)
outputs = list()

# iterate over a range of inputs
for x in inputs:
    y1 = 1.0 / (1.0 + numpy.exp(-weight_value * x - bias_value_1))
    y2 = 1.0 / (1.0 + numpy.exp(-weight_value * x - bias_value_2))

    # modify to change the height of the step function
    w = 7

    # network output
    y = y1 * w - y2 * w

    outputs.append(y)

plt.plot(inputs, outputs, lw=2, color='black')
plt.show()

We set large values for weight_value, bias_value_1, and bias_value_2. In this way, the expressions numpy.exp(-weight_value * x - bias_value_1) and numpy.exp(-weight_value * x - bias_value_2) switch between 0 and very large values over a very short interval of the input. In turn, y1 and y2 switch between 1 and 0, which gives the logistic sigmoid the step-like (as opposed to gradual) shape we explained previously. Because the numpy.exp expressions overflow to infinity for some inputs, the code will produce an overflow encountered in exp warning, but this is normal.
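If the warning is distracting, one option (a small sketch, not part of the book's example; the input value is illustrative) is to evaluate the sigmoid inside a numpy.errstate context, which silences the overflow warning locally without changing the result:

import numpy

weight_value = 1000
bias_value_1 = 5000
x = -10.0  # a point well to the left of the step

# numpy.exp(5000) overflows to inf and would normally emit a RuntimeWarning;
# errstate suppresses it for this block only, and 1 / (1 + inf) still evaluates to 0.0
with numpy.errstate(over='ignore'):
    y1 = 1.0 / (1.0 + numpy.exp(-weight_value * x - bias_value_1))

print(y1)  # prints 0.0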

The full example, when executed, produces the following result:

 

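Finally, to tie this back to the approximation argument at the start of the section, the following sketch (again, not part of the book's code; the boxcar helper, the target function, the number of boxcars, and the steepness are illustrative choices) sums several such sigmoid pairs, one boxcar per sub-interval, to approximate sin(x), just as a wider hidden layer of such pairs would:

import matplotlib.pyplot as plt
import numpy

def boxcar(x, start, end, height, steepness=1000.0):
    # one hidden "pair": the difference of two steep sigmoids is a boxcar
    # of the given height between start and end
    y1 = 1.0 / (1.0 + numpy.exp(-steepness * (x - start)))
    y2 = 1.0 / (1.0 + numpy.exp(-steepness * (x - end)))
    return height * (y1 - y2)

inputs = numpy.arange(-numpy.pi, numpy.pi, 0.01)
edges = numpy.linspace(-numpy.pi, numpy.pi, 21)

# one boxcar per sub-interval, with height equal to sin at the midpoint
approximation = numpy.zeros_like(inputs)
with numpy.errstate(over='ignore'):
    for start, end in zip(edges[:-1], edges[1:]):
        approximation += boxcar(inputs, start, end, numpy.sin((start + end) / 2.0))

plt.plot(inputs, numpy.sin(inputs), lw=1, color='gray')
plt.plot(inputs, approximation, lw=2, color='black')
plt.show()

Adding more, narrower boxcars reproduces the staircase approximation from the beginning of the section, which is exactly the intuition that the Universal Approximation Theorem formalizes.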