
Neural networks

In the previous sections, we introduced some of the popular classical machine learning algorithms. In this section, we'll talk about neural networks, which are the main focus of this book.

The first example of a neural network is the perceptron, which was invented by Frank Rosenblatt in 1957. The perceptron is a classification algorithm that is very similar to logistic regression. Like logistic regression, it has weights, $\mathbf{w}$, and its output is a function, $f(\mathbf{x} \cdot \mathbf{w})$, of the dot product, $\mathbf{x} \cdot \mathbf{w}$ (or $\sum_i w_i x_i$), of the weights and input.

The only difference is that $f$ is a simple step function, that is, if $\mathbf{x} \cdot \mathbf{w} > 0$, then $f(\mathbf{x} \cdot \mathbf{w}) = 1$, or else $f(\mathbf{x} \cdot \mathbf{w}) = 0$, whereas in logistic regression we apply a similar rule over the output of the logistic function. The perceptron is an example of a simple one-layer feedforward neural network:

A simple perceptron with three input units (neurons) and one output unit (neuron)
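As a quick illustration of this definition (this is not code from the book), here is a minimal NumPy sketch of a perceptron: a step function applied to the dot product of the weights and the input, trained with the classic perceptron learning rule. The function names, toy data, and omission of a bias term are our own simplifications.

```python
import numpy as np

def perceptron_output(x, w):
    # Step activation over the dot product of input x and weights w
    return 1 if np.dot(x, w) > 0 else 0

def train_perceptron(X, y, epochs=20, lr=0.1):
    # Classic perceptron learning rule on linearly separable data
    # (bias term omitted for brevity)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, target in zip(X, y):
            error = target - perceptron_output(xi, w)
            w += lr * error * xi
    return w

# Toy linearly separable dataset with two features
X = np.array([[0.0, 0.2], [0.3, 0.4], [0.9, 0.8], [1.0, 0.6]])
y = np.array([0, 0, 1, 1])
w = train_perceptron(X, y)
print([perceptron_output(xi, w) for xi in X])  # should reproduce the labels
```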

The perceptron was very promising, but it was soon discovered that it has serious limitations, as it only works for linearly separable classes. In 1969, Marvin Minsky and Seymour Papert demonstrated that it could not learn even a simple logical function such as XOR. This led to a significant decline in interest in perceptrons.

However, other neural networks can solve this problem. A classic multilayer perceptron has multiple interconnected perceptrons, that is, units that are organized in different sequential layers (an input layer, one or more hidden layers, and an output layer). Each unit of a layer is connected to all units of the next layer. First, the information is presented to the input layer; then we use it to compute the output (or activation), $y_i$, for each unit of the first hidden layer. We propagate forward, with the output serving as input for the next layers in the network (hence feedforward), and so on until we reach the output layer. The most common way to train neural networks is with gradient descent in combination with backpropagation. We'll discuss this in detail in Chapter 2, Neural Networks.

The following diagram depicts the neural network with one hidden layer:

Neural network with one hidden layer
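To make the forward propagation just described more concrete, here is a minimal NumPy sketch of a network with one hidden layer. It assumes sigmoid activations and omits bias terms for brevity; the names, shapes, and random weights are our own illustration, not the book's code.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation function
    return 1 / (1 + np.exp(-z))

def forward(x, w_hidden, w_output):
    # Hidden-layer activations y_i: one value per hidden unit
    hidden = sigmoid(np.dot(x, w_hidden))
    # Output-layer activations, computed from the hidden activations
    return sigmoid(np.dot(hidden, w_output))

rng = np.random.default_rng(0)
x = np.array([0.5, -1.2, 0.3])        # three input units
w_hidden = rng.normal(size=(3, 4))    # input layer -> 4 hidden units
w_output = rng.normal(size=(4, 2))    # hidden layer -> 2 output units
print(forward(x, w_hidden, w_output))
```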

Think of the hidden layers as an abstract representation of the input data. This is the way the neural network understands the features of the data with its own internal logic. However, neural networks are non-interpretable models. This means that if we observed the $y_i$ activations of the hidden layer, we wouldn't be able to understand them. For us, they are just a vector of numerical values. To bridge the gap between the network's representation and the actual data we're interested in, we need the output layer. You can think of this as a translator; we use it to understand the network's logic, and at the same time, we can convert it to the actual target values that we are interested in.

The universal approximation theorem tells us that a feedforward network with one hidden layer can represent any function (more precisely, it can approximate any continuous function arbitrarily well, given enough hidden units). It's good to know that there are no theoretical limits on networks with one hidden layer, but in practice we can achieve only limited success with such architectures. In Chapter 3, Deep Learning Fundamentals, we'll discuss how to achieve better performance with deep neural networks, and their advantages over shallow ones. For now, let's apply our knowledge by solving a simple classification task with a neural network.
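As a preview of what such a network can do (an illustrative sketch, not the example the book develops next), a feedforward network with a single hidden layer can learn the XOR function that a lone perceptron cannot, for instance with scikit-learn's MLPClassifier. The hyperparameters below are arbitrary choices; a different random seed or more hidden units may be needed for convergence.

```python
from sklearn.neural_network import MLPClassifier

# The four XOR input/output pairs
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# One hidden layer with 4 tanh units; lbfgs works well on tiny datasets
net = MLPClassifier(hidden_layer_sizes=(4,), activation='tanh',
                    solver='lbfgs', random_state=1)
net.fit(X, y)
print(net.predict(X))  # expected: [0 1 1 0]
```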
