- Practical Convolutional Neural Networks
- Mohit Sewak, Md. Rezaul Karim, Pradeep Pujari
The simplest artificial neural network
The following image represents a simple two-layer neural network:

The first layer is the input layer and the last layer is the output layer. The middle layer is the hidden layer. If there is more than one hidden layer, then such a network is a deep neural network.
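To make this concrete, here is a minimal sketch (not code from the book) of such a network built with tf.keras; the layer sizes used here (784 inputs, 30 hidden neurons, and a single output neuron) are illustrative assumptions:

```python
import tensorflow as tf

# Input layer: a flattened feature vector (784 values is an assumption for illustration).
inputs = tf.keras.Input(shape=(784,))

# Hidden layer: the single middle layer of the network described above.
hidden = tf.keras.layers.Dense(30, activation='sigmoid')(inputs)

# Output layer: one neuron producing a value between 0 and 1.
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(hidden)

model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.summary()
```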
The input and output of each neuron in the hidden layer are connected to every neuron in the next layer. There can be any number of neurons in each layer, depending on the problem. Let us consider an example you may already know: the popular handwritten digit recognition task of detecting a particular number, say 5. This network will accept an image of a 5 and will output 1 or 0; a 1 indicates that the image is in fact a 5, and a 0 indicates that it is not. Once the network is created, it has to be trained. We can initialize it with random weights and then feed in input samples known as the training dataset. For each input sample, we check the output, compute the error, and then adjust the weights so that whenever the network sees a 5 it outputs 1, and for everything else it outputs 0. This type of training is called supervised learning, and the method of adjusting the weights is called backpropagation.

When constructing artificial neural network models, one of the primary considerations is how to choose the activation functions for the hidden and output layers. The three most commonly used activation functions are the sigmoid function, the hyperbolic tangent function, and the Rectified Linear Unit (ReLU). The beauty of the sigmoid function is that if y = σ(x) is its output, its derivative is simply y multiplied by (1 − y). That means:
dy/dx = σ(x)(1 − σ(x))
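To see this in action, here is a minimal NumPy sketch (not code from the book) that caches the forward sigmoid activations and computes the gradient from them:

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) activation."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 3.0])

a = sigmoid(z)          # forward pass: keep these activations in memory
grad = a * (1.0 - a)    # backward pass: dy/dx = sigma(x) * (1 - sigma(x))

print(a)
print(grad)
```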
This helps us to calculate the gradients used in neural networks efficiently and conveniently. If the feed-forward activations of the logistic function for a given layer are kept in memory, the gradients for that layer can be evaluated with a simple multiplication and subtraction rather than by re-evaluating the sigmoid function, which requires an extra exponentiation. The following image shows the ReLU activation function, which is zero when x < 0 and then linear with slope 1 when x > 0:

The ReLU is a nonlinear function that computes f(x) = max(0, x). That means a ReLU function is 0 for negative inputs and x for all inputs x > 0, so the activation is thresholded at zero (see the preceding image on the left). TensorFlow implements the ReLU function in tf.nn.relu():
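As a minimal illustration (a sketch assuming TensorFlow 2.x with eager execution, rather than the book's own listing):

```python
import tensorflow as tf

# ReLU clamps negative values to zero and passes positive values through unchanged.
x = tf.constant([-3.0, -1.0, 0.0, 2.0, 5.0])
y = tf.nn.relu(x)    # element-wise max(0, x)

print(y.numpy())     # [0. 0. 0. 2. 5.]
```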
