- Python Deep Learning
- Ivan Vasilev, Daniel Slater, Gianmario Spacagna, Peter Roelants, Valentino Zocca
Logistic regression
Logistic regression uses the logistic sigmoid activation, in contrast to linear regression, which uses the identity function. As we've seen before, the output of the logistic sigmoid is in the (0, 1) range and can be interpreted as a probability. We can use logistic regression for a 2-class (binary) classification problem, where our target, t, can take two values, usually 0 and 1 for the two corresponding classes. These discrete values shouldn't be confused with the values of the logistic sigmoid function, which is a continuous real-valued function between 0 and 1. The value of the sigmoid function represents the probability that the sample belongs to class 1, and one minus that value is the probability of class 0.
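To make the probability interpretation concrete, here is a minimal NumPy sketch; the sample and weight values are invented purely for illustration and aren't taken from the book:

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid: squashes any real activation into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-a))

# Hypothetical sample and weights, chosen only for illustration
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8, 0.1, -0.4])

a = np.dot(x, w)             # neuron activation a = x . w
p_class_1 = sigmoid(a)       # interpreted as P(t = 1 | x, w)
p_class_0 = 1.0 - p_class_1  # interpreted as P(t = 0 | x, w)
print(p_class_1, p_class_0)
```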
Let's denote the logistic sigmoid function with σ(a), where a is the neuron activation value x⋅w, as defined previously. For each sample x, the probability that the output is of class t, given the weights w, is as follows:

$$P(t=1 \mid x, w) = \sigma(x \cdot w) \qquad\qquad P(t=0 \mid x, w) = 1 - \sigma(x \cdot w)$$
We can write that equation more succinctly as follows:

$$P(t \mid x, w) = \sigma(x \cdot w)^{t} \left(1 - \sigma(x \cdot w)\right)^{1-t}$$
And, since the samples xi are independent of one another, the global probability over the whole training set is the product of the per-sample probabilities P(ti|xi, w):

$$P(t \mid x, w) = \prod_{i} P(t_i \mid x_i, w) = \prod_{i} \sigma(x_i \cdot w)^{t_i} \left(1 - \sigma(x_i \cdot w)\right)^{1-t_i}$$
If we take the natural log of the preceding equation (to turn products into sums), we get the following:

$$\log P(t \mid x, w) = \sum_{i} \left[ t_i \log \sigma(x_i \cdot w) + (1 - t_i) \log\left(1 - \sigma(x_i \cdot w)\right) \right]$$
Our objective now is to maximize this log-likelihood in order to get the highest probability of predicting the correct results.
Maximizing the log-likelihood is the same as minimizing its negative, so, as before, we'll use gradient descent to minimize the cost function J(w), defined as follows:

$$J(w) = -\sum_{i} \left[ t_i \log \sigma(x_i \cdot w) + (1 - t_i) \log\left(1 - \sigma(x_i \cdot w)\right) \right]$$
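As a rough sketch of how this cost could be computed, assuming NumPy and a matrix X that holds one sample per row (the function name and the small epsilon guard against log(0) are my additions, not the book's):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cost(X, t, w):
    """Negative log-likelihood J(w) for binary targets t in {0, 1}."""
    eps = 1e-12                  # guards the logarithms against log(0)
    p = sigmoid(X @ w)           # P(t_i = 1 | x_i, w) for every sample
    return -np.sum(t * np.log(p + eps) + (1 - t) * np.log(1 - p + eps))
```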
As before, we calculate the derivative of the cost function with respect to the weights wj to obtain the following:

$$\frac{\partial J(w)}{\partial w_j} = \sum_{i} \left( \sigma(x_i \cdot w) - t_i \right) x_{i,j}$$
To understand the last equation, let's recap the chain rule for derivatives, which states that if we have the function F(x) = f(g(x)), then the derivative of F with respect to x would be F'(x) = f'(g(x))g'(x), or, in Leibniz notation, $\frac{dF}{dx} = \frac{df}{dg}\frac{dg}{dx}$.
Now, back to our case. The two functions we compose are the sigmoid and the activation a = xi⋅w, whose derivatives are as follows:

$$\frac{\partial \sigma(a)}{\partial a} = \sigma(a)\left(1 - \sigma(a)\right) \qquad\qquad \frac{\partial (x_i \cdot w)}{\partial w_j} = x_{i,j}$$
Therefore, according to the chain rule, the following is true:

$$\frac{\partial}{\partial w_j}\, \sigma(x_i \cdot w) = \sigma(x_i \cdot w)\left(1 - \sigma(x_i \cdot w)\right) x_{i,j}$$
Similarly, the following applies to the two logarithmic terms of J(w):

$$\frac{\partial}{\partial w_j} \log \sigma(x_i \cdot w) = \left(1 - \sigma(x_i \cdot w)\right) x_{i,j} \qquad\qquad \frac{\partial}{\partial w_j} \log\left(1 - \sigma(x_i \cdot w)\right) = -\sigma(x_i \cdot w)\, x_{i,j}$$
Substituting these results into the derivative of J(w) yields the expression we saw earlier, and the gradient descent update for each weight becomes $w_j \leftarrow w_j - \eta \sum_{i} \left( \sigma(x_i \cdot w) - t_i \right) x_{i,j}$, where η is the learning rate. This is similar to the update rule we've seen for linear regression.
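One way to convince yourself that the derivation is right is to compare the formula against a numerical finite-difference estimate of the gradient. The following sketch does that on made-up data; all names and values are illustrative:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cost(X, t, w):
    p = sigmoid(X @ w)
    return -np.sum(t * np.log(p) + (1 - t) * np.log(1 - p))

def gradient(X, t, w):
    # The derived formula: dJ/dw_j = sum_i (sigma(x_i . w) - t_i) * x_ij
    return X.T @ (sigmoid(X @ w) - t)

# Random toy data, used only to check the formula numerically
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
t = np.array([0, 1, 1, 0, 1])
w = rng.normal(size=3)

analytic = gradient(X, t, w)
numeric = np.zeros_like(w)
eps = 1e-6
for j in range(len(w)):
    w_plus, w_minus = w.copy(), w.copy()
    w_plus[j] += eps
    w_minus[j] -= eps
    numeric[j] = (cost(X, t, w_plus) - cost(X, t, w_minus)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-5))  # expected to print True
```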
In this section, we saw a number of complicated equations, but you shouldn't feel bad if you don't fully understand them. We can recap by saying that we applied the same gradient descent algorithm to logistic regression as with linear regression. But this time, finding the partial derivatives of the error function with respect to the weights is slightly more complicated.
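To tie the whole section together, here is a compact end-to-end sketch of logistic regression trained with batch gradient descent; the toy data, learning rate, and epoch count are made-up illustration values rather than anything from the book:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_logistic_regression(X, t, learning_rate=0.1, epochs=1000):
    """Fit the weights by gradient descent on the negative log-likelihood J(w)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)             # P(t_i = 1 | x_i, w)
        grad = X.T @ (p - t) / len(t)  # averaged dJ/dw_j
        w -= learning_rate * grad      # same update shape as linear regression
    return w

# Toy, linearly separable data; the second column acts as a bias input
X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]])
t = np.array([0.0, 0.0, 1.0, 1.0])
w = train_logistic_regression(X, t)
print(sigmoid(X @ w))  # predicted probabilities for the training samples
```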