
Understanding linear regression

The simplest regression model is called linear regression. The idea behind linear regression is to describe a target variable (such as Boston house prices) with a linear combination of features.

To keep things simple, let's just focus on two features. Let's say we want to predict tomorrow's stock prices using two features: today's stock price and yesterday's stock price. We will denote today's stock price as the first feature f1, and yesterday's stock price as f2. Then the goal of linear regression would be to learn two weight coefficients, w1 and w2, so that we can predict tomorrow's stock price as follows:

ŷ = w1 f1 + w2 f2

Here, ŷ is the prediction of tomorrow's ground truth stock price y.
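To make this concrete, here is a minimal Python sketch of the two-feature prediction. The weight values and stock prices are made up purely for illustration; in practice the weights would be learned from data:

```python
# Hypothetical weight coefficients (in practice, these are learned from data)
w1, w2 = 0.75, 0.25
# Made-up feature values: today's and yesterday's stock price
f1, f2 = 102.0, 100.0

# Linear combination of the two features
y_pred = w1 * f1 + w2 * f2
print(y_pred)  # 101.5
```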

The special case of having only one feature variable is called simple linear regression.

We could easily extend this to include more stock price samples from the past. If we had M feature values instead of two, we would extend the preceding equation to a sum of M products, so that every feature gets accompanied by a weight coefficient. We can write the resulting equation as follows:

ŷ = w1 f1 + w2 f2 + ... + wM fM
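In code, this sum of M products is simply a dot product between a weight vector and a feature vector. A minimal sketch, with made-up numbers for M = 5 features:

```python
import numpy as np

# Made-up feature values f1..fM (for example, the last five days' stock prices)
f = np.array([102.0, 100.0, 101.5, 99.0, 98.5])
# Hypothetical weight coefficients w1..wM
w = np.array([0.4, 0.25, 0.15, 0.12, 0.08])

# The sum of M products w1*f1 + ... + wM*fM is just a dot product
y_pred = np.dot(w, f)
print(y_pred)
```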

Let's think about this equation geometrically for a second. In the case of a single feature, f1, the equation for ŷ would become ŷ = w1 f1, which is essentially a straight line. In the case of two features, ŷ = w1 f1 + w2 f2 would describe a plane in the feature space, as illustrated in the following figure:

Predicting target values in two and three dimensions using linear regression
In N dimensions, the corresponding equation would describe what is known as a hyperplane. If a space is N-dimensional, then its hyperplanes have N-1 dimensions.

As is evident in the preceding figure, all of these lines and planes intersect at the origin. But, what if the true y values we are trying to approximate don't go through the origin?

In order to be able to offset ŷ from the origin, it is customary to add an additional weight coefficient that does not depend on any feature values, and thus acts like a bias term. In a 1D case, this term acts as the ŷ-intercept. In practice, this is often achieved by setting f0 = 1 so that w0 can act as the bias term:

ŷ = w0 f0 + w1 f1 + ... + wM fM = w0 + w1 f1 + ... + wM fM
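A small sketch of this trick, with made-up numbers: by prepending a constant feature f0 = 1, the bias term w0 becomes just another entry in the weight vector:

```python
import numpy as np

# Made-up features and hypothetical weights; w[0] is the bias term w0
f = np.array([102.0, 100.0])
w = np.array([5.0, 0.7, 0.25])

# Prepend the constant feature f0 = 1 so the bias is absorbed
# into the same dot product as all the other weights
f_aug = np.concatenate(([1.0], f))
y_pred = np.dot(w, f_aug)  # w0*1 + w1*f1 + w2*f2
print(y_pred)
```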

Finally, the goal of linear regression is to learn a set of weight coefficients that lead to a prediction that approximates the ground truth values as accurately as possible. Rather than explicitly capturing a model's accuracy, as we did with classifiers, scoring functions in regression often take the form of so-called cost functions (or loss functions).

As discussed earlier in this chapter, there are a number of scoring functions we can use to measure the performance of a regression model. The most commonly used cost function is probably the mean squared error, which calculates the squared error (yi - ŷi)² for every data point i by comparing the prediction ŷi to the target output value yi, and then taking the average over all data points.
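For example, the mean squared error can be computed in a couple of lines with NumPy, and scikit-learn ships it as a ready-made scoring function. The target and prediction values below are made up for illustration:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Made-up ground truth values y_i and predictions yhat_i
y_true = np.array([101.0, 103.5, 99.0, 100.2])
y_pred = np.array([100.5, 104.0, 99.5, 101.0])

# Mean squared error: average of the squared differences (y_i - yhat_i)^2
mse = np.mean((y_true - y_pred) ** 2)
print(mse)

# The same metric, via scikit-learn
print(mean_squared_error(y_true, y_pred))
```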

Regression then becomes an optimization problem: the task is to find the set of weights that minimizes the cost function.

This is usually done with an iterative algorithm that is applied to one data point after the other, thus reducing the cost function step by step. We will talk more deeply about such algorithms in Chapter 9, Using Deep Learning to Classify Handwritten Digits.
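As a small preview of what such an iterative scheme looks like, here is a minimal gradient descent sketch that repeatedly nudges the weights so as to reduce the mean squared error. For simplicity, it uses all data points at once rather than one after the other, and the toy data, learning rate, and number of iterations are all assumptions chosen for illustration:

```python
import numpy as np

# Toy data: the first column of ones makes w[0] act as the bias term.
# The targets follow y = 1 + 2*f1, so the optimal weights are roughly [1, 2].
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

w = np.zeros(2)       # start from all-zero weights
learning_rate = 0.1   # step size, chosen by hand for this toy example

for _ in range(1000):
    y_pred = X @ w                           # current predictions
    grad = 2 * X.T @ (y_pred - y) / len(y)   # gradient of the mean squared error
    w -= learning_rate * grad                # take a small step downhill

print(w)  # ends up close to [1.0, 2.0]
```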

But enough with all this theory--let's do some coding!
