
Multiple linear regression concepts

So far, we have solved simple linear regression problems, which study the relation between a dependent variable, y, and an independent variable, x, based on the following regression equation:

$$y = \beta_0 + \beta_1 x + \epsilon$$

In this equation, the explanatory variable is represented by x and the response variable by y. To solve this problem, we used the least squares method: the best fit is found by minimizing the sum of the squared vertical distances from each data point to the line. As mentioned previously, it is rare for a variable to depend solely on another. Usually, the response variable depends on at least two predictors, so in practice we have to create models in which the response variable depends on more than one predictor. These models are known as multiple linear regression models, and they are a straightforward generalization of single-predictor models: the dependent variable is related to two or more independent variables.
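To make the least squares method concrete before generalizing it, here is a minimal sketch in Python with NumPy; the data values are invented for illustration and are not taken from the text:

```python
import numpy as np

# Hypothetical data: y roughly follows the line y = 2 + 3x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([5.1, 7.9, 11.2, 13.8, 17.1])

# Least squares estimates for simple linear regression:
# slope = covariance(x, y) / variance(x), intercept from the means
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()
print(beta0, beta1)  # approximately 2.05 and 2.99
```

These two closed-form expressions are exactly the minimizers of the sum of squared vertical distances described above.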

The general model with n predictor variables has the following form:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \epsilon$$

Here, x1, x2, ..., xn are the n predictors and y is the only response variable. Each coefficient βi measures the change in the y value associated with a one-unit change in xi, keeping all the other variables constant. While the simple linear regression model finds the straight line that best fits the data, a multiple linear regression model with, for example, two independent variables finds the plane that best fits the data; with more predictors, it is a hyperplane. The goal is to find the surface that best fits our predictors, in the sense of minimizing the overall squared distance between the surface and the observed responses.
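As an illustration of fitting a plane with two predictors, the following sketch uses NumPy's least squares solver on small synthetic data; the data and the true coefficients are invented for the example:

```python
import numpy as np

# Synthetic data generated from a known plane: y = 1 + 2*x1 + 3*x2
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0, 2.0])
y = 1.0 + 2.0 * x1 + 3.0 * x2

# Design matrix: a column of ones for the intercept, then the predictors
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least squares fit: finds the plane minimizing the squared residuals
beta, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # recovers [1.0, 2.0, 3.0], since the data are noise-free
```

With noisy data the recovered coefficients would only approximate the generating plane, but the fitting procedure is identical.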

To estimate the β coefficients, similarly to what we did in the simple linear regression case, we want to minimize the following quantity over all possible values of the intercept and slopes:

$$\sum_{i=1}^{m} \left( y_i - \beta_0 - \beta_1 x_{i1} - \beta_2 x_{i2} - \cdots - \beta_n x_{in} \right)^2$$

Here, m is the number of observations and $x_{ij}$ is the value of the j-th predictor for the i-th observation.
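This quantity, the residual sum of squares, can be evaluated for any candidate coefficient vector. The small sketch below (with made-up numbers) computes it directly:

```python
import numpy as np

# Hypothetical example: 4 observations, 2 predictors plus an intercept
X = np.array([[1.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [1.0, 1.0, 2.0],
              [1.0, 4.0, 0.0]])   # first column of ones is the intercept
y = np.array([9.0, 10.0, 8.0, 6.0])
beta = np.array([1.0, 1.0, 2.0])  # a candidate intercept and slopes

# Residual sum of squares for this candidate beta
residuals = y - X @ beta
rss = np.sum(residuals ** 2)
print(rss)  # 30.0
```

Least squares searches for the β that makes this number as small as possible.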

Just as we did in the case of simple linear regression, we can represent the previous equation in matrix form, as follows:

$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix} = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1n} \\ 1 & x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_n \end{pmatrix} + \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_m \end{pmatrix}$$

We can name the terms contained in this formula as follows:

$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1n} \\ 1 & x_{21} & \cdots & x_{2n} \\ \vdots & \vdots & & \vdots \\ 1 & x_{m1} & \cdots & x_{mn} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_n \end{pmatrix}, \quad \epsilon = \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_m \end{pmatrix}$$

This can be reexpressed using a condensed formulation:

$$Y = X\beta + \epsilon$$

Finally, to determine the intercept and slopes through the least squares method, we solve the previous equation with respect to β; the coefficients are estimated with the normal equation:

$$\hat{\beta} = (X^{T}X)^{-1}X^{T}Y$$

Basically, this is the matrix generalization of the equation we solved in the simple case. From it, we can calculate the intercept and slopes of the regression model.
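The normal equation translates directly into code. The sketch below, using an invented, noise-free design matrix, solves the linear system XᵀXβ = XᵀY rather than forming the inverse explicitly, which is the numerically preferable way to evaluate the same formula:

```python
import numpy as np

# Invented example: 6 observations, an intercept and 2 predictors
rng = np.random.default_rng(42)
X = np.column_stack([np.ones(6), rng.normal(size=6), rng.normal(size=6)])
beta_true = np.array([0.5, -1.0, 2.0])
y = X @ beta_true  # noise-free, so least squares recovers beta_true

# Normal equation: beta_hat = (X^T X)^{-1} X^T y,
# computed by solving (X^T X) beta_hat = X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # approximately [0.5, -1.0, 2.0]
```

Note that in practice library routines such as `np.linalg.lstsq` use a factorization of X instead of the normal equation, since XᵀX can be badly conditioned when the predictors are nearly collinear.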
