
Multiple linear regression concepts

So far, we have solved simple linear regression problems that study the relationship between a dependent variable, y, and an independent variable, x, based on the following regression equation:

y = \beta_0 + \beta_1 x + \epsilon

In this equation, the explanatory variable is represented by x and the response variable by y. To solve this problem, the least squares method was used: we find the best fit by minimizing the sum of the squared vertical distances from each data point to the line. As mentioned previously, it is rare for a response variable to depend on a single predictor; usually, it depends on at least two. In practice, we therefore have to create models in which the response variable depends on more than one predictor. These models are known as multiple linear regression models, and they are a straightforward generalization of single-predictor models: the dependent variable is related to two or more independent variables.

The general model for n variables is of the following form:

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \epsilon

Here, x_1, x_2, ..., x_n are the n predictors and y is the single response variable. Each coefficient \beta_i measures the change in y associated with a unit change in x_i, with all the other variables held constant. The simple linear regression model finds the straight line that best fits the data. A multiple linear regression model with, for example, two independent variables finds the plane that best fits the data; more generally, it finds a hyperplane. The goal is to find the surface that best fits the data, in the sense of minimizing the overall sum of squared distances between it and the observed responses.
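The "holding all other variables constant" reading of a coefficient can be sketched in a few lines. This is an illustrative example, not from the text: the coefficient values are hypothetical, and the point is only that increasing x_1 by one unit while x_2 is fixed changes the prediction by exactly \beta_1.

```python
# Sketch with hypothetical coefficients: a two-predictor linear model
# y = b0 + b1*x1 + b2*x2. Raising x1 by 1 while holding x2 fixed
# changes the prediction by exactly b1.
def predict(b0, b1, b2, x1, x2):
    """Return the fitted response of a two-predictor linear model."""
    return b0 + b1 * x1 + b2 * x2

b0, b1, b2 = 1.0, 2.0, -0.5          # hypothetical coefficients
y_before = predict(b0, b1, b2, x1=3.0, x2=4.0)
y_after = predict(b0, b1, b2, x1=4.0, x2=4.0)   # x1 + 1, x2 unchanged
print(y_after - y_before)            # equals b1 = 2.0
```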

To estimate \beta, just as we did in the simple linear regression case, we minimize the following sum of squared residuals over all possible values of the intercept and slopes:

RSS(\beta) = \sum_{i=1}^{m} \left( y_i - \beta_0 - \sum_{j=1}^{n} \beta_j x_{ij} \right)^2

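This quantity is simple enough to compute directly. The following sketch (with illustrative data and a hypothetical `rss` helper, not from the text) evaluates the sum of squared residuals for a candidate coefficient vector; each row of the data already carries a leading 1 for the intercept term.

```python
# Sketch: residual sum of squares for candidate coefficients on a
# tiny synthetic data set (values are illustrative only).
def rss(y, X_rows, beta):
    """Sum of squared residuals; each row of X_rows starts with a 1
    so that beta[0] plays the role of the intercept."""
    total = 0.0
    for yi, xi in zip(y, X_rows):
        pred = sum(b * x for b, x in zip(beta, xi))
        total += (yi - pred) ** 2
    return total

y = [3.0, 5.0, 7.0]
X_rows = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]  # intercept column + one predictor
print(rss(y, X_rows, [1.0, 2.0]))  # a perfect fit gives 0.0
```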
Just as we did in the case of simple linear regression, we can represent the regression model in matrix form, as follows:

\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} =
\begin{bmatrix}
1 & x_{11} & \cdots & x_{1n} \\
1 & x_{21} & \cdots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_{m1} & \cdots & x_{mn}
\end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_n \end{bmatrix} +
\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_m \end{bmatrix}

We can name the terms contained in this formula as follows:

y = \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix}, \quad
X = \begin{bmatrix}
1 & x_{11} & \cdots & x_{1n} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_{m1} & \cdots & x_{mn}
\end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_0 \\ \vdots \\ \beta_n \end{bmatrix}, \quad
\epsilon = \begin{bmatrix} \epsilon_1 \\ \vdots \\ \epsilon_m \end{bmatrix}

This can be reexpressed using a condensed formulation:

y = X\beta + \epsilon

Finally, to determine the intercept and slopes through the least squares method, we solve the previous equation with respect to \beta; the coefficients are estimated with the normal equation:

\hat{\beta} = (X^T X)^{-1} X^T y

Basically, this is the same kind of solution that we obtained previously: from \hat{\beta}, we can read off the intercept and the slopes of the fitted regression surface.
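The normal equation is straightforward to evaluate numerically. The following sketch, assuming NumPy and synthetic two-predictor data (the true coefficients 1.5, 2.0, and -0.7 are invented for illustration), estimates \beta and recovers values close to the truth. Note that solving the linear system X^T X \beta = X^T y is numerically preferable to explicitly inverting X^T X.

```python
import numpy as np

# Sketch: least squares via the normal equation beta_hat = (X^T X)^{-1} X^T y
# on synthetic data with two predictors and a known, invented truth.
rng = np.random.default_rng(0)
m = 50
x1 = rng.uniform(0, 10, m)
x2 = rng.uniform(0, 10, m)
y = 1.5 + 2.0 * x1 - 0.7 * x2 + rng.normal(0, 0.1, m)  # truth + small noise

X = np.column_stack([np.ones(m), x1, x2])       # design matrix with intercept column
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)    # solve the system rather than invert
print(beta_hat)  # close to [1.5, 2.0, -0.7]
```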
