
  • R Programming By Example
  • Omar Trejo Navarro
  • 2021-07-02 21:30:44

Checking linearity with scatter plots

A basic way of checking the linearity assumption is to make a scatter plot with the dependent variable on the y axis and an independent variable on the x axis. If the relation appears to be linear, we consider the assumption to hold. In any interesting problem it's extremely hard to find a scatter plot that shows a very clear linear relation, and if it does happen we should be a little suspicious and careful with the data. To avoid reinventing the wheel, we will use the plot_scatterplot() function we created in Chapter 2, Understanding Votes with Descriptive Statistics:

plot_scatterplot(
    data = data,
    var_x = "Age_18to44",
    var_y = "Proportion",
    var_color = FALSE,
    regression = TRUE
)
plot_scatterplot(
    data = data,
    var_x = "Students",
    var_y = "Proportion",
    var_color = FALSE,
    regression = TRUE
)

As we can see, the scatter plot on the left shows a clear linear relation: as the percentage of people between 18 and 44 years of age (Age_18to44) increases, the proportion of people in favor of leaving the EU (Proportion) decreases. On the right, we see that the relation between the percentage of students in a ward (Students) and Proportion is clearly linear in the initial area (where Students is between 0 and 20). After that the relation also seems to be linear, but it is polluted by observations with a very high percentage of students. However, we can still assume a linear relation between Students and Proportion.
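If we want a numeric complement to the visual check, one option is to compare a straight-line fit against a fit that also allows curvature. The following is only a minimal sketch under stated assumptions: the data frame `sim` is simulated for illustration (it is not the ward-level data used in this chapter), and the quadratic comparison is our own addition, not a step from the original analysis.

```r
# Simulated stand-in data (an assumption for illustration, NOT the real votes data).
set.seed(1)
n <- 200
students <- runif(n, 0, 40)
proportion <- 0.6 - 0.005 * students + rnorm(n, sd = 0.05)
sim <- data.frame(Students = students, Proportion = proportion)

# Straight-line fit versus a fit that also allows curvature.
fit_linear <- lm(Proportion ~ Students, data = sim)
fit_curved <- lm(Proportion ~ Students + I(Students^2), data = sim)

# F-test for the extra quadratic term: a large p-value means the data
# give no evidence that curvature improves on the straight line.
comparison <- anova(fit_linear, fit_curved)
print(comparison)
```

With real data we would replace `sim` with our own data frame. A small p-value in the second row would suggest that the straight line misses some curvature, which is worth a second look at the scatter plot.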

When we're doing a Multiple Linear Regression, as we are here, the assumption should be checked for the rest of the variables as well. We omit those checks here to save space, but we encourage you to do them. Keep in mind that it's very hard to find a linear relation in all of them, and this assumption is mostly an indicator of the predictive power of the variable in the regression. As long as the relation appears to be roughly linear, we should be all set.
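To check the remaining variables without writing one call per variable, we could loop over them. The sketch below uses base R graphics rather than our plot_scatterplot() function so that it runs on its own, and it relies on simulated data with hypothetical column names (Var_A and Var_B), so treat it as an illustration of the idea rather than as part of the analysis.

```r
# Simulated stand-in data with hypothetical predictor names (assumptions).
set.seed(2)
n <- 150
sim <- data.frame(
    Var_A = runif(n, 0, 100),
    Var_B = runif(n, 0, 100)
)
# Only Var_A is built to have a linear relation with Proportion.
sim$Proportion <- 0.7 - 0.003 * sim$Var_A + rnorm(n, sd = 0.05)

# Every column except the dependent variable is treated as a predictor.
predictors <- setdiff(names(sim), "Proportion")

# One scatter plot per predictor, with the fitted regression line on top.
for (var_x in predictors) {
    plot(sim[[var_x]], sim$Proportion, xlab = var_x, ylab = "Proportion")
    fit <- lm(reformulate(var_x, response = "Proportion"), data = sim)
    abline(fit, col = "red")
}

# Correlations give a quick numeric summary of each linear relation.
cors <- sapply(predictors, function(v) cor(sim[[v]], sim$Proportion))
print(round(cors, 2))
```

A correlation close to zero together with a shapeless cloud of points suggests the variable contributes little linear signal, which matches the idea that this assumption mostly indicates predictive power.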
