
Validating Insights Using Statistical Tests

Throughout the EDA, we noted several interesting patterns for further validation. It is now time to test whether those observations are genuine patterns or only appeared interesting due to random chance. The most straightforward way to approach this validation is to perform a set of statistical tests and measure the statistical significance of each pattern. There are many tests to choose from, and the right choice depends on the types of the independent and dependent variables. The following is a handy reference diagram that shows the types of statistical tests that we can perform to validate our observed patterns:

Figure 2.24: Validating dependent and independent variables

Let's collect all our interesting patterns into one place here:

  • The campaign outcome has a higher chance of yes when the employment variation rate is low.
  • The campaign outcome has a higher chance of yes when the euro interest rates are low.
  • Single clients have a higher chance of responding positively to the campaign.
  • Student and retired clients have a higher chance of responding positively to the campaign.
  • Cellular contacts have a higher chance of responding positively to the campaign.

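If we categorize these hypotheses, we can see that the dependent variable (the campaign outcome) is categorical in all cases. So, we should use a chi-squared test or logistic regression to validate our results. As a general rule, a chi-squared test suits a categorical independent variable (marital status, job, and contact type), while logistic regression suits a continuous one (the two rate variables).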
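To make the chi-squared test concrete, the following is a minimal sketch of Pearson's chi-squared test of independence for a 2x2 table, written with the standard library only. The function name chi_squared_2x2 and the counts are hypothetical and chosen purely for illustration; they are not taken from the campaign dataset. In practice, a library routine such as scipy.stats.chi2_contingency would normally be used instead. Note that it applies a continuity correction to 2x2 tables by default, so its result can differ slightly from this uncorrected version.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 contingency table.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, for which the chi-squared
    # survival function reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Made-up counts for illustration only (NOT from the dataset):
# rows = contact type (cellular, telephone), columns = outcome (yes, no)
observed_counts = [[30, 70],
                   [10, 90]]

stat, p = chi_squared_2x2(observed_counts)
print(f"chi2 = {stat:.3f}, p = {p:.5f}")
```

A small p-value (conventionally below 0.05) would suggest that the contact type and the campaign outcome are not independent. A large p-value would mean that the observed difference could plausibly be due to random chance.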

Let's perform these tests one by one.
