- Applied Supervised Learning with R
- Karthik Ramasubramanian Jojo Moolayil
- 2021-06-11 13:22:33
Validating Insights Using Statistical Tests
Throughout the EDA, we have collected and noted some interesting patterns for further validation. It is now time to test whether the patterns we observed are actually valid or only appeared interesting due to random chance. The most effective and straightforward way to approach this validation is to perform a set of statistical tests and measure the statistical significance of each pattern. There are many tests to choose from, and the right one depends on the types of the independent and dependent variables. The following reference diagram shows the types of statistical tests that we can perform to validate our observed patterns:

Figure 2.24: Validating dependent and independent variables
Let's collect all our interesting patterns into one place here:
- The campaign outcome has a higher chance of yes when the employment variation rate is low.
- The campaign outcome has a higher chance of yes when the euro interest rates are low.
- Single clients have a higher chance of responding positively to the campaign.
- Student and retired clients have a higher chance of responding positively to the campaign.
- Cellular contacts have a higher chance of responding positively to the campaign.
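If we categorize these hypotheses, we can see that the dependent variable (the campaign outcome) is categorical in every case. The independent variable is continuous in the first two hypotheses and categorical in the remaining three. So, we should use a chi-squared test where the independent variable is categorical and logistic regression where it is continuous to validate our results.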
Let's perform these tests one by one.
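Before turning to the actual data, the following is a minimal sketch of what the two kinds of test look like in R. It uses a small synthetic data frame rather than the campaign dataset, so the names toy, contact, rate, and y, as well as the simulated effect sizes, are assumptions made purely for illustration:

```r
# Synthetic stand-in for the campaign data.
# All column names and simulated effects here are hypothetical.
set.seed(42)
n <- 2000
toy <- data.frame(
  contact = sample(c("cellular", "telephone"), n, replace = TRUE),
  rate    = runif(n, min = 0.5, max = 5)  # stand-in for an interest-rate column
)

# Simulate an outcome that is more likely for cellular contacts and low rates
p_yes <- plogis(-0.5 + 0.8 * (toy$contact == "cellular") - 0.6 * toy$rate)
toy$y <- factor(ifelse(runif(n) < p_yes, "yes", "no"), levels = c("no", "yes"))

# Categorical independent variable vs. categorical outcome:
# chi-squared test of independence on the contingency table
chi_result <- chisq.test(table(toy$contact, toy$y))
print(chi_result$p.value)

# Continuous independent variable vs. binary outcome:
# logistic regression; the p-value of the rate coefficient tests the association
logit_fit <- glm(y ~ rate, data = toy, family = binomial(link = "logit"))
print(summary(logit_fit)$coefficients)
```

In both cases, a small p-value (conventionally below 0.05) suggests that the observed relationship is unlikely to be the result of random chance alone. For the logistic regression, the sign of the coefficient also indicates the direction of the relationship.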