Book title: Hands-On Data Science with R
Authors: Vitor Bianchi Lanzetta, Nataraj Dasgupta, Ricardo Anjoleto Farias
Updated: 2021-06-10 19:12:32

Statistical hypothesis testing

Imagine that you have estimated something from your data, but you don't know the true value for sure. Assuming that a given claim about that true value holds, what are the chances of getting the estimate you found, or an even more extreme one? Answering that question is hypothesis testing. Statistical hypothesis testing (or simply hypothesis testing, HT) is the name given to a set of well-known, practical methods used to make inferences with statistics. As long as you have data and you're willing to make inferences about it, the odds are that HT is the way to go. It can tackle a great variety of real-world problems.

Although it's usually better to work with experimental data, it's also possible to statistically test hypotheses using observational data. Exhibit A: economists all over the world are doing it. The effectiveness of a medical treatment, production quality (quality control), and guessing abilities can all be tested under the guidelines of HT. It's particularly easy to design a test to check whether that friend of yours has psychic powers or not. As Bob Rudis would say: "In God we trust; all others must bring data."

Have you met Rudis? https://rud.is/b/.
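To make the psychic-friend test concrete, here is a minimal sketch in R. The numbers are hypothetical and not taken from the book: suppose the friend calls 20 fair coin flips and gets 15 right. Under the null hypothesis of pure guessing (success probability 0.5), the code computes the chance of seeing 15 or more correct calls.

```r
# Hypothetical experiment (assumed numbers, for illustration only)
n_flips   <- 20   # coin flips the friend tried to call
n_correct <- 15   # calls the friend got right

# H0: the friend is guessing (p = 0.5); Ha: the friend does better (p > 0.5)
test <- binom.test(n_correct, n_flips, p = 0.5, alternative = "greater")
p_value <- test$p.value

# Same quantity computed directly: P(X >= 15) when X ~ Binomial(20, 0.5)
p_direct <- pbinom(n_correct - 1, n_flips, 0.5, lower.tail = FALSE)

p_value   # about 0.0207
```

A value this small says that a pure guesser would rarely do this well, which counts as evidence against the null hypothesis. It does not prove that the friend is psychic.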

This section brings you data and the very popular tests known as the z-test, t-test, and A/B test. We will also discuss the paradigm of hypothesis acceptance and how it has evolved over the years, but before going any further, let's get to know some concepts that are very likely to show up in any sort of HT:

Null hypothesis (H₀): Generally assumed to be true at the test's start. Usually, it states values for the mean (μ) or variance (σ²). Sometimes, it's phrased as a conclusion such as "the defendant is innocent" or "my friend can't read minds", but what is truly being tested is some parameter.
Alternative hypothesis (H_a): The counterpart of the null hypothesis. In the framework of Jerzy Neyman and Egon Pearson, an alternative hypothesis is always required (Ronald Fisher's earlier significance tests stated only a null hypothesis). If the null hypothesis is rejected, it's rejected in favor of the alternative hypothesis. Out in the real world, rejecting or failing to reject the null hypothesis has consequences, so it's always better to take them into account before deciding anything.
Type I and type II errors: To reject the null hypothesis when it is actually true is to commit a type I error. Conversely, to fail to reject the null hypothesis when it is actually false is to commit a type II error.
Significance level (α): To put it simply, it's a threshold. It can be seen as the greatest probability of committing a type I error that the user is willing to risk in order to reject the null hypothesis. A lower α makes for a more rigorous test, although for a fixed sample size it also makes type II errors more likely.
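The link between α and the two error types can be checked with a small simulation. The sketch below rests on assumptions that are not from the book: the sample size, the effect size of 0.5, the seed, and the choice of a one-sample t-test are all arbitrary. It runs many tests on data where the null hypothesis is true and counts how often it is wrongly rejected, then repeats the exercise on data where the null hypothesis is false.

```r
set.seed(42)        # arbitrary seed, only for reproducibility
alpha  <- 0.05      # significance level
n_sims <- 5000      # number of simulated experiments (assumed)
n_obs  <- 30        # observations per experiment (assumed)

# Case 1: H0 (mu = 0) is true, so every rejection is a type I error
p_null <- replicate(n_sims, t.test(rnorm(n_obs, mean = 0), mu = 0)$p.value)
type_1_rate <- mean(p_null < alpha)

# Case 2: H0 (mu = 0) is false (true mean is 0.5), so every
# failure to reject is a type II error
p_alt <- replicate(n_sims, t.test(rnorm(n_obs, mean = 0.5), mu = 0)$p.value)
type_2_rate <- mean(p_alt >= alpha)

c(type_1 = type_1_rate, type_2 = type_2_rate)
```

The type I rate should land close to α, since that is what the threshold controls. The type II rate depends on the sample size and on how far the truth is from the null hypothesis.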

For the latter concept, during the early days of modern statistics, researchers would fix it at a rigorous level (5% was a very popular number) and then infer something such as "we were able to reject the null hypothesis at the 5% significance level" or "we failed to reject the null hypothesis given the significance level of 5%."

Researchers prefer to say "we failed to reject the null hypothesis" instead of "we accepted the null hypothesis", because a lack of evidence against the null hypothesis does not prove it true.
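The classic decision rule can be sketched in a few lines of R. The measurements and the hypothesized mean of 5 are made up for illustration. The code fixes α, runs a one-sample t-test, and phrases the conclusion the way researchers usually do.

```r
alpha <- 0.05  # significance level fixed before looking at the data

# Hypothetical measurements (assumed values, not from the book)
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.7, 5.9, 5.3, 6.1)

# H0: mu = 5; Ha: mu != 5 (two-sided test is the t.test default)
result <- t.test(x, mu = 5)

# Compare the p-value with the threshold and word the conclusion carefully
decision <- if (result$p.value < alpha) "reject H0" else "fail to reject H0"
cat(sprintf("p-value = %.4f, so we %s at the %.0f%% significance level\n",
            result$p.value, decision, 100 * alpha))
```

Note that the only two possible outcomes are "reject" and "fail to reject"; the code never claims that the null hypothesis was accepted.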

Rules of thumb such as "reject the null hypothesis if you can do so at the 5% significance level" are still useful these days, but there are more rigorous and reasonable approaches. One alternative takes the likelihood of committing a type I error into account and estimates the expected costs and revenues of going with one hypothesis or the other.

Using the alternative approach, a doctor who suspects that a patient has worms is likely to prescribe deworming medicine rather than order a medical exam. The exam is much more expensive than the medicine, while the latter is inexpensive and has hardly any side effects. This approach requires more work. For the moment, let's try the classic approach while running a t-test.
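Before moving on, the deworming example can be turned into a rough expected-cost comparison. Every number below is a made-up assumption (the probability of infection, all of the costs, and the idea that the exam is perfectly accurate), so treat it as a sketch of the reasoning rather than a clinical model.

```r
# Hypothetical inputs (all assumed for illustration)
p_worms           <- 0.3   # doctor's belief that the patient has worms
cost_medicine     <- 5     # price of the deworming medicine
cost_side_effects <- 1     # expected cost of the medicine's mild side effects
cost_exam         <- 60    # price of the medical exam
cost_untreated    <- 200   # cost of leaving an actual infection untreated

# Expected cost of each action; the exam is assumed to be perfectly accurate
expected_cost <- c(
  treat_now  = cost_medicine + cost_side_effects,
  exam_first = cost_exam + p_worms * (cost_medicine + cost_side_effects),
  do_nothing = p_worms * cost_untreated
)

expected_cost
best_action <- names(which.min(expected_cost))
best_action
```

With these assumed numbers, prescribing the medicine right away has the lowest expected cost, which matches the doctor's choice described above. Different costs or a different prior belief could change the answer.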
