官术网_书友最值得收藏!

Permutation test

Suppose that we have two processes, A and B, and the variances of these two processes are known to be equal, though unknown. Three independent observations from process A result in yields of 18, 20, and 22, while three independent observations from process B gives yields of 24, 26, and 28. Under the assumption that the yield follows a normal distribution, we would like to test whether the means of processes A and B are the same. This is a suitable case for applying the t-test, since the number of observations is smaller. An application of the t.test function shows that the two means are different to each other, and this intuitively appears to be the case.

Now, the assumption under the null hypothesis is that the means are equal, and that the variance is unknown and assumed to be equal under the two processes. Consequently, we have a genuine reason to believe that the observations from process A might well have occurred in process B too, and vice versa. We can therefore swap one observation in process B with process A, and recompute the t-test. The process can be repeated for all possible permutations of the two samples. In general, if we have m samples from population 1 and n samples from population 2, we can have

Permutation test

different samples and as many tests. An overall test can be based on such permutation samples and such tests are called permutation tests.

For process A and B observations, we will first apply the t-test and then the permutation test. The t.test is available in the core stats package and the permutation t-test is taken from the perm package:

> library(perm)
> x <- c(18,20,22); y <- c(24,26,28)
> t.test(x,y,var.equal = TRUE)
Two Sample t-test
data:  x and y
t = -3.6742346, df = 4, p-value = 0.02131164
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -10.533915871  -1.466084129
sample estimates:
mean of x mean of y 
       20        26 

The smaller p-value suggests that the means of processes A and B are not equal. Consequently, we now apply the permutation test permTS from the perm package:

> permTS(x,y)
Exact Permutation Test (network algorithm)
data:  x and y
p-value = 0.1
alternative hypothesis: true mean x - mean y is not equal to 0
sample estimates:
mean x - mean y 
             -6 

The p-value is now at 0.1, which means that the permutation test concludes that the means of the processes are equal. Does this mean that the permutation test will always lead to this conclusion, contradicting the t-test? The answer is given in the next code segment:

> x2 <- c(16,18,20,22); y2 <- c(24,26,28,30)
> t.test(x2,y2,var.equal = TRUE)
Two Sample t-test
data:  x2 and y2
t = -4.3817805, df = 6, p-value = 0.004659215
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -12.46742939  -3.53257061
sample estimates:
mean of x mean of y 
       19        27 
> permTS(x2,y2)
Exact Permutation Test (network algorithm)
data:  x2 and y2
p-value = 0.02857143
alternative hypothesis: true mean x2 - mean y2 is not equal to 0
sample estimates:
mean x2 - mean y2 
               -8 
主站蜘蛛池模板: 余庆县| 中方县| 鄱阳县| 天峻县| 布拖县| 沅陵县| 安仁县| 邵阳县| 嘉鱼县| 桐乡市| 姜堰市| 临西县| 德钦县| 天长市| 邮箱| 雅安市| 霞浦县| 南木林县| 天峨县| 西藏| 昆明市| 峨边| 清远市| 民勤县| 卢龙县| 金乡县| 武邑县| 仙游县| 株洲县| 西和县| 乐安县| 广南县| 汪清县| 大荔县| 梁山县| 茂名市| 资中县| 永靖县| 周至县| 东乡族自治县| 东台市|