- Learning Quantitative Finance with R
- Dr. Param Jeet, Prashant Vats
- 2021-07-09 19:06:53
Hypothesis testing
Hypothesis testing is used to reject or retain a hypothesis based upon the measurement of an observed sample. We will not be going into theoretical aspects but will be discussing how to implement the various scenarios of hypothesis testing in R.
Lower tail test of population mean with known variance
The null hypothesis is given by $\mu \geq \mu_0$, where $\mu_0$ is the hypothesized lower bound of the population mean.
Let us assume a scenario where an investor assumes that the mean of daily returns of a stock since inception is greater than $10. The average of a 30-day sample of daily returns is $9.9. Assume the population standard deviation is 1.1. Can we reject the null hypothesis at the .05 significance level?
Now let us calculate the test statistic $z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$, which can be computed by the following code in R:
> xbar = 9.9
> mu0 = 10
> sig = 1.1
> n = 30
> z = (xbar-mu0)/(sig/sqrt(n))
> z
Here:
xbar: Sample mean
mu0: Hypothesized value
sig: Standard deviation of population
n: Sample size
z: Test statistic
This gives the value of z, the test statistic:
[1] -0.4979296
Now let us find the critical value at the 0.05 significance level. It can be computed by the following code:
> alpha = .05
> z.alpha = qnorm(1-alpha)
> -z.alpha
This gives the following output:
[1] -1.644854
Since the value of the test statistic (-0.4979296) is greater than the critical value (-1.644854), we fail to reject the null hypothesis claim that the return is greater than $10.
Instead of the critical value test, we can use the pnorm function to compute the lower tail p-value of the test statistic. This can be computed by the following code:
> pnorm(z)
This gives the following output:
[1] 0.3092668
Since the p-value is greater than 0.05, we fail to reject the null hypothesis.
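The same decision can be read in the units of the return itself. The following is a minimal sketch, using the values of this example, that computes the sample mean below which the null hypothesis would be rejected. The name xbar.crit is introduced here for illustration and does not appear in the book.

```r
# Values from the example above
mu0   <- 10     # hypothesized lower bound of the population mean
sig   <- 1.1    # population standard deviation
n     <- 30     # sample size
alpha <- 0.05   # significance level

# Rejection threshold for the sample mean in a lower tail test:
# sample means below this value lead to rejecting the null hypothesis
xbar.crit <- mu0 - qnorm(1 - alpha) * sig / sqrt(n)
xbar.crit

# The observed sample mean 9.9 is not below the threshold,
# which agrees with the decision not to reject
9.9 < xbar.crit
```

The threshold comes out at roughly 9.67, so an observed mean of 9.9 is not low enough to reject the claim.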
Upper tail test of population mean with known variance
The null hypothesis is given by $\mu \leq \mu_0$, where $\mu_0$ is the hypothesized upper bound of the population mean.
Let us assume a scenario where an investor assumes that the mean of daily returns of a stock since inception is at most $5. The average of a 30-day sample of daily returns is $5.1. Assume the population standard deviation is 0.25. Can we reject the null hypothesis at the .05 significance level?
Now let us calculate the test statistic z, which can be computed by the following code in R:
> xbar = 5.1
> mu0 = 5
> sig = .25
> n = 30
> z = (xbar-mu0)/(sig/sqrt(n))
> z
Here:
xbar: Sample mean
mu0: Hypothesized value
sig: Standard deviation of population
n: Sample size
z: Test statistic
It gives 2.19089 as the value of the test statistic. Now let us calculate the critical value at the .05 significance level, which is given by the following code:
> alpha = .05
> z.alpha = qnorm(1-alpha)
> z.alpha
This gives 1.644854, which is less than the value computed for the test statistic. Hence we reject the null hypothesis claim.
Also, the p-value of the test statistic is given as follows:
> pnorm(z, lower.tail=FALSE)
This gives 0.01422987, which is less than 0.05, and hence we reject the null hypothesis.
Two-tailed test of population mean with known variance
The null hypothesis is given by $\mu = \mu_0$, where $\mu_0$ is the hypothesized value of the population mean.
Let us assume a scenario where the mean of daily returns of a stock last year is $2. The average of a 30-day sample of daily returns this year is $1.5. Assume the population standard deviation is .1. Can we reject the null hypothesis that there is no significant difference in returns between this year and last year at the .05 significance level?
Now let us calculate the test statistic z, which can be computed by the following code in R:
> xbar = 1.5
> mu0 = 2
> sig = .1
> n = 30
> z = (xbar-mu0)/(sig/sqrt(n))
> z
This gives the value of the test statistic as -27.38613.
Now let us find the critical values for comparing the test statistic at the .05 significance level. They are given by the following code:
> alpha = .05
> z.half.alpha = qnorm(1-alpha/2)
> c(-z.half.alpha, z.half.alpha)
This gives the values -1.959964 and 1.959964. Since the value of the test statistic does not lie within the range (-1.959964, 1.959964), we reject the null hypothesis that there is no significant difference in returns between this year and last year at the .05 significance level.
The two-tailed p-value is twice the tail probability beyond the test statistic. Because z is negative here, we double the lower tail probability:
> 2*pnorm(z)
This gives a value less than .05, so we reject the null hypothesis.
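The three known-variance scenarios differ only in which tail is used for the p-value. As a minimal sketch, they can be wrapped in one function. The helper z.test.mean and its argument names are hypothetical, introduced here for illustration, and are not part of base R or of the book's code.

```r
# Hypothetical helper (not from the book): z-test for a population mean
# from summary statistics, with known population standard deviation
z.test.mean <- function(xbar, mu0, sig, n,
                        alternative = c("two.sided", "less", "greater"),
                        alpha = 0.05) {
  alternative <- match.arg(alternative)
  z <- (xbar - mu0) / (sig / sqrt(n))          # test statistic
  p.value <- switch(alternative,
    less      = pnorm(z),                      # lower tail test
    greater   = pnorm(z, lower.tail = FALSE),  # upper tail test
    two.sided = 2 * pnorm(-abs(z))             # two-tailed test
  )
  list(statistic = z, p.value = p.value, reject = p.value < alpha)
}

# The three scenarios worked through above
z.test.mean(9.9, 10, 1.1, 30, alternative = "less")
z.test.mean(5.1, 5, 0.25, 30, alternative = "greater")
z.test.mean(1.5, 2, 0.1, 30, alternative = "two.sided")
```

Using -abs(z) in the two-tailed branch makes the p-value correct whether the test statistic is negative or positive.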
In all the preceding scenarios, the population variance is known, so we use the normal distribution for hypothesis testing. In the next scenarios, the population variance is not given, so we use the sample standard deviation and the t distribution to test the hypothesis.
Lower tail test of population mean with unknown variance
The null hypothesis is given by $\mu \geq \mu_0$, where $\mu_0$ is the hypothesized lower bound of the population mean.
Let us assume a scenario where an investor assumes that the mean of daily returns of a stock since inception is greater than $1. The average of a 30-day sample of daily returns is $.9. Assume the sample standard deviation is 0.1. Can we reject the null hypothesis at the .05 significance level?
In this scenario, we can compute the test statistic $t = (\bar{x} - \mu_0)/(s/\sqrt{n})$ by executing the following code:
> xbar = .9
> mu0 = 1
> sig = .1
> n = 30
> t = (xbar-mu0)/(sig/sqrt(n))
> t
Here:
xbar: Sample mean
mu0: Hypothesized value
sig: Standard deviation of sample
n: Sample size
t: Test statistic
This gives the value of the test statistic as -5.477226. Now let us compute the critical value at the .05 significance level. This is given by the following code:
> alpha = .05
> t.alpha = qt(1-alpha, df=n-1)
> -t.alpha
We get the value -1.699127. Since the value of the test statistic is less than the critical value, we reject the null hypothesis claim.
Instead of comparing the test statistic with the critical value, we can use the p-value associated with the test statistic, which is given as follows:
> pt(t, df=n-1)
This results in a value less than .05, so we can reject the null hypothesis claim.
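When the raw returns are available rather than only their summary statistics, the built-in t.test function performs the same lower tail test. The sketch below checks this on simulated data. The returns vector is generated with rnorm purely for illustration and is not the book's sample.

```r
set.seed(42)                                # fixed seed for reproducibility
returns <- rnorm(30, mean = 0.9, sd = 0.1)  # simulated data, an assumption
mu0 <- 1                                    # hypothesized lower bound

# Manual computation, following the steps in the text
n        <- length(returns)
t.manual <- (mean(returns) - mu0) / (sd(returns) / sqrt(n))
p.manual <- pt(t.manual, df = n - 1)

# Built-in equivalent: one-sample lower tail t-test
res <- t.test(returns, mu = mu0, alternative = "less")

# Both comparisons should report TRUE
all.equal(unname(res$statistic), t.manual)
all.equal(res$p.value, p.manual)
```

For the upper tail and two-tailed scenarios, the alternative argument of t.test takes "greater" and "two.sided" respectively.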
Upper tail test of population mean with unknown variance
The null hypothesis is given by $\mu \leq \mu_0$, where $\mu_0$ is the hypothesized upper bound of the population mean.
Let us assume a scenario where an investor assumes that the mean of daily returns of a stock since inception is at most $3. The average of a 30-day sample of daily returns is $3.1. Assume the sample standard deviation is .2. Can we reject the null hypothesis at the .05 significance level?
Now let us calculate the test statistic t, which can be computed by the following code in R:
> xbar = 3.1
> mu0 = 3
> sig = .2
> n = 30
> t = (xbar-mu0)/(sig/sqrt(n))
> t
Here:
xbar: Sample mean
mu0: Hypothesized value
sig: Standard deviation of sample
n: Sample size
t: Test statistic
This gives the value 2.738613 for the test statistic. Now let us find the critical value associated with the .05 significance level. It is given by the following code:
> alpha = .05
> t.alpha = qt(1-alpha, df=n-1)
> t.alpha
Since the critical value 1.699127 is less than the value of the test statistic, we reject the null hypothesis claim.
Also, the p-value associated with the test statistic is given as follows:
> pt(t, df=n-1, lower.tail=FALSE)
This is less than .05. Hence the null hypothesis claim is rejected.
Two-tailed test of population mean with unknown variance
The null hypothesis is given by $\mu = \mu_0$, where $\mu_0$ is the hypothesized value of the population mean.
Let us assume a scenario where the mean of daily returns of a stock last year is $2. The average of a 30-day sample of daily returns this year is $1.9. Assume the sample standard deviation is .1. Can we reject the null hypothesis that there is no significant difference in returns between this year and last year at the .05 significance level?
Now let us calculate the test statistic t, which can be computed by the following code in R:
> xbar = 1.9
> mu0 = 2
> sig = .1
> n = 30
> t = (xbar-mu0)/(sig/sqrt(n))
> t
This gives -5.477226 as the value of the test statistic. Now let us find the critical value range for comparison, which is given by the following code:
> alpha = .05
> t.half.alpha = qt(1-alpha/2, df=n-1)
> c(-t.half.alpha, t.half.alpha)
This gives the range (-2.04523, 2.04523). Since the value of the test statistic, -5.477226, lies outside this range, we reject the claim of the null hypothesis.
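As in the known-variance case, the two-tailed decision can also be made with a p-value. The sketch below is a minimal illustration using the values of this example. The names t.stat and p.value are introduced here (t.stat avoids reusing the name of R's transpose function t) and are not from the book.

```r
# Values from the example above
xbar <- 1.9    # sample mean
mu0  <- 2      # hypothesized value of the population mean
sig  <- 0.1    # sample standard deviation
n    <- 30     # sample size

# Test statistic
t.stat <- (xbar - mu0) / (sig / sqrt(n))

# Two-tailed p-value: twice the tail probability beyond |t|
p.value <- 2 * pt(-abs(t.stat), df = n - 1)
p.value

# TRUE means the null hypothesis is rejected at the .05 level
p.value < 0.05
```

The p-value is far below .05, which agrees with the critical value comparison.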