- Learning Quantitative Finance with R
- Dr. Param Jeet Prashant Vats
- 2021-07-09 19:06:52
Sampling
Datasets in finance can be very large, and building a model on the full data can be very time-consuming. Every later adjustment to the model repeats that cost because of the volume of data. It is therefore often better to draw a random or proportionate sample from the population data, on which model building is easier and faster. This section discusses how to select a random sample and a stratified sample from the data, which plays a critical role when a model is built on sample data drawn from the population data.
Random sampling
In a random sample, every observation in the population has an equal chance of being selected. Sampling can be done in two ways: without replacement or with replacement.
A random sample without replacement can be done by executing the following code:
> RandomSample <- Sampledata[sample(1:nrow(Sampledata), 10,
+ replace=FALSE),]
This generates the following output:
Figure 2.6: Table showing a random sample without replacement
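The call above can be made reproducible with a small sketch. The data frame built here is only a hypothetical stand-in for Sampledata: the column names Flag and Sentiments are borrowed from later in this section, while the values and the Return column are invented for illustration. set.seed() is an addition so that the same rows are drawn on every run.

```r
# Hypothetical stand-in for Sampledata: 40 rows with invented values
set.seed(42)  # fix the random number generator so the draw is repeatable
Sampledata <- data.frame(
  Flag       = rep(c(0, 1), each = 20),
  Sentiments = rep(c("Bad", "Good"), times = 20),
  Return     = round(rnorm(40), 3)
)

# Draw 10 distinct row indices, then subset the data frame with them
idx <- sample(seq_len(nrow(Sampledata)), 10, replace = FALSE)
RandomSample <- Sampledata[idx, ]

# Without replacement, no row index can appear twice
no_repeats <- anyDuplicated(idx) == 0
no_repeats
```

Keeping the indices in a separate variable makes it easy to check the draw, or to build the complementary set of rows with Sampledata[-idx, ].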
A random sample with replacement can be drawn by executing the following code. Replacement means that an observation can be drawn more than once: once a particular observation is selected, it is put back into the population and can be selected again:
> RandomSample <- Sampledata[sample(1:nrow(Sampledata), 10,
+ replace=TRUE),]
This generates the following output:
Figure 2.7: Table showing random sampling with replacement
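The practical difference between the two modes shows up most clearly when the requested sample is larger than the population. The sketch below uses a hypothetical population of only 5 row indices (an assumption made for illustration, not the book's data) and asks for 10 draws.

```r
set.seed(1)
n <- 5  # hypothetical population size

# With replacement, 10 draws from 5 rows are allowed; repeats are unavoidable
with_repl <- sample(seq_len(n), 10, replace = TRUE)
table(with_repl)  # counts above 1 mark rows that were drawn more than once

# Without replacement, the same request is an error: only 5 rows exist
res <- try(sample(seq_len(n), 10, replace = FALSE), silent = TRUE)
failed <- inherits(res, "try-error")
failed
```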
Stratified sampling
In stratified sampling, we divide the population into separate groups, called strata. Then, a probability sample (often a simple random sample) is drawn from each group. Stratified sampling has several advantages over simple random sampling. In particular, when observations within each stratum are similar to one another, a stratified sample can often reach the same precision with a smaller sample size, or better precision with the same sample size.
Now let us see how many groups exist by using Flag and Sentiments as given in the following code:
> library(sampling)
> table(Sampledata$Flag, Sampledata$Sentiments)
The output is as follows:
Figure 2.8: Table showing the frequencies across different groups
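The frequency table also gives a starting point for choosing how many observations to take from each group. The sketch below uses a hypothetical 40-row stand-in for Sampledata (the counts are invented) and computes a proportionate allocation for an assumed total sample of 20.

```r
# Hypothetical stand-in for Sampledata with unequal group sizes
Sampledata <- data.frame(
  Flag       = rep(c(0, 1), times = c(18, 22)),
  Sentiments = c(rep(c("Bad", "Good"), times = c(8, 10)),
                 rep(c("Bad", "Good"), times = c(12, 10)))
)

# Each cell of the table is one stratum; its count is the largest
# sample that can be drawn from it without replacement
counts <- table(Sampledata$Flag, Sampledata$Sentiments)
counts

# Proportionate allocation: share of each stratum times the total sample size
total_n <- 20  # assumed overall sample size
alloc <- round(prop.table(counts) * total_n)
alloc
```

With other counts the rounded allocations may not add up exactly to the requested total, so the result is worth checking with sum(alloc) before it is used.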
Now you can select the sample from the different groups according to your requirement:
> Stratsubset <- strata(Sampledata, c("Flag","Sentiments"), size=c(6,5,
+ 4,3), method="srswor")
> Stratsubset
The output is as follows:
Figure 2.9: Table showing output for stratified sampling
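For readers who want to see the mechanics, the same idea can be sketched in base R. This is not how strata() is implemented; it is only an illustration of simple random sampling without replacement inside each group. The data frame, the Return column, and the per-stratum sizes are hypothetical.

```r
set.seed(7)
# Hypothetical stand-in for Sampledata: 10 rows in each Flag/Sentiments group
Sampledata <- data.frame(
  Flag       = rep(c(0, 1), each = 20),
  Sentiments = rep(c("Bad", "Good"), times = 20),
  Return     = round(rnorm(40), 3)
)

# One vector of row indices per Flag/Sentiments combination
groups <- split(seq_len(nrow(Sampledata)),
                list(Sampledata$Flag, Sampledata$Sentiments))

# Hypothetical sample size for each stratum, named to match names(groups)
sizes <- c("0.Bad" = 6, "1.Bad" = 5, "0.Good" = 4, "1.Good" = 3)

# Simple random sample without replacement inside every stratum
picked <- unlist(lapply(names(groups), function(g) {
  rows <- groups[[g]]
  rows[sample.int(length(rows), sizes[[g]])]
}))
StratSample <- Sampledata[picked, ]

# The sampled counts per group match the requested sizes
table(StratSample$Flag, StratSample$Sentiments)
```

Naming the sizes by stratum avoids relying on the order in which groups happen to appear in the data.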