官术网_书友最值得收藏!

Waveform

This dataset is an example of a simulation study. Here, we have twenty-one variables as input or independent variables, and a class variable referred to as classes. The data is generated using the mlbench.waveform function from the mlbench R package. For more details, refer to the following link: ftp://ftp.ics.uci.edu/pub/machine-learning-databases. We will simulate 5,000 observations for this dataset. As mentioned earlier, the set.seed function guarantees reproducibility. Since we are solving binary classification problems, we will reduce the three classes generated by the waveform function to two, and then partition the data into training and testing parts for model building and testing purposes:

> library(mlbench)
> set.seed(123)
> Waveform <- mlbench.waveform(5000)
> table(Waveform$classes)
   1    2    3 
1687 1718 1595 
> Waveform$classes <- ifelse(Waveform$classes!=3,1,2)
> Waveform_DF <- data.frame(cbind(Waveform$x,Waveform$classes)) # Data Frame
> names(Waveform_DF) <- c(paste0("X",".",1:21),"Classes")
> Waveform_DF$Classes <- as.factor(Waveform_DF$Classes)
> table(Waveform_DF$Classes)
   1    2 
3405 1595 

The R function mlbench.waveform creates a new object of the mlbench class. Since it consists of two sub-parts in x and classes, we will convert it into data.frame following some further manipulations. The cbind function binds the two objects x (a matrix) and classes (a numeric vector) into a single matrix. The data.frame function converts the matrix object into a data frame, which is the class desired for the rest of the program.

After partitioning the data, we will create the required formula for the waveform dataset:

> set.seed(12345)
> Train_Test <- sample(c("Train","Test"),nrow(Waveform_DF),replace = TRUE,
+ prob = c(0.7,0.3))
> head(Train_Test)
[1] "Test"  "Test"  "Test"  "Test"  "Train" "Train"
> Waveform_DF_Train <- Waveform_DF[Train_Test=="Train",]
> Waveform_DF_TestX <- within(Waveform_DF[Train_Test=="Test",],rm(Classes))
> Waveform_DF_TestY <- Waveform_DF[Train_Test=="Test","Classes"]
> Waveform_DF_Formula <- as.formula("Classes~.")
主站蜘蛛池模板: 汝南县| 卓尼县| 梅河口市| 嘉善县| 布拖县| 库伦旗| 炎陵县| 工布江达县| 江西省| 旌德县| 鄱阳县| 永修县| 登封市| 清流县| 宝山区| 金坛市| 吴江市| 荣成市| 平乡县| 谷城县| 孙吴县| 绿春县| 嘉定区| 汉川市| 石屏县| 栾城县| 永春县| 应城市| 凤山县| 涿鹿县| 商水县| 青河县| 特克斯县| 东平县| 惠安县| 常山县| 开平市| 尼木县| 哈巴河县| 延庆县| 江都市|