Waveform

This dataset is an example of a simulation study. It has twenty-one input (independent) variables and a class variable named classes. The data is generated with the mlbench.waveform function from the mlbench R package. For more details, refer to the following link: ftp://ftp.ics.uci.edu/pub/machine-learning-databases. We will simulate 5,000 observations for this dataset. As mentioned earlier, calling set.seed before the simulation makes the results reproducible. Since we are solving binary classification problems, we will reduce the three classes generated by the waveform function to two by merging classes 1 and 2 into a single class and relabeling class 3 as class 2. We will then partition the data into training and testing parts for model building and testing purposes:

> library(mlbench)
> set.seed(123)
> Waveform <- mlbench.waveform(5000)
> table(Waveform$classes)
   1    2    3 
1687 1718 1595 
> Waveform$classes <- ifelse(Waveform$classes!=3,1,2)
> Waveform_DF <- data.frame(cbind(Waveform$x,Waveform$classes)) # Data Frame
> names(Waveform_DF) <- c(paste0("X",".",1:21),"Classes")
> Waveform_DF$Classes <- as.factor(Waveform_DF$Classes)
> table(Waveform_DF$Classes)
   1    2 
3405 1595 

The R function mlbench.waveform creates a new object of the mlbench class. Since this object consists of two components, x and classes, we convert it into a data frame in a few steps. The cbind function binds x (a matrix) and classes (a numeric vector after the ifelse recoding) into a single matrix. The data.frame function then converts that matrix into a data frame, which is the class required for the rest of the program.
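
The conversion steps can be tried on a small stand-in object. The following is a minimal sketch that assumes a hypothetical list named toy with the same two components, x and classes; the values are made up for illustration and are not output from mlbench.waveform.

```r
# Hypothetical stand-in for an mlbench object: a numeric matrix x and a
# three-level factor classes (values are invented for illustration).
toy <- list(
  x = matrix(c(0.5, 1.2, -0.3, 2.1, 0.0, 1.7), nrow = 3, ncol = 2),
  classes = factor(c(1, 3, 2), levels = 1:3)
)

# Merge classes 1 and 2 into class 1 and relabel class 3 as class 2.
# ifelse() returns a plain numeric vector here, not a factor.
toy$classes <- ifelse(toy$classes != 3, 1, 2)

# cbind() of a matrix and a numeric vector yields a numeric matrix ...
toy_matrix <- cbind(toy$x, toy$classes)

# ... which data.frame() turns into a data frame with one column per variable.
toy_df <- data.frame(toy_matrix)
names(toy_df) <- c(paste0("X", ".", 1:2), "Classes")
toy_df$Classes <- as.factor(toy_df$Classes)

str(toy_df)
```

Converting Classes to a factor at the end matters because most R classification functions treat a numeric response as a regression target.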

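Next, we partition the data into training and testing parts, assigning each observation to the training part with probability 0.7, and then create the required formula for the waveform dataset: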

> set.seed(12345)
> Train_Test <- sample(c("Train","Test"),nrow(Waveform_DF),replace = TRUE,
+ prob = c(0.7,0.3))
> head(Train_Test)
[1] "Test"  "Test"  "Test"  "Test"  "Train" "Train"
> Waveform_DF_Train <- Waveform_DF[Train_Test=="Train",]
> Waveform_DF_TestX <- within(Waveform_DF[Train_Test=="Test",],rm(Classes))
> Waveform_DF_TestY <- Waveform_DF[Train_Test=="Test","Classes"]
> Waveform_DF_Formula <- as.formula("Classes~.")
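
Because each row is labeled "Train" or "Test" independently, the resulting split is only approximately 70/30. The sketch below uses a hypothetical toy data frame (not the waveform data) to show how the realized proportions can be checked and how the dot in Classes~. expands to all remaining columns. The names toy_df, labels, toy_train, toy_test, and toy_formula are illustrative only.

```r
set.seed(12345)

# Hypothetical toy data: two predictors and a two-level class label.
toy_df <- data.frame(
  X.1 = rnorm(1000),
  X.2 = rnorm(1000),
  Classes = factor(sample(1:2, 1000, replace = TRUE))
)

# Label each row independently, with probability 0.7 for "Train".
labels <- sample(c("Train", "Test"), nrow(toy_df), replace = TRUE,
                 prob = c(0.7, 0.3))

# Realized shares are close to, but generally not exactly, 0.7 and 0.3.
print(prop.table(table(labels)))

toy_train <- toy_df[labels == "Train", ]
toy_test  <- toy_df[labels == "Test", ]

# The dot stands for every column other than the response.
toy_formula <- as.formula("Classes~.")
print(attr(terms(toy_formula, data = toy_train), "term.labels"))
```

If an exact 70/30 split is needed, sampling row indices without replacement is a common alternative; the independent-label approach shown here matches the partitioning used in the text.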