Waveform

This dataset is an example of a simulation study. It has twenty-one input (independent) variables and a class variable named classes. The data is generated using the mlbench.waveform function from the mlbench R package. For more details, refer to the following link: ftp://ftp.ics.uci.edu/pub/machine-learning-databases. We will simulate 5,000 observations for this dataset. As mentioned earlier, the set.seed function makes the simulation reproducible. Since we are solving binary classification problems, we will reduce the three classes generated by the waveform function to two by merging classes 1 and 2 into a single class and relabeling class 3 as class 2. The data is partitioned into training and testing parts, for model building and testing purposes, later in this section. We first generate the data and recode the classes:

> library(mlbench)
> set.seed(123)
> Waveform <- mlbench.waveform(5000)
> table(Waveform$classes)
   1    2    3 
1687 1718 1595 
> Waveform$classes <- ifelse(Waveform$classes!=3,1,2)
> Waveform_DF <- data.frame(cbind(Waveform$x,Waveform$classes)) # Data Frame
> names(Waveform_DF) <- c(paste0("X",".",1:21),"Classes")
> Waveform_DF$Classes <- as.factor(Waveform_DF$Classes)
> table(Waveform_DF$Classes)
   1    2 
3405 1595 

The R function mlbench.waveform creates a new object of the mlbench class. Since the object consists of two components, x and classes, we convert it into a data.frame after some further manipulation. The cbind function binds x (a matrix) and classes (a numeric vector once the ifelse recoding has been applied) into a single matrix. The data.frame function then converts the matrix into a data frame, which is the class needed for the rest of the program. Finally, as.factor turns the Classes column into a factor so that it is treated as a categorical outcome.
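The conversion steps described above can be tried on a small stand-in object that needs only base R. This is a minimal sketch: the list toy and its values are hypothetical and only mimic the x and classes components, they are not output from mlbench.waveform.

```r
# Toy stand-in for the object returned by mlbench.waveform:
# a numeric matrix `x` and a factor `classes` (hypothetical values).
set.seed(1)
toy <- list(
  x = matrix(rnorm(12), nrow = 6, ncol = 2),
  classes = factor(c(1, 2, 3, 3, 1, 2))
)

# Merge classes 1 and 2 into class 1 and relabel class 3 as class 2.
# ifelse() returns a numeric vector here, not a factor.
toy$classes <- ifelse(toy$classes != 3, 1, 2)

# cbind() gives a 6 x 3 numeric matrix; data.frame() converts it.
toy_df <- data.frame(cbind(toy$x, toy$classes))
names(toy_df) <- c(paste0("X", ".", 1:2), "Classes")
toy_df$Classes <- as.factor(toy_df$Classes)

str(toy_df)            # two numeric columns and one factor column
table(toy_df$Classes)  # counts per recoded class
```

Because cbind produces a matrix, every column must share one type, which is why the class labels travel as numbers and are converted to a factor only after the data frame exists.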

Next, we partition the data by randomly labeling each observation as Train or Test with probabilities 0.7 and 0.3, and then create the required formula for the waveform dataset:

> set.seed(12345)
> Train_Test <- sample(c("Train","Test"),nrow(Waveform_DF),replace = TRUE,
+ prob = c(0.7,0.3))
> head(Train_Test)
[1] "Test"  "Test"  "Test"  "Test"  "Train" "Train"
> Waveform_DF_Train <- Waveform_DF[Train_Test=="Train",]
> Waveform_DF_TestX <- within(Waveform_DF[Train_Test=="Test",],rm(Classes))
> Waveform_DF_TestY <- Waveform_DF[Train_Test=="Test","Classes"]
> Waveform_DF_Formula <- as.formula("Classes~.")
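Because the Train and Test labels are drawn independently for each row, the split is only approximately 70/30. The following sketch, which uses base R only, checks the realised shares; the names n, labels, and split_share are illustrative and do not appear in the original program.

```r
set.seed(12345)
n <- 5000  # same number of rows as the simulated data

# Draw one label per row, as in the partitioning step.
labels <- sample(c("Train", "Test"), n, replace = TRUE,
                 prob = c(0.7, 0.3))

# Realised shares are close to, but generally not exactly, 0.7 and 0.3.
split_share <- prop.table(table(labels))
print(round(split_share, 3))

# Every row receives exactly one label, so the two parts cover the data.
stopifnot(sum(labels == "Train") + sum(labels == "Test") == n)
```

If an exact 70/30 split is required, one option is to sample row indices without replacement instead of sampling labels.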