Building a deep learning model
Now that we have covered the basics, let's look at building our first true deep learning model! We will use the UCI HAR dataset that we used in Chapter 2, Training a Prediction Model. The following code does some data preparation: it loads the data and selects only the columns that store mean and standard deviation values (those with mean() or std() in the column name). The y values range from 1 to 6; we will subtract one so that the range is 0 to 5. The code for this section is in Chapter4/uci_har.R. It requires the UCI HAR dataset to be in the data folder; download it from https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones and unzip it into the data folder:
library(mxnet)
# load the training and test data; the y_*.txt files hold the activity labels
train.x <- read.table("../data/UCI HAR Dataset/train/X_train.txt")
train.y <- read.table("../data/UCI HAR Dataset/train/y_train.txt")[[1]]
test.x <- read.table("../data/UCI HAR Dataset/test/X_test.txt")
test.y <- read.table("../data/UCI HAR Dataset/test/y_test.txt")[[1]]
# indices of the columns whose names contain mean() or std()
features <- read.table("../data/UCI HAR Dataset/features.txt")
meanSD <- grep("mean\\(\\)|std\\(\\)", features[, 2])
# shift the labels from 1-6 to 0-5
train.y <- train.y-1
test.y <- test.y-1
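Before going further, it is worth confirming that the data loaded as expected. The following sanity checks are an illustrative sketch rather than part of Chapter4/uci_har.R; they verify how many feature columns were selected and how the classes are balanced:
length(meanSD)                        # should be 66 mean()/std() columns
dim(train.x)                          # should be 7352 samples x 561 columns before selection
table(train.y)                        # counts per activity class, now coded 0 to 5
round(prop.table(table(train.y)), 2)  # class proportions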
Next, we will transpose the data and convert it into a matrix. MXNet expects the input in a features x samples layout (each column is one example) rather than the usual samples x features layout:
# keep only the mean()/std() columns and transpose so each column is one sample
train.x <- t(train.x[,meanSD])
test.x <- t(test.x[,meanSD])
# convert the data frames to numeric matrices for MXNet
train.x <- data.matrix(train.x)
test.x <- data.matrix(test.x)
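As a further illustrative check (again, not in the original script), the matrices should now have features in rows and samples in columns, which is the layout the training call below expects:
dim(train.x)   # expected: 66 7352 (features x samples)
dim(test.x)    # expected: 66 2947
class(train.x) # a numeric matrix after data.matrix()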
The next step is to define the computation graph. We create a placeholder for the data and create two fully connected (or dense) layers followed by relu activations. The first layer has 64 nodes and the second layer has 32 nodes. We create a final fully-connected layer with six nodes – the number of distinct classes in our y variable. We use a softmax activation to convert the numbers from the last six nodes into probabilities for each class:
# input placeholder
data <- mx.symbol.Variable("data")
# first hidden layer: 64 units followed by ReLU
fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=64)
act1 <- mx.symbol.Activation(fc1, name="relu1", act_type="relu")
# second hidden layer: 32 units followed by ReLU
fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=32)
act2 <- mx.symbol.Activation(fc2, name="relu2", act_type="relu")
# output layer: 6 units, one per activity class, with softmax output
fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=6)
softmax <- mx.symbol.SoftmaxOutput(fc3, name="sm")
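To make the size of this network concrete, we can count its trainable parameters by hand. Each fully connected layer has inputs x outputs weights plus one bias per output node; assuming the 66 input features selected above, the count works out as follows (an illustrative calculation, not part of the book's script):
n_in <- 66                 # number of mean()/std() features
p_fc1 <- n_in * 64 + 64    # 4,288 parameters in fc1
p_fc2 <- 64 * 32 + 32      # 2,080 parameters in fc2
p_fc3 <- 32 * 6 + 6        # 198 parameters in fc3
p_fc1 + p_fc2 + p_fc3      # 6,566 trainable parameters in total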
When you run the previous code, nothing actually executes. To train the model, we create a devices object to indicate where the code should run, on a CPU or a GPU. We then pass the symbol for the last layer (softmax) into the mx.model.FeedForward.create function. This function has other parameters, which are more properly known as hyper-parameters. These include the number of epochs (num.round), which controls how many times we pass through the data; the learning rate (learning.rate), which controls how much the weights are updated by the gradients during each pass; momentum (momentum), a hyper-parameter that can help the model train faster; and the weights initializer (initializer), which controls how the weights and biases for the nodes are initially set. We also pass in the evaluation metric (eval.metric), which is how the model is to be evaluated, and a callback function (epoch.end.callback), which is used to output progress information. When we run the function, it trains the model and outputs the progress as per the value we used for the epoch.end.callback parameter, namely every epoch:
devices <- mx.cpu()
mx.set.seed(0)
tic <- proc.time()
model <- mx.model.FeedForward.create(softmax, X = train.x, y = train.y,
                                     ctx = devices, num.round = 20,
                                     learning.rate = 0.08, momentum = 0.9,
                                     eval.metric = mx.metric.accuracy,
                                     initializer = mx.init.uniform(0.01),
                                     epoch.end.callback = mx.callback.log.train.metric(1))
Start training with 1 devices
[1] Train-accuracy=0.185581140350877
[2] Train-accuracy=0.26104525862069
[3] Train-accuracy=0.555091594827586
[4] Train-accuracy=0.519127155172414
[5] Train-accuracy=0.646551724137931
[6] Train-accuracy=0.733836206896552
[7] Train-accuracy=0.819100215517241
[8] Train-accuracy=0.881869612068966
[9] Train-accuracy=0.892780172413793
[10] Train-accuracy=0.908674568965517
[11] Train-accuracy=0.898572198275862
[12] Train-accuracy=0.896821120689655
[13] Train-accuracy=0.915544181034483
[14] Train-accuracy=0.928879310344828
[15] Train-accuracy=0.926993534482759
[16] Train-accuracy=0.934401939655172
[17] Train-accuracy=0.933728448275862
[18] Train-accuracy=0.934132543103448
[19] Train-accuracy=0.933324353448276
[20] Train-accuracy=0.934132543103448
print(proc.time() - tic)
user system elapsed
7.31 3.03 4.31
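The log above reports only training accuracy. If you would also like to monitor accuracy on held-out data at the end of each epoch, mx.model.FeedForward.create accepts an eval.data argument; the following sketch (reusing the same symbols and data as above, with the test set standing in for a proper validation split) shows the idea:
# optional: also evaluate on a held-out set after every epoch
model <- mx.model.FeedForward.create(softmax, X = train.x, y = train.y,
                                     ctx = devices, num.round = 20,
                                     learning.rate = 0.08, momentum = 0.9,
                                     eval.metric = mx.metric.accuracy,
                                     initializer = mx.init.uniform(0.01),
                                     eval.data = list(data = test.x, label = test.y),
                                     epoch.end.callback = mx.callback.log.train.metric(1))
With eval.data supplied, each epoch should also log an accuracy figure for the held-out data. In practice, you would reserve a separate validation split for this and keep the test set for the final evaluation only.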
Now that we have trained our model, let's see how it does on the test set:
# predict class probabilities for the test set (one column per sample)
preds1 <- predict(model, test.x)
# take the most probable class (columns 1-6) and subtract 1 to match the 0-5 labels
pred.label <- max.col(t(preds1)) - 1
# confusion matrix of actual versus predicted classes
t <- table(data.frame(cbind(test.y, pred.label)),
           dnn=c("Actual", "Predicted"))
acc <- round(100.0*sum(diag(t))/length(test.y), 2)
print(t)
       Predicted
Actual    0   1   2   3   4   5
     0  477  15   4   0   0   0
     1  108 359   4   0   0   0
     2   13  42 365   0   0   0
     3    0   0   0 454  37   0
     4    0   0   0 141 391   0
     5    0   0   0  16   0 521
print(sprintf(" Deep Learning Model accuracy = %1.2f%%",acc))
[1] " Deep Learning Model accuracy = 87.11%"
Not bad! We have achieved an accuracy of 87.11% on our test set.
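The confusion matrix shows that most of the remaining errors are between related activities, for example classes 3 and 4, and classes 0 and 1. As an illustrative follow-up (not part of the original script), per-class recall can be read off the confusion matrix t directly:
# per-class recall: correct predictions divided by the number of actual cases
round(diag(t) / rowSums(t), 3)
Classes 1 and 4 have noticeably lower recall than the others, which matches the large off-diagonal counts in their rows.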