
Building a deep learning model

Now that we have covered the basics, let's look at building our first true deep learning model! We will use the UCI HAR dataset that we used in Chapter 2, Training a Prediction Model. The following code does some data preparation: it loads the data and selects only the columns that store mean and standard deviation values (those with mean() or std() in the column name). The y variables range from 1 to 6; we subtract one so that the range is 0 to 5. The code for this section is in Chapter4/uci_har.R. It requires the UCI HAR dataset to be in the data folder; download it from https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones and unzip it into the data folder:

train.x <- read.table("../data/UCI HAR Dataset/train/X_train.txt")
train.y <- read.table("../data/UCI HAR Dataset/train/y_train.txt")[[1]]
test.x <- read.table("../data/UCI HAR Dataset/test/X_test.txt")
test.y <- read.table("../data/UCI HAR Dataset/test/y_test.txt")[[1]]
features <- read.table("../data/UCI HAR Dataset/features.txt")
meanSD <- grep("mean\\(\\)|std\\(\\)", features[, 2])
train.y <- train.y-1
test.y <- test.y-1

Next, we will transpose the data and convert it into a matrix. MXNet's R interface expects each column to be one observation and each row to be a feature, which is the opposite of the usual R layout, so we transpose before converting:

train.x <- t(train.x[,meanSD])
test.x <- t(test.x[,meanSD])
train.x <- data.matrix(train.x)
test.x <- data.matrix(test.x)
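
Before moving on, it is worth a quick sanity check that the orientation and labels are what we expect. These optional lines use base R only; after the transpose the matrices should have 66 rows (one per selected mean()/std() feature) and one column per observation, and the labels should run from 0 to 5:

# quick sanity checks: feature rows x observation columns, labels 0-5
dim(train.x)
dim(test.x)
table(train.y)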

The next step is to define the computation graph. We create a placeholder for the data and add two fully connected (or dense) layers, each followed by a ReLU activation. The first layer has 64 nodes and the second has 32 nodes. We then add a final fully connected layer with six nodes, the number of distinct classes in our y variable, and apply a softmax activation to convert the outputs of those six nodes into probabilities for each class:

data <- mx.symbol.Variable("data")
fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=64)
act1 <- mx.symbol.Activation(fc1, name="relu1", act_type="relu")
fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=32)
act2 <- mx.symbol.Activation(fc2, name="relu2", act_type="relu")
fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=6)
softmax <- mx.symbol.SoftmaxOutput(fc3, name="sm")
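
Before training, it can be useful to check that the graph is wired as intended. The following optional lines assume that the graph.viz and mx.symbol.infer.shape helpers are available in your build of the mxnet package, and that we feed a batch of 100 observations with the 66 selected features:

# optional: plot the network and infer the shape of each layer's output
# (assumes graph.viz and mx.symbol.infer.shape are available in your mxnet build)
graph.viz(softmax)
mx.symbol.infer.shape(softmax, data = c(66, 100))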

When you run the previous code, nothing actually executes. To train the model, we first create a devices object to indicate where the code should run, CPU or GPU. We then pass the symbol for the last layer (softmax) into the mx.model.FeedForward.create function. This function takes other parameters, more properly known as hyper-parameters: the number of epochs (num.round), which controls how many times we pass through the data; the learning rate (learning.rate), which controls how much the weights are updated on each pass; momentum (momentum), which can help the model train faster; and the weights initializer (initializer), which controls how the weights and biases of the nodes are initially set. We also pass in the evaluation metric (eval.metric), which is how the model is evaluated, and a callback function (epoch.end.callback), which is used to output progress information. When we run the function, it trains the model and outputs progress according to the value we used for the epoch.end.callback parameter, namely every epoch:

devices <- mx.cpu()
mx.set.seed(0)
tic <- proc.time()
model <- mx.model.FeedForward.create(softmax, X = train.x, y = train.y,
                                     ctx = devices, num.round = 20,
                                     learning.rate = 0.08, momentum = 0.9,
                                     eval.metric = mx.metric.accuracy,
                                     initializer = mx.init.uniform(0.01),
                                     epoch.end.callback =
                                       mx.callback.log.train.metric(1))
Start training with 1 devices
[1] Train-accuracy=0.185581140350877
[2] Train-accuracy=0.26104525862069
[3] Train-accuracy=0.555091594827586
[4] Train-accuracy=0.519127155172414
[5] Train-accuracy=0.646551724137931
[6] Train-accuracy=0.733836206896552
[7] Train-accuracy=0.819100215517241
[8] Train-accuracy=0.881869612068966
[9] Train-accuracy=0.892780172413793
[10] Train-accuracy=0.908674568965517
[11] Train-accuracy=0.898572198275862
[12] Train-accuracy=0.896821120689655
[13] Train-accuracy=0.915544181034483
[14] Train-accuracy=0.928879310344828
[15] Train-accuracy=0.926993534482759
[16] Train-accuracy=0.934401939655172
[17] Train-accuracy=0.933728448275862
[18] Train-accuracy=0.934132543103448
[19] Train-accuracy=0.933324353448276
[20] Train-accuracy=0.934132543103448
print(proc.time() - tic)
user system elapsed
7.31 3.03 4.31
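
The devices object above told MXNet to train on the CPU. If your installation of the mxnet package was built with GPU support, the same model can be trained on a GPU by changing only that object; this is a sketch and assumes a GPU-enabled build:

# train on the first GPU instead of the CPU (requires a GPU-enabled build of mxnet)
devices <- mx.gpu()
# then re-run mx.model.FeedForward.create with ctx = devices as before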

Now that we have trained our model, let's see how it does on the test set:


preds1 <- predict(model, test.x)
pred.label <- max.col(t(preds1)) - 1
t <- table(data.frame(cbind(test.y, pred.label)),
           dnn = c("Actual", "Predicted"))
acc <- round(100.0 * sum(diag(t)) / length(test.y), 2)
print(t)
      Predicted
Actual   0   1   2   3   4   5
     0 477  15   4   0   0   0
     1 108 359   4   0   0   0
     2  13  42 365   0   0   0
     3   0   0   0 454  37   0
     4   0   0   0 141 391   0
     5   0   0   0  16   0 521
print(sprintf(" Deep Learning Model accuracy = %1.2f%%",acc))
[1] " Deep Learning Model accuracy = 87.11%"

Not bad! We have achieved an accuracy of 87.11% on our test set.
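
The overall figure hides where the model struggles: reading the table, most errors are class 1 predicted as class 0 and class 4 predicted as class 3. Per-class recall makes this explicit; the following lines are a small optional check in base R, reusing the t and test.y objects created above:

# per-class recall: correct predictions on the diagonal divided by the actual count per class
per.class.recall <- round(100 * diag(t) / rowSums(t), 2)
print(per.class.recall)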

Wait, where are the backward propagation, derivatives, and so on that we covered in previous chapters? The answer is that deep learning libraries largely manage this for you. In MXNet, automatic differentiation is provided by the autograd package, which differentiates a graph of operations using the chain rule, so it is one less thing to worry about when building deep learning models. For more information, go to https://mxnet.incubator.apache.org/tutorials/gluon/autograd.html.
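
To see in miniature what automatic differentiation is doing, the following base R sketch pushes one value through relu(w*x + b) and a squared-error loss, applies the chain rule by hand to get the gradient with respect to w, and checks it against a finite-difference estimate. It is purely illustrative and does not use MXNet:

# toy example: gradient of a squared-error loss through relu(w*x + b), via the chain rule
x <- 1.5; y <- 2.0            # one input and its target
w <- 0.8; b <- 0.1            # parameters we want a gradient for

z <- w * x + b                # forward pass
a <- max(0, z)                # relu activation
loss <- (a - y)^2             # squared error

dloss_da <- 2 * (a - y)       # backward pass: chain rule, one step at a time
da_dz <- ifelse(z > 0, 1, 0)
dz_dw <- x
grad_w <- dloss_da * da_dz * dz_dw

# finite-difference check of the same gradient
eps <- 1e-6
loss_eps <- (max(0, (w + eps) * x + b) - y)^2
(loss_eps - loss) / eps       # should be close to grad_w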