
Building a deep learning model

Now that we have covered the basics, let's build our first true deep learning model! We will use the UCI HAR dataset that we used in Chapter 2, Training a Prediction Model. The following code does some data preparation: it loads the data and selects only the columns that store mean and standard deviation values (those that contain mean() or std() in the column name). The y values range from 1 to 6; we subtract one so that the range is 0 to 5. The code for this section is in Chapter4/uci_har.R. It requires the UCI HAR dataset to be in the data folder; download it from https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones and unzip it into the data folder:

train.x <- read.table("../data/UCI HAR Dataset/train/X_train.txt")
train.y <- read.table("../data/UCI HAR Dataset/train/y_train.txt")[[1]]
test.x <- read.table("../data/UCI HAR Dataset/test/X_test.txt")
test.y <- read.table("../data/UCI HAR Dataset/test/y_test.txt")[[1]]
features <- read.table("../data/UCI HAR Dataset/features.txt")
meanSD <- grep("mean\\(\\)|std\\(\\)", features[, 2])
train.y <- train.y-1
test.y <- test.y-1
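Before going further, a quick sanity check (not part of the original script) confirms that the labels now run from 0 to 5 and that both splits contain all six classes:

# distribution of the activity labels after subtracting one
table(train.y)
table(test.y)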

Next, we transpose the data and convert it into a matrix. MXNet expects the input in features x samples format (one column per observation) rather than the usual samples x features layout:

train.x <- t(train.x[,meanSD])
test.x <- t(test.x[,meanSD])
train.x <- data.matrix(train.x)
test.x <- data.matrix(test.x)
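A quick dim() call (again just a check, not part of the script) makes the expected layout explicit; after the transpose, each row of the matrix is one feature and each column is one observation:

dim(train.x)  # rows = features, columns = samples
dim(test.x)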

The next step is to define the computation graph. We create a placeholder for the data, followed by two fully connected (or dense) layers, each with a relu activation. The first layer has 64 nodes and the second has 32 nodes. A final fully connected layer has six nodes, one for each distinct class in our y variable, and a softmax activation converts the outputs of those six nodes into probabilities for each class:

data <- mx.symbol.Variable("data")
fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=64)
act1 <- mx.symbol.Activation(fc1, name="relu1", act_type="relu")
fc2 <- mx.symbol.FullyConnected(act1, name="fc2", num_hidden=32)
act2 <- mx.symbol.Activation(fc2, name="relu2", act_type="relu")
fc3 <- mx.symbol.FullyConnected(act2, name="fc3", num_hidden=6)
softmax <- mx.symbol.SoftmaxOutput(fc3, name="sm")
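At this point softmax is only a symbolic description of the network; no computation has happened. If you want to inspect the graph, the R mxnet package exposes helpers such as arguments() and mx.symbol.infer.shape() for this; the snippet below is a sketch, with 100 used as a purely illustrative batch size:

# list the parameters the graph will create (weights and biases for fc1-fc3)
arguments(softmax)

# infer parameter shapes from the input shape: features x batch
mx.symbol.infer.shape(softmax, data = c(length(meanSD), 100))$arg.shapes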

When you run the previous code, nothing actually executes. To train the model, we first create a devices object to indicate where the code should run: the CPU or the GPU. Then we pass the symbol for the last layer (softmax) into the mx.model.FeedForward.create function. This function has other parameters, which are more properly known as hyper-parameters. These include the number of epochs (num.round), which controls how many times we pass through the data; the learning rate (learning.rate), which controls how much the weights are updated during each pass; the momentum (momentum), a hyper-parameter that can help the model train faster; and the weights initializer (initializer), which controls how the weights and biases for the nodes are initially set. We also pass in the evaluation metric (eval.metric), which is how the model is to be evaluated, and a callback function (epoch.end.callback), which is used to output progress information. When we run the function, it trains the model and outputs the progress as per the value we used for the epoch.end.callback parameter, namely every epoch:

devices <- mx.cpu()
mx.set.seed(0)
tic <- proc.time()
model <- mx.model.FeedForward.create(softmax, X = train.x, y = train.y,
                                     ctx = devices, num.round = 20,
                                     learning.rate = 0.08, momentum = 0.9,
                                     eval.metric = mx.metric.accuracy,
                                     initializer = mx.init.uniform(0.01),
                                     epoch.end.callback =
                                       mx.callback.log.train.metric(1))
Start training with 1 devices
[1] Train-accuracy=0.185581140350877
[2] Train-accuracy=0.26104525862069
[3] Train-accuracy=0.555091594827586
[4] Train-accuracy=0.519127155172414
[5] Train-accuracy=0.646551724137931
[6] Train-accuracy=0.733836206896552
[7] Train-accuracy=0.819100215517241
[8] Train-accuracy=0.881869612068966
[9] Train-accuracy=0.892780172413793
[10] Train-accuracy=0.908674568965517
[11] Train-accuracy=0.898572198275862
[12] Train-accuracy=0.896821120689655
[13] Train-accuracy=0.915544181034483
[14] Train-accuracy=0.928879310344828
[15] Train-accuracy=0.926993534482759
[16] Train-accuracy=0.934401939655172
[17] Train-accuracy=0.933728448275862
[18] Train-accuracy=0.934132543103448
[19] Train-accuracy=0.933324353448276
[20] Train-accuracy=0.934132543103448
print(proc.time() - tic)
   user  system elapsed
   7.31    3.03    4.31
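As mentioned above, the devices object controls whether training runs on the CPU or the GPU. If you have a GPU-enabled build of MXNet installed, the only change needed is the context; the rest of the call stays the same (a sketch, assuming a GPU build is available):

# requires a GPU-enabled build of the mxnet package
devices <- mx.gpu(0)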

Now that we have trained our model, let's see how it does on the test set:


preds1 <- predict(model, test.x)
pred.label <- max.col(t(preds1)) - 1
t <- table(data.frame(cbind(test.y, pred.label)),
           dnn = c("Actual", "Predicted"))
acc <- round(100.0 * sum(diag(t)) / length(test.y), 2)
print(t)
      Predicted
Actual   0   1   2   3   4   5
     0 477  15   4   0   0   0
     1 108 359   4   0   0   0
     2  13  42 365   0   0   0
     3   0   0   0 454  37   0
     4   0   0   0 141 391   0
     5   0   0   0  16   0 521
print(sprintf(" Deep Learning Model accuracy = %1.2f%%",acc))
[1] " Deep Learning Model accuracy = 87.11%"

Not bad! We have achieved an accuracy of 87.11% on our test set.
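The confusion matrix shows that the errors are concentrated between a few pairs of classes; for example, 141 samples with an actual label of 4 are predicted as class 3, and 108 samples of class 1 are predicted as class 0. A quick follow-up (not in the original script) is to compute per-class accuracy from the table we just built:

# per-class accuracy: diagonal (correct) divided by the row totals (actual counts)
round(100 * diag(t) / rowSums(t), 2)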

Wait, where are the backward propagation, derivatives, and so on, that we covered in previous chapters? The answer is that deep learning libraries largely manage this for you automatically. In MXNet, automatic differentiation is provided by the autograd package, which differentiates a graph of operations using the chain rule. It is one less thing to worry about when building deep learning models. For more information, go to https://mxnet.incubator.apache.org/tutorials/gluon/autograd.html.
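For this network, the gradient that automatic differentiation computes for the first layer's weights is simply a product of local derivatives. Writing $z_i$ for the pre-activation of layer fci, $a_i$ for its relu output, and $L$ for the softmax cross-entropy loss (notation introduced here purely for illustration), the chain rule gives:

$$
\frac{\partial L}{\partial W_1}
= \frac{\partial L}{\partial z_3}\,
  \frac{\partial z_3}{\partial a_2}\,
  \frac{\partial a_2}{\partial z_2}\,
  \frac{\partial z_2}{\partial a_1}\,
  \frac{\partial a_1}{\partial z_1}\,
  \frac{\partial z_1}{\partial W_1}
$$

Each factor is cheap to evaluate, and the library composes them for us during the backward pass.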