Deep Learning with R for Beginners
Mark Hodnett, Joshua F. Wiley, Yuxi (Hayden) Liu, Pablo Maldonado
The regression model
The previous section developed a deep learning model for a binary classification task; this section develops a deep learning model to predict a continuous numeric value, that is, a regression task. We use the same dataset as for the binary classification task, but with a different target column. In that task, we wanted to predict whether a customer would return to our stores in the next 14 days; in this task, we want to predict how much a customer will spend in our stores in the next 14 days. We follow a similar process: we load the dataset and prepare it by applying a log transformation to the data. The code is in Chapter4/regression.R:
library(readr)   # provides read_csv() and the cols() column specification

set.seed(42)
fileName <- "../dunnhumby/predict.csv"
dfData <- read_csv(fileName,
                   col_types = cols(
                     .default = col_double(),
                     CUST_CODE = col_character(),
                     Y_categ = col_integer())
                   )
nobs <- nrow(dfData)
# hold out 10% of the rows as a test set
train <- sample(nobs, 0.9*nobs)
test <- setdiff(seq_len(nobs), train)
# predictors are every column except the customer code and the two target columns
predictorCols <- colnames(dfData)[!(colnames(dfData) %in% c("CUST_CODE","Y_numeric","Y_categ"))]
# log-transform the predictors and the target; the 0.01 offset avoids log(0)
dfData[, c("Y_numeric",predictorCols)] <- log(0.01+dfData[, c("Y_numeric",predictorCols)])
trainData <- dfData[train, c(predictorCols,"Y_numeric")]
testData <- dfData[test, c(predictorCols,"Y_numeric")]
xtrain <- model.matrix(Y_numeric~.,trainData)
xtest <- model.matrix(Y_numeric~.,testData)
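Before modeling, a quick look at the prepared data can catch problems early. This check is our own addition and not part of the book's script:

# confirm the 90/10 split and the effect of the log transform
length(train); length(test)
range(trainData$Y_numeric)   # values are on the log scale, so negatives are expected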
We then fit a linear regression with lm to create a benchmark before building the deep learning model:
# lm Regression Model
# lm regression model, used as a benchmark
regModel1 <- lm(Y_numeric ~ ., data=trainData)
pr1 <- predict(regModel1, testData)
# undo the log transform with exp() so the metrics are in the original units
rmse <- sqrt(mean((exp(pr1)-exp(testData$Y_numeric))^2))
print(sprintf(" Regression RMSE = %1.2f",rmse))
[1] " Regression RMSE = 29.30"
mae <- mean(abs(exp(pr1)-exp(testData$Y_numeric)))
print(sprintf(" Regression MAE = %1.2f",mae))
[1] " Regression MAE = 13.89"
We output two metrics, RMSE and MAE, for our regression task; we covered these earlier in the chapter. Mean absolute error (MAE) measures the average absolute difference between the predicted and actual values. Root mean squared error (RMSE) squares the differences between the predicted and actual values before averaging, so one big error costs more than several small errors with the same total; the formulas below make this precise.
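For reference, with actual values $y_i$, predictions $\hat{y}_i$, and $n$ test observations, the two metrics are defined as:

\[
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2},
\qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|
\]

Now let's look at the deep learning regression code. First, we load the data and define the model: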
require(mxnet)
Loading required package: mxnet
# MXNet expects matrices
train_X <- data.matrix(trainData[, predictorCols])
test_X <- data.matrix(testData[, predictorCols])
train_Y <- trainData$Y_numeric
set.seed(42)
# hyper-parameters
num_hidden <- c(256,128,128,64)  # nodes in each of the four hidden layers
drop_out <- c(0.4,0.4,0.4,0.4)   # dropout rate after each hidden layer
wd <- 0.00001                    # weight decay (L2 regularization)
lr <- 0.0002                     # learning rate
num_epochs <- 100
activ <- "tanh"
# create our model architecture
# using the hyper-parameters defined above
data <- mx.symbol.Variable("data")
fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=num_hidden[1])
act1 <- mx.symbol.Activation(fc1, name="activ1", act_type=activ)
drop1 <- mx.symbol.Dropout(data=act1,p=drop_out[1])
fc2 <- mx.symbol.FullyConnected(drop1, name="fc2", num_hidden=num_hidden[2])
act2 <- mx.symbol.Activation(fc2, name="activ2", act_type=activ)
drop2 <- mx.symbol.Dropout(data=act2,p=drop_out[2])
fc3 <- mx.symbol.FullyConnected(drop2, name="fc3", num_hidden=num_hidden[3])
act3 <- mx.symbol.Activation(fc3, name="activ3", act_type=activ)
drop3 <- mx.symbol.Dropout(data=act3,p=drop_out[3])
fc4 <- mx.symbol.FullyConnected(drop3, name="fc4", num_hidden=num_hidden[4])
act4 <- mx.symbol.Activation(fc4, name="activ4", act_type=activ)
drop4 <- mx.symbol.Dropout(data=act4,p=drop_out[4])
# a single output node with a linear regression output layer (squared loss),
# rather than the softmax output used for the classification task
fc5 <- mx.symbol.FullyConnected(drop4, name="fc5", num_hidden=1)
lro <- mx.symbol.LinearRegressionOutput(fc5)
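Before training, it can be worth a quick visual check that the layer stack is what we intended. A minimal sketch, assuming your installation of mxnet exposes graph.viz (rendering depends on the DiagrammeR package):

# optional: render the symbol graph to inspect the architecture
graph.viz(lro)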
Now we train the model; note that the first comment shows how to switch to using a GPU instead of a CPU:
# run on cpu, change to 'devices <- mx.gpu()'
# if you have a suitable GPU card
devices <- mx.cpu()
mx.set.seed(0)
tic <- proc.time()
# This actually trains the model
model <- mx.model.FeedForward.create(lro, X = train_X, y = train_Y,
ctx = devices,num.round = num_epochs,
learning.rate = lr, momentum = 0.9,
eval.metric = mx.metric.rmse,
initializer = mx.init.uniform(0.1),
wd=wd,
epoch.end.callback = mx.callback.log.train.metric(1))
print(proc.time() - tic)
user system elapsed
13.90 1.82 10.50
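The training call above only reports the training RMSE. If you also want to track performance on held-out data after each epoch, mx.model.FeedForward.create accepts an eval.data argument; a minimal variant of the same call (our addition, not in the original script) looks like this:

# same training call, but also reporting RMSE on the test set each epoch
test_Y <- testData$Y_numeric
model <- mx.model.FeedForward.create(lro, X = train_X, y = train_Y,
                                     ctx = devices, num.round = num_epochs,
                                     learning.rate = lr, momentum = 0.9,
                                     eval.metric = mx.metric.rmse,
                                     eval.data = list(data = test_X, label = test_Y),
                                     initializer = mx.init.uniform(0.1),
                                     wd = wd,
                                     epoch.end.callback = mx.callback.log.train.metric(1))

With that option, the per-epoch log shows both train and validation RMSE (on the log scale used for training).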
# predict returns a 1 x n matrix, so take the first row to get a vector
pr4 <- predict(model, test_X)[1,]
rmse <- sqrt(mean((exp(pr4)-exp(testData$Y_numeric))^2))
print(sprintf(" Deep Learning Regression RMSE = %1.2f",rmse))
[1] " Deep Learning Regression RMSE = 28.92"
mae <- mean(abs(exp(pr4)-exp(testData$Y_numeric)))
print(sprintf(" Deep Learning Regression MAE = %1.2f",mae))
[1] " Deep Learning Regression MAE = 14.33"
rm(data,fc1,act1,drop1,fc2,act2,drop2,fc3,act3,drop3,fc4,act4,drop4,fc5,lro,model)
For regression metrics, lower is better, so our RMSE on the deep learning model (28.92) is an improvement on the benchmark regression model (29.30). Interestingly, the MAE of the deep learning model (14.33) is actually worse than that of the regression model (13.89). Since RMSE penalizes big differences between actual and predicted values more heavily, this indicates that the errors in the deep learning model are less extreme than those in the regression model.
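To see why this combination is possible, here is a toy illustration (our own, not from the book): two error vectors with the same MAE can have very different RMSE values when one of them contains a single large error:

# two sets of errors with the same mean absolute error
errs_even <- c(5, 5, 5, 5)      # four moderate errors
errs_spiky <- c(0, 0, 0, 20)    # one large error
mean(abs(errs_even)); mean(abs(errs_spiky))          # MAE: 5 and 5
sqrt(mean(errs_even^2)); sqrt(mean(errs_spiky^2))    # RMSE: 5 and 10

The spiky errors double the RMSE while leaving the MAE unchanged, which is the pattern we see reversed above: the deep learning model trades a slightly worse MAE for fewer extreme misses.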