How it works...

Model performance can be assessed with many metrics, such as accuracy, area under the ROC curve (AUC), misclassification error (%), misclassification error count, F1-score, precision, recall, and specificity. In this chapter, however, model performance is assessed using AUC.
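To make the threshold-based metrics listed above concrete, here is a minimal base R sketch that derives them from confusion-matrix counts. The `actual` and `predicted` vectors are hypothetical values invented for illustration; they are not taken from the occupancy data.

```r
# Hypothetical labels and predictions, invented for illustration only
actual    <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 0, 0, 0, 0, 1, 1)

# Confusion-matrix counts, treating 1 as the positive class
tp <- sum(predicted == 1 & actual == 1)  # true positives
fp <- sum(predicted == 1 & actual == 0)  # false positives
tn <- sum(predicted == 0 & actual == 0)  # true negatives
fn <- sum(predicted == 0 & actual == 1)  # false negatives

accuracy    <- (tp + tn) / length(actual)
error_count <- fp + fn                            # misclassification error count
error_pct   <- 100 * error_count / length(actual) # misclassification error (%)
precision   <- tp / (tp + fp)
recall      <- tp / (tp + fn)                     # also called sensitivity
specificity <- tn / (tn + fp)
f1          <- 2 * precision * recall / (precision + recall)

round(c(accuracy = accuracy, precision = precision, recall = recall,
        specificity = specificity, f1 = f1), 4)
```

Unlike these metrics, AUC does not depend on a single classification threshold, which is one reason it is a convenient summary for a probabilistic classifier.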

The following are the training and cross-validation AUC values of the trained model:

# Training accuracy (AUC)
> occupancy_train.glm@model$training_metrics@metrics$AUC
[1] 0.994583

# Cross validation accuracy (AUC)
> occupancy_train.glm@model$cross_validation_metrics@metrics$AUC
[1] 0.9945057

Now, let's assess the performance of the model on the test data. The following code predicts the outcome for the test data:

# Predict on test data
yhat <- h2o.predict(occupancy_train.glm, occupancy_test.hex)

Then, evaluate the AUC value based on the actual test outcome as follows:

# Test accuracy (AUC)
# pmax keeps the larger of the two class probabilities for each row
> yhat$pmax <- pmax(yhat$p0, yhat$p1, na.rm = TRUE)
> roc_obj <- pROC::roc(c(as.matrix(occupancy_test.hex$Occupancy)),
+                      c(as.matrix(yhat$pmax)))
> pROC::auc(roc_obj)
Area under the curve: 0.9915
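For intuition about what the AUC value measures, the following base R sketch computes it from ranks: AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. Note that this sketch ranks observations by the predicted probability of the positive class, which is the usual convention in ROC analysis, rather than by the `pmax` column used above. The `actual` and `p1` vectors and the `auc_rank` helper are hypothetical and introduced only for illustration.

```r
# Hypothetical outcomes and predicted probabilities of the positive class
actual <- c(0, 0, 1, 0, 1, 1, 0, 1)
p1     <- c(0.10, 0.35, 0.40, 0.45, 0.60, 0.80, 0.20, 0.90)

# AUC via the rank-sum (Mann-Whitney) formulation
auc_rank <- function(labels, scores) {
  r     <- rank(scores)            # average ranks are used for tied scores
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  # Rank sum of the positives, minus its minimum possible value,
  # divided by the number of positive/negative pairs
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_rank(actual, p1)
# [1] 0.9375
```

Here 15 of the 16 positive/negative pairs are ordered correctly, which gives 15/16 = 0.9375. H2O also exposes `h2o.performance()` and `h2o.auc()` for computing AUC on a new frame directly, which may be worth comparing against the pROC result.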

In H2O, one can also compute variable importance from the GLM model, as shown in the figure following this command:

# Plot variable importance for the top five features
h2o.varimp_plot(occupancy_train.glm, num_of_features = 5)
Variable importance using H2O
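As a rough illustration of the idea behind GLM variable importance, the sketch below ranks predictors by the magnitude of their coefficients after standardizing the inputs, which is how H2O's documentation describes variable importance for GLM. This is a base R approximation on simulated data, not H2O's implementation; the predictors `x1`, `x2`, `x3` and the simulated outcome are assumptions made purely for illustration.

```r
set.seed(42)

# Simulated predictors on very different scales (hypothetical data)
n  <- 500
x1 <- rnorm(n, mean = 20,  sd = 2)
x2 <- rnorm(n, mean = 400, sd = 100)
x3 <- rnorm(n)

# The outcome depends strongly on x2, weakly on x1, and not at all on x3
eta <- 0.3 * (x1 - 20) / 2 + 1.5 * (x2 - 400) / 100
y   <- rbinom(n, 1, plogis(eta))

# Standardize predictors so coefficient magnitudes are comparable
df  <- data.frame(y = y, scale(cbind(x1, x2, x3)))
fit <- glm(y ~ x1 + x2 + x3, data = df, family = binomial())

# Importance = absolute standardized coefficient, excluding the intercept
importance <- sort(abs(coef(fit)[-1]), decreasing = TRUE)

# Scale so the most important variable has importance 1
importance / max(importance)
```

Without standardization, a predictor measured in large units would receive a small coefficient regardless of its influence, so raw coefficient magnitudes would not be comparable across variables.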