ROC test

The ROC curve is an important improvement on single-threshold measures of model performance such as the false positive and true negative rates. For a detailed explanation, refer to Chapter 9 of Tattar et al. (2017). The ROC curve plots the true positive rate against the false positive rate as the classification threshold varies, and the area under the curve (AUC) summarizes the performance of the fitted model in a single number.
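To make the construction concrete, the following is a minimal base R sketch that traces the ROC points for a small set of scores and computes the AUC in two ways. The `labels` and `scores` vectors are hypothetical toy data, not taken from the hypothyroid example, and the scores are assumed to contain no ties.

```r
# Hypothetical toy data: 1 = positive class, 0 = negative class
labels <- c(1, 1, 0, 1, 0, 0, 1, 0)
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.35, 0.1)

n_pos <- sum(labels == 1)
n_neg <- sum(labels == 0)

# Lower the threshold one observation at a time (highest score first)
# and record the true and false positive rates after each step
ord <- order(scores, decreasing = TRUE)
tpr <- c(0, cumsum(labels[ord] == 1) / n_pos)
fpr <- c(0, cumsum(labels[ord] == 0) / n_neg)

# AUC by the trapezoidal rule over the ROC points
auc_trap <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

# AUC by the rank (Mann-Whitney) formula: the proportion of
# positive-negative pairs in which the positive has the higher score
r <- rank(scores)
auc_rank <- (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(c(trapezoid = auc_trap, rank = auc_rank))  # both are 0.75 here
```

The two values agree because, without ties, the area under the empirical ROC curve equals the proportion of positive-negative pairs that the scores order correctly.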

The main goal of the ROC test is the following. Suppose that Model 1 gives an AUC of 0.89 and Model 2 gives 0.91. Using the simple AUC criterion, we would outright conclude that Model 2 is better than Model 1. However, an important question is whether 0.91 is significantly higher than 0.89. The roc.test function, from the pROC R package, provides the answer here. For the neural network and classification tree, the following R segment gives the required answer:

> library(pROC)
> HT_NN_Prob <- predict(NN_fit,newdata=HT2_TestX,type="raw")
> HT_NN_roc <- roc(HT2_TestY,c(HT_NN_Prob))
> HT_NN_roc$auc
Area under the curve: 0.9723826
> HT_CT_Prob <- predict(CT_fit,newdata=HT2_TestX,type="prob")[,2]
> HT_CT_roc <- roc(HT2_TestY,HT_CT_Prob)
> HT_CT_roc$auc
Area under the curve: 0.9598765
> roc.test(HT_NN_roc,HT_CT_roc)
        DeLong's test for two correlated ROC curves
data:  HT_NN_roc and HT_CT_roc
Z = 0.72452214, p-value = 0.4687452
alternative hypothesis: true difference in AUC is not equal to 0
sample estimates:
 AUC of roc1  AUC of roc2 
0.9723825557 0.9598765432 

Since the p-value (about 0.47) is far above the usual 0.05 significance level, we cannot reject the hypothesis of equal AUCs, and we conclude that the AUCs of the two models are not significantly different.
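As a quick check on the reported output, the p-value can be recovered from the Z statistic alone. DeLong's statistic is treated as approximately standard normal under the null hypothesis of equal AUCs, so the two-sided p-value is twice the normal tail area beyond |Z|. The sketch below uses only base R; the 0.05 significance level `alpha` is an assumed conventional choice, not something fixed by the test.

```r
# Z statistic copied from the roc.test output above
z <- 0.72452214

# Two-sided p-value under the standard normal reference distribution
p_two_sided <- 2 * pnorm(-abs(z))
print(p_two_sided)

# Assumed significance level; FALSE means we do not reject equal AUCs
alpha <- 0.05
reject_equal_auc <- p_two_sided < alpha
print(reject_equal_auc)
```

The computed value agrees with the p-value printed by roc.test, and the comparison with `alpha` reproduces the conclusion that the difference in AUC is not significant.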

Statistical tests are vital and we recommend that they be used whenever suitable. The concepts highlighted in this chapter will be drawn on in more detail in the rest of the book.