- Advanced Machine Learning with R
- Cory Lesmeister Dr. Sunil Kumar Chinnamgari
- 2021-06-24 14:24:38
Model comparison
A useful tool for comparing classification models is the Receiver Operating Characteristic (ROC) chart. ROC analysis is a technique for visualizing, organizing, and selecting classifiers based on their performance (Fawcett, 2006). On the ROC chart, the y axis is the True Positive Rate (TPR), the proportion of actual positives that the model classifies as positive, and the x axis is the False Positive Rate (FPR), the proportion of actual negatives that the model classifies as positive.
To create a ROC chart in R, you can use the ROCR package. I think this is a great package, as it allows you to build a chart in just three lines of code. The package also has an excellent companion website (with examples and a presentation) that can be found at the following link: http://rocr.bioinf.mpi-sb.mpg.de/.
For each model, you create a prediction object from the predicted probabilities and the actual labels, then create a performance object that embeds TPR and FPR, and finally plot it:
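To make the two axes concrete, here is a minimal sketch in base R that computes TPR and FPR at a single probability cutoff. The labels, probabilities, and the helper name `rates_at` are made up for illustration and are not part of the chapter's data or of any package. Each cutoff yields one point on the ROC chart, and sweeping the cutoff traces the curve.

```r
# Toy data (hypothetical): true labels and predicted probabilities
actual <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
prob   <- c(0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05)

# Hypothetical helper: TPR and FPR when predicting "1" for prob >= cutoff
rates_at <- function(cutoff, actual, prob) {
  predicted <- as.integer(prob >= cutoff)
  tp <- sum(predicted == 1 & actual == 1)  # true positives
  fp <- sum(predicted == 1 & actual == 0)  # false positives
  c(tpr = tp / sum(actual == 1), fpr = fp / sum(actual == 0))
}

rates_05 <- rates_at(0.5, actual, prob)
rates_03 <- rates_at(0.3, actual, prob)
print(rates_05)
print(rates_03)
```

Lowering the cutoff from 0.5 to 0.3 captures more true positives but also admits more false positives, which is the trade-off the ROC chart displays.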
> pred.glm <- ROCR::prediction(glm_test_pred$one, test$y)
> perf.glm <- ROCR::performance(pred.glm, "tpr", "fpr")
> ROCR::plot(perf.glm, main = "ROC", col = 1)
That gives us the plot for the GLM (logistic regression). Now, we'll superimpose the MARS model on the same plot and create a legend:
> pred.earth <- ROCR::prediction(test_pred, test$y)
> perf.earth <- ROCR::performance(pred.earth, "tpr", "fpr")
> ROCR::plot(perf.earth, col = 2, add = TRUE)
> legend(0.6, 0.6, c("GLM", "MARS"), 1:2)
The output of the preceding code is as follows:

[Figure: ROC curves for the GLM and MARS models]
The area under each ROC curve corresponds to the previously calculated AUCs. The MARS model had a higher AUC; hence, its curve sits slightly higher than that of the GLM model. It's noteworthy that around a TPR of 0.5, they have almost the same FPR. The bottom line, though, is that the MARS model, with fewer input features, outperformed logistic regression, albeit just slightly.
For a problem such as this one, there are quite a few things we could do to increase performance. You could further explore the data to try to add custom features. You could also use more advanced methods, create more models for comparison, or even build several models and combine them into an ensemble. As for advanced techniques and building ensembles, we'll cover those in subsequent chapters. Let your imagination run wild!
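The link between the curve and the AUC can be checked by hand. The following is a sketch on made-up toy data, not the chapter's models: it sweeps the cutoff to get the ROC points, integrates them with the trapezoidal rule, and compares the result with the rank-based (Mann-Whitney) formula for AUC. All variable names here are illustrative. With ROCR itself, the same quantity should be available by requesting the "auc" measure from `performance()`.

```r
# Toy data (hypothetical): true labels and predicted probabilities
actual <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
prob   <- c(0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05)

n_pos <- sum(actual == 1)
n_neg <- sum(actual == 0)

# Sweep cutoffs from strictest to loosest; each one gives an (FPR, TPR) point
cutoffs <- c(Inf, sort(unique(prob), decreasing = TRUE))
tpr <- sapply(cutoffs, function(k) sum(prob >= k & actual == 1) / n_pos)
fpr <- sapply(cutoffs, function(k) sum(prob >= k & actual == 0) / n_neg)

# Area under the ROC points via the trapezoidal rule
auc_trapezoid <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

# Rank-based AUC: probability that a random positive outranks a random negative
auc_rank <- (sum(rank(prob)[actual == 1]) - n_pos * (n_pos + 1) / 2) /
  (n_pos * n_neg)

print(c(trapezoid = auc_trapezoid, rank = auc_rank))
```

Both calculations agree, which is why a curve that sits higher across the chart goes hand in hand with a larger AUC.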