
Summary

In this chapter, we looked at using probabilistic linear models to predict a qualitative response with two generalized linear model methods: logistic regression and multivariate adaptive regression splines (MARS). We explored weight of evidence (WOE) and information value (IV) as a technique for univariate feature selection. We covered how to find the probability threshold that minimizes classification error. Additionally, we began using performance metrics such as AUC and log-loss, along with ROC charts, to compare models both visually and statistically. These metrics proved more informative than accuracy alone, especially when class labels are highly imbalanced. In the next chapter, we'll cover regularization methods for feature selection and how they can be used in training your algorithms. We'll see how to create a dataset, learn about ridge regression, and dive deeper into feature selection.
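As a reminder of what the threshold search and log-loss involve, here is a minimal Python sketch. It is not the chapter's own code: the function names, the toy labels and probabilities, and the grid of candidate cutoffs are all assumptions made up for illustration.

```python
import math

def classification_error(y_true, probs, threshold):
    """Fraction of observations misclassified at the given cutoff."""
    wrong = sum((p >= threshold) != bool(y) for y, p in zip(y_true, probs))
    return wrong / len(y_true)

def best_threshold(y_true, probs, candidates):
    """Return the first candidate cutoff with the lowest classification error."""
    return min(candidates, key=lambda t: classification_error(y_true, probs, t))

def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-likelihood of the labels under the predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Hypothetical toy data: 6 negatives, 3 positives, with predicted probabilities.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1]
probs = [0.07, 0.12, 0.22, 0.31, 0.36, 0.47, 0.42, 0.48, 0.83]

# Hypothetical grid of cutoffs: 0.05, 0.10, ..., 0.95.
candidates = [i / 100 for i in range(5, 100, 5)]
cutoff = best_threshold(y_true, probs, candidates)

print("best cutoff:", cutoff)
print("error at best cutoff:", classification_error(y_true, probs, cutoff))
print("error at 0.5:", classification_error(y_true, probs, 0.5))
print("log-loss:", log_loss(y_true, probs))
```

On this toy data the default cutoff of 0.5 is not the one with the lowest error, which is the point of searching over thresholds. Log-loss, by contrast, scores the probabilities directly and does not depend on any cutoff.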
