
Summary

In this chapter, we looked at using probabilistic linear models to predict a qualitative response with two methods: logistic regression and multivariate adaptive regression splines. We explored weight of evidence and information value as a technique for univariate feature selection. We covered how to find a probability threshold that minimizes classification error. Additionally, we began using performance metrics such as AUC and log-loss, along with ROC charts, to compare models both visually and statistically. These metrics proved more informative than accuracy alone, especially when class labels are highly imbalanced. In the next chapter, we'll cover regularization methods for feature selection and how they can be used in training your algorithms. We'll create a dataset, introduce ridge regression, and dive deeper into feature selection.
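
As a reminder of how weight of evidence (WOE) and information value (IV) are computed for a single binned feature, here is a minimal sketch in Python. It is not the code used in the chapter: the function name `woe_iv` and the toy data are hypothetical, and it assumes the convention WOE = ln(share of events / share of non-events) per bin, with IV as the sum over bins of the difference in shares times the WOE. Some references flip the ratio, which changes the sign of WOE but not the IV.

```python
import math
from collections import Counter

def woe_iv(bins, labels):
    """Return ({bin: WOE}, IV) for a binned feature and 0/1 labels.

    Assumed convention: WOE = ln(pct_events / pct_non_events) per bin.
    Every bin must contain at least one event and one non-event,
    otherwise the logarithm is undefined and a ValueError is raised.
    """
    events = Counter(b for b, y in zip(bins, labels) if y == 1)
    non_events = Counter(b for b, y in zip(bins, labels) if y == 0)
    total_events = sum(events.values())
    total_non_events = sum(non_events.values())

    woe = {}
    iv = 0.0
    for b in sorted(set(bins)):
        if events[b] == 0 or non_events[b] == 0:
            raise ValueError(f"bin {b!r} lacks events or non-events")
        pct_events = events[b] / total_events
        pct_non_events = non_events[b] / total_non_events
        woe[b] = math.log(pct_events / pct_non_events)
        iv += (pct_events - pct_non_events) * woe[b]
    return woe, iv

# Toy data (made up): bin "a" has 3 events and 1 non-event,
# bin "b" has 1 event and 3 non-events.
bins = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

woe, iv = woe_iv(bins, labels)
print(woe)  # WOE is ln(3) for "a" and -ln(3) for "b"
print(iv)   # IV equals ln(3), about 1.0986
```

A feature whose bins separate events from non-events more sharply yields a larger IV, which is why IV can be used to rank features one at a time.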
