
Summary

In this chapter, we looked at using probabilistic linear models to predict a qualitative response with two generalized linear model methods: logistic regression and multivariate adaptive regression splines (MARS). We explored weight of evidence (WOE) and information value (IV) as a technique for univariate feature selection. We covered how to find the probability threshold that minimizes classification error. We also began using performance metrics such as AUC and log-loss, along with ROC charts, to compare models both visually and statistically. These metrics proved more informative than accuracy alone, especially when class labels are highly imbalanced. In the next chapter, we'll cover regularization methods for feature selection and how they can be used when training your algorithms. We'll see how to create a dataset, then look at ridge regression and dive deeper into feature selection.
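As a brief reminder of the threshold idea recapped above, here is a minimal sketch in Python. It is not the chapter's own code: the function name `best_threshold`, the candidate grid of 0.01 to 0.99, and the toy labels and probabilities are all assumptions made for illustration. It sweeps candidate cutoffs and keeps the first one with the lowest misclassification rate.

```python
def best_threshold(y_true, y_prob, candidates=None):
    """Return (threshold, error_rate) with the lowest misclassification rate.

    y_true: iterable of 0/1 labels; y_prob: predicted probabilities of class 1.
    The candidate grid below is an assumption chosen for illustration.
    """
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]  # 0.01, 0.02, ..., 0.99
    best_t, best_err = None, float("inf")
    for t in candidates:
        # An observation is misclassified when the thresholded prediction
        # disagrees with the true label.
        errors = sum((p >= t) != bool(y) for y, p in zip(y_true, y_prob))
        err = errors / len(y_true)
        if err < best_err:  # strict '<' keeps the first (lowest) best cutoff
            best_t, best_err = t, err
    return best_t, best_err


# Toy, made-up data: imbalanced labels with overlapping scores.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.80]

threshold, error = best_threshold(y_true, y_prob)
print(f"threshold={threshold:.2f}, error={error:.3f}")
```

Minimizing raw error treats false positives and false negatives as equally costly. With highly imbalanced classes that is often not what you want, which is one reason AUC and log-loss are worth examining alongside it.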
