Advanced Feature Selection in Linear Models

"I found that math got to be too abstract for my liking and computer science seemed concerned with little details--trying to save a microsecond or a kilobyte in a computation. In statistics I found a subject that combined the beauty of both math and computer science, using them to solve real-world problems."

This quote is from Rob Tibshirani, Professor, Stanford University, at:

https://statweb.stanford.edu/~tibs/research_page.html.

So far, we've examined the use of linear models for both quantitative and qualitative outcomes, with an emphasis on feature selection, that is, the techniques used to exclude useless or unwanted predictor variables. We saw that linear models can be quite effective in machine learning problems. However, newer techniques developed and refined over the last couple of decades can improve predictive ability and interpretability beyond the linear models that we discussed in the preceding chapters. Today, many datasets have a large number of features relative to the number of observations, a condition known as high dimensionality. If you've ever worked on a genomics problem, this will quickly become self-evident. Additionally, with the size of the data that we are being asked to work with, a technique like best subsets or stepwise feature selection can take an inordinate amount of time to run, even on high-speed computers (best subsets, for example, must consider up to 2^p candidate models for p features). I'm not talking about minutes: in many cases, hours of system time are required to get a best subsets solution.

There is a better way in these cases. In this chapter, we will look at the concept of regularization, where the coefficients are constrained or shrunk towards zero. There are many regularization methods and variations on them, but we will focus on ridge regression, the Least Absolute Shrinkage and Selection Operator (LASSO), and finally, elastic net, which combines the benefits of both techniques.
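
To make "constrained or shrunk towards zero" concrete, the three methods can be sketched as penalized least-squares problems. This is a generic textbook formulation rather than the notation of any particular software package; the symbols used here (the response y_i, predictors x_ij, coefficients beta_j, and the tuning parameters lambda, lambda_1, and lambda_2) are assumptions introduced for illustration.

```latex
\begin{aligned}
\text{RSS}(\beta) &= \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^2 \\[4pt]
\text{Ridge:} \quad \hat{\beta} &= \arg\min_{\beta} \; \text{RSS}(\beta) + \lambda \sum_{j=1}^{p} \beta_j^2 \\[4pt]
\text{LASSO:} \quad \hat{\beta} &= \arg\min_{\beta} \; \text{RSS}(\beta) + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \\[4pt]
\text{Elastic net:} \quad \hat{\beta} &= \arg\min_{\beta} \; \text{RSS}(\beta) + \lambda_1 \sum_{j=1}^{p} \lvert \beta_j \rvert + \lambda_2 \sum_{j=1}^{p} \beta_j^2
\end{aligned}
```

With every tuning parameter set to zero, each problem reduces to ordinary least squares. Larger values pull the coefficients towards zero; the absolute-value penalty can set some of them exactly to zero, which is what allows LASSO and elastic net to act as feature selectors.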
