Datasets and modeling

We're going to use two of the prior datasets: the simulated data from Chapter 4, Advanced Feature Selection in Linear Models, and the customer satisfaction data from Chapter 3, Logistic Regression. We'll start by building a classification tree on the simulated data, which will help us understand the basic principles of tree-based methods. Then, we'll move on to random forest and boosted trees applied to the customer satisfaction data. This exercise will provide an excellent comparison with the generalized linear models from before. Finally, I want to show you an interesting feature selection method that uses random forest, applied to the simulated data. By interesting, I mean it's a valuable technique to add to your feature selection arsenal, but I'll point out a couple of caveats for you to consider in practical application.