What this book covers

Here is a list of changes from the first edition by chapter:

Chapter 1, A Process for Success, has the flowchart redone to correct a typo and add additional methodologies.

Chapter 2, Linear Regression – the Blocking and Tackling of Machine Learning, has the code improved, and better charts have been provided; other than that, it remains relatively close to the original.

Chapter 3, Logistic Regression and Discriminant Analysis, has the code improved and streamlined. One of my favorite techniques, multivariate adaptive regression splines, has been added; it performs well, handles non-linearity, and is easy to explain. It is my base model, with others becoming "challengers" that try to outperform it.

Chapter 4, Advanced Feature Selection in Linear Models, now includes techniques not only for regression but also for a classification problem.

Chapter 5, More Classification Techniques – K-Nearest Neighbors and Support Vector Machines, has the code streamlined and simplified.

Chapter 6, Classification and Regression Trees, adds the very popular techniques provided by the XGBoost package. I also added the technique of using random forest as a feature selection tool.

Chapter 7, Neural Networks and Deep Learning, has been updated with additional information on deep learning methods and has improved code for the H2O package, including hyper-parameter search.

Chapter 8, Cluster Analysis, adds a methodology for unsupervised learning with random forests.

Chapter 9, Principal Components Analysis, uses a different dataset, and an out-of-sample prediction has been added.

Chapter 10, Market Basket Analysis, Recommendation Engines, and Sequential Analysis, adds sequential analysis, which I am finding to be increasingly important, especially in marketing.

Chapter 11, Creating Ensembles and Multiclass Classification, has completely new content, using several great packages.

Chapter 12, Time Series and Causality, has a couple of additional years of climate data added, along with a demonstration of different methods of causality testing.

Chapter 13, Text Mining, has additional data and improved code.

Chapter 14, R on the Cloud, is another chapter of new content, showing you how to get R running on the cloud simply and quickly.

Appendix A, R Fundamentals, has additional data manipulation methods.

Appendix B, Sources, has a list of sources and references.
