
What this book covers

Here is a list of changes from the first edition by chapter:

Chapter 1, A Process for Success, has the flowchart redone to correct a typo and add further methodologies.

Chapter 2, Linear Regression – the Blocking and Tackling of Machine Learning, has the code improved, and better charts have been provided; other than that, it remains relatively close to the original.

Chapter 3, Logistic Regression and Discriminant Analysis, has the code improved and streamlined. One of my favorite techniques, multivariate adaptive regression splines, has been added; it performs well, handles non-linearity, and is easy to explain. It is my base model, with others becoming "challengers" that try to outperform it.

Chapter 4, Advanced Feature Selection in Linear Models, now includes techniques not only for regression but also for a classification problem.

Chapter 5, More Classification Techniques – K-Nearest Neighbors and Support Vector Machines, has the code streamlined and simplified.

Chapter 6, Classification and Regression Trees, adds the very popular techniques provided by the XGBOOST package. I also added the technique of using random forest as a feature selection tool.

Chapter 7, Neural Networks and Deep Learning, has been updated with additional information on deep learning methods and has improved code for the H2O package, including hyper-parameter search.

Chapter 8, Cluster Analysis, adds a methodology for unsupervised learning with random forests.

Chapter 9, Principal Components Analysis, uses a different dataset, and an out-of-sample prediction has been added.

Chapter 10, Market Basket Analysis, Recommendation Engines, and Sequential Analysis, adds sequential analysis, which I'm discovering is increasingly important, especially in marketing.

Chapter 11, Creating Ensembles and Multiclass Classification, has completely new content, using several great packages.

Chapter 12, Time Series and Causality, adds a couple more years of climate data, along with a demonstration of different methods of causality testing.

Chapter 13, Text Mining, has additional data and improved code.

Chapter 14, R on the Cloud, is another chapter of new content, showing you how to get R running on the cloud simply and quickly.

Appendix A, R Fundamentals, has additional data manipulation methods.

Appendix B, Sources, has a list of sources and references.
