Python Machine Learning By Example
The surge in interest in machine learning (ML) is due to the fact that it revolutionizes automation by learning patterns in data and using them to make predictions and decisions. If you’re interested in ML, this book will serve as your entry point to ML. Python Machine Learning By Example begins with an introduction to important ML concepts and implementations using Python libraries. Each chapter of the book walks you through an industry-adopted application. You’ll implement ML techniques in areas such as exploratory data analysis, feature engineering, and natural language processing (NLP) in a clear and easy-to-follow way. With the help of this extended and updated edition, you’ll understand how to tackle data-driven problems and implement your solutions with the powerful yet simple Python language and popular Python packages and tools such as TensorFlow, scikit-learn, gensim, and Keras. To aid your understanding of popular ML algorithms, the book covers interesting and easy-to-follow examples such as news topic modeling and classification, spam email detection, stock price forecasting, and more. By the end of the book, you’ll have put together a broad picture of the ML ecosystem and will be well versed in the best practices of applying ML techniques to make the most out of new opportunities.
Table of Contents (223 chapters)
- Cover Page
- Title Page
- Copyright and Credits
- Python Machine Learning By Example Second Edition
- About Packt
- Why subscribe?
- Packt.com
- Dedication
- Foreword
- Contributors
- About the author
- About the reviewer
- Packt is searching for authors like you
- Preface
- Who this book is for
- What this book covers
- To get the most out of this book
- Download the example code files
- Download the color images
- Conventions used
- Get in touch
- Reviews
- Section 1: Fundamentals of Machine Learning
- Getting Started with Machine Learning and Python
- Defining machine learning and why we need it
- A very high-level overview of machine learning technology
- Types of machine learning tasks
- A brief history of the development of machine learning algorithms
- Core of machine learning – generalizing with data
- Overfitting, underfitting, and the bias-variance trade-off
- Avoiding overfitting with cross-validation
- Avoiding overfitting with regularization
- Avoiding overfitting with feature selection and dimensionality reduction
- Preprocessing, exploration, and feature engineering
- Missing values
- Label encoding
- One-hot encoding
- Scaling
- Polynomial features
- Power transform
- Binning
- Combining models
- Voting and averaging
- Bagging
- Boosting
- Stacking
- Installing software and setting up
- Setting up Python and environments
- Installing the various packages
- NumPy
- SciPy
- Pandas
- Scikit-learn
- TensorFlow
- Summary
- Exercises
- Section 2: Practical Python Machine Learning By Example
- Exploring the 20 Newsgroups Dataset with Text Analysis Techniques
- How computers understand language – NLP
- Picking up NLP basics while touring popular NLP libraries
- Corpus
- Tokenization
- PoS tagging
- Named-entity recognition
- Stemming and lemmatization
- Semantics and topic modeling
- Getting the newsgroups data
- Exploring the newsgroups data
- Thinking about features for text data
- Counting the occurrence of each word token
- Text preprocessing
- Dropping stop words
- Stemming and lemmatizing words
- Visualizing the newsgroups data with t-SNE
- What is dimensionality reduction?
- t-SNE for dimensionality reduction
- Summary
- Exercises
- Mining the 20 Newsgroups Dataset with Clustering and Topic Modeling Algorithms
- Learning without guidance – unsupervised learning
- Clustering newsgroups data using k-means
- How does k-means clustering work?
- Implementing k-means from scratch
- Implementing k-means with scikit-learn
- Choosing the value of k
- Clustering newsgroups data using k-means
- Discovering underlying topics in newsgroups
- Topic modeling using NMF
- Topic modeling using LDA
- Summary
- Exercises
- Detecting Spam Email with Naive Bayes
- Getting started with classification
- Types of classification
- Applications of text classification
- Exploring Naïve Bayes
- Learning Bayes' theorem by examples
- The mechanics of Naïve Bayes
- Implementing Naïve Bayes from scratch
- Implementing Naïve Bayes with scikit-learn
- Classification performance evaluation
- Model tuning and cross-validation
- Summary
- Exercise
- Classifying Newsgroup Topics with Support Vector Machines
- Finding separating boundary with support vector machines
- Understanding how SVM works through different use cases
- Case 1 – identifying a separating hyperplane
- Case 2 – determining the optimal hyperplane
- Case 3 – handling outliers
- Implementing SVM
- Case 4 – dealing with more than two classes
- The kernels of SVM
- Case 5 – solving linearly non-separable problems
- Choosing between linear and RBF kernels
- Classifying newsgroup topics with SVMs
- Another example – fetal state classification on cardiotocography
- A further example – breast cancer classification using SVM with TensorFlow
- Summary
- Exercise
- Predicting Online Ad Click-Through with Tree-Based Algorithms
- Brief overview of advertising click-through prediction
- Getting started with two types of data – numerical and categorical
- Exploring decision tree from root to leaves
- Constructing a decision tree
- The metrics for measuring a split
- Implementing a decision tree from scratch
- Predicting ad click-through with decision tree
- Ensembling decision trees – random forest
- Implementing random forest using TensorFlow
- Summary
- Exercise
- Predicting Online Ad Click-Through with Logistic Regression
- Converting categorical features to numerical – one-hot encoding and ordinal encoding
- Classifying data with logistic regression
- Getting started with the logistic function
- Jumping from the logistic function to logistic regression
- Training a logistic regression model
- Training a logistic regression model using gradient descent
- Predicting ad click-through with logistic regression using gradient descent
- Training a logistic regression model using stochastic gradient descent
- Training a logistic regression model with regularization
- Training on large datasets with online learning
- Handling multiclass classification
- Implementing logistic regression using TensorFlow
- Feature selection using random forest
- Summary
- Exercises
- Scaling Up Prediction to Terabyte Click Logs
- Learning the essentials of Apache Spark
- Breaking down Spark
- Installing Spark
- Launching and deploying Spark programs
- Programming in PySpark
- Learning on massive click logs with Spark
- Loading click logs
- Splitting and caching the data
- One-hot encoding categorical features
- Training and testing a logistic regression model
- Feature engineering on categorical variables with Spark
- Hashing categorical features
- Combining multiple variables – feature interaction
- Summary
- Exercises
- Stock Price Prediction with Regression Algorithms
- Brief overview of the stock market and stock prices
- What is regression?
- Mining stock price data
- Getting started with feature engineering
- Acquiring data and generating features
- Estimating with linear regression
- How does linear regression work?
- Implementing linear regression
- Estimating with decision tree regression
- Transitioning from classification trees to regression trees
- Implementing decision tree regression
- Implementing regression forest
- Estimating with support vector regression
- Implementing SVR
- Estimating with neural networks
- Demystifying neural networks
- Implementing neural networks
- Evaluating regression performance
- Predicting stock price with four regression algorithms
- Summary
- Exercise
- Section 3: Python Machine Learning Best Practices
- Machine Learning Best Practices
- Machine learning solution workflow
- Best practices in the data preparation stage
- Best practice 1 – completely understanding the project goal
- Best practice 2 – collecting all fields that are relevant
- Best practice 3 – maintaining the consistency of field values
- Best practice 4 – dealing with missing data
- Best practice 5 – storing large-scale data
- Best practices in the training sets generation stage
- Best practice 6 – identifying categorical features with numerical values
- Best practice 7 – deciding on whether or not to encode categorical features
- Best practice 8 – deciding on whether or not to select features, and if so, how to do so
- Best practice 9 – deciding on whether or not to reduce dimensionality, and if so, how to do so
- Best practice 10 – deciding on whether or not to rescale features
- Best practice 11 – performing feature engineering with domain expertise
- Best practice 12 – performing feature engineering without domain expertise
- Best practice 13 – documenting how each feature is generated
- Best practice 14 – extracting features from text data
- Best practices in the model training, evaluation, and selection stage
- Best practice 15 – choosing the right algorithm(s) to start with
- Naïve Bayes
- Logistic regression
- SVM
- Random forest (or decision tree)
- Neural networks
- Best practice 16 – reducing overfitting
- Best practice 17 – diagnosing overfitting and underfitting
- Best practice 18 – modeling on large-scale datasets
- Best practices in the deployment and monitoring stage
- Best practice 19 – saving, loading, and reusing models
- Best practice 20 – monitoring model performance
- Best practice 21 – updating models regularly
- Summary
- Exercises
- Other Books You May Enjoy
- Leave a review - let other readers know what you think

Last updated: 2021-07-02 12:42:13