Effective Amazon Machine Learning
This book is intended for data scientists and managers of predictive analytics projects; it will teach beginner- to advanced-level machine learning practitioners how to leverage Amazon Machine Learning and complement their existing Data Science toolbox. No substantive prior knowledge of Machine Learning, Data Science, statistics, or coding is required.
Table of Contents (224 sections)
- Cover
- Copyright Information
- Credits
- About the Author
- About the Reviewer
- www.PacktPub.com
- Customer Feedback
- Dedication
- Preface
- What this book covers
- What you need for this book
- Who this book is for
- Conventions
- Reader feedback
- Customer support
- Downloading the example code
- Errata
- Piracy
- Questions
- Introduction to Machine Learning and Predictive Analytics
- Introducing Amazon Machine Learning
- Machine Learning as a Service
- Leveraging full AWS integration
- Comparing performances
- Engineering data versus model variety
- Amazon's expertise and the gradient descent algorithm
- Pricing
- Understanding predictive analytics
- Building the simplest predictive analytics algorithm
- Regression versus classification
- Expanding regression to classification with logistic regression
- Extracting features to predict outcomes
- Diving further into linear modeling for prediction
- Validating the dataset
- Missing from Amazon ML
- The statistical approach versus the machine learning approach
- Summary
- Machine Learning Definitions and Concepts
- What's an algorithm? What's a model?
- Dealing with messy data
- Classic datasets versus real-world datasets
- Assumptions for multiclass linear models
- Missing values
- Normalization
- Imbalanced datasets
- Addressing multicollinearity
- Detecting outliers
- Accepting non-linear patterns
- Adding features?
- Preprocessing recapitulation
- The predictive analytics workflow
- Training and evaluation in Amazon ML
- Identifying and correcting poor performances
- Underfitting
- Overfitting
- Regularization on linear models
- L2 regularization and Ridge
- L1 regularization and Lasso
- Evaluating the performance of your model
- Summary
- Overview of an Amazon Machine Learning Workflow
- Opening an Amazon Web Services Account
- Security
- Setting up the account
- Creating a user
- Defining policies
- Creating login credentials
- Choosing a region
- Overview of a standard Amazon Machine Learning workflow
- The dataset
- Loading the data on S3
- Declaring a datasource
- Creating the datasource
- The model
- The evaluation of the model
- Comparing with a baseline
- Making batch predictions
- Summary
- Loading and Preparing the Dataset
- Working with datasets
- Finding open datasets
- Introducing the Titanic dataset
- Preparing the data
- Splitting the data
- Loading data on S3
- Creating a bucket
- Loading the data
- Granting permissions
- Formatting the data
- Creating the datasource
- Verifying the data schema
- Reusing the schema
- Examining data statistics
- Feature engineering with Athena
- Introducing Athena
- A brief tour of AWS Athena
- Creating a titanic database
- Using the wizard
- Creating the database and table directly in SQL
- Data munging in SQL
- Missing values
- Handling outliers in the fare
- Extracting the title from the name
- Inferring the deck from the cabin
- Calculating family size
- Wrapping up
- Creating an improved datasource
- Summary
- Model Creation
- Transforming data with recipes
- Managing variables
- Grouping variables
- Naming variables with assignments
- Specifying outputs
- Data processing through seven transformations
- Using simple transformations
- Text mining
- Coupling variables
- Binning numeric values
- Creating a model
- Editing the suggested recipe
- Applying recipes to the Titanic dataset
- Choosing between recipes and data pre-processing
- Parametrizing the model
- Setting model memory
- Setting the number of data passes
- Choosing regularization
- Creating an evaluation
- Evaluating the model
- Evaluating binary classification
- Exploring the model performances
- Evaluating linear regression
- Evaluating multiclass classification
- Analyzing the logs
- Optimizing the learning rate
- Visualizing convergence
- Impact of regularization
- Comparing different recipes on the Titanic dataset
- Keeping variables as numeric or applying quantile binning?
- Parsing the model logs
- Summary
- Predictions and Performances
- Making batch predictions
- Creating the batch prediction job
- Interpreting prediction outputs
- Reading the manifest file
- Reading the results file
- Assessing our predictions
- Evaluating the held-out dataset
- Finding out who will survive
- Multiplying trials
- Making real-time predictions
- Manually exploring variable influence
- Setting up real-time predictions
- AWS SDK
- Setting up AWS credentials
- AWS access keys
- Setting up AWS CLI
- Python SDK
- Summary
- Command Line and SDK
- Getting started and setting up
- Using the CLI versus SDK
- Installing AWS CLI
- Picking up CLI syntax
- Passing parameters using JSON files
- Introducing the Ames Housing dataset
- Splitting the dataset with shell commands
- A simple project using the CLI
- An overview of Amazon ML CLI commands
- Creating the datasource
- Creating the model
- Evaluating our model with create-evaluation
- What is cross-validation?
- Implementing Monte Carlo cross-validation
- Generating the shuffled datasets
- Generating the datasources template
- Generating the models template
- Generating the evaluations template
- The results
- Conclusion
- Boto3, the Python SDK
- Working with the Python SDK for Amazon Machine Learning
- Waiting on operation completion
- Wrapping up the Python-based workflow
- Implementing recursive feature selection with Boto3
- Managing schema and recipe
- Summary
- Creating Datasources from Redshift
- Choosing between RDS and Redshift
- Creating a Redshift instance
- Connecting through the command line
- Executing Redshift queries using Psql
- Creating our own non-linear dataset
- Uploading the nonlinear data to Redshift
- Introducing polynomial regression
- Establishing a baseline
- Polynomial regression in Amazon ML
- Driving the trials in Python
- Interpreting the results
- Summary
- Building a Streaming Data Analysis Pipeline
- Streaming Twitter sentiment analysis
- Popularity contest on Twitter
- The training dataset and the model
- Kinesis
- Kinesis Stream
- Kinesis Analytics
- Setting up Kinesis Firehose
- Producing tweets
- The Redshift database
- Adding Redshift to the Kinesis Firehose
- Setting up the roles and policies
- Dependencies and debugging
- Data format synchronization
- Debugging
- Preprocessing with Lambda
- Analyzing the results
- Download the dataset from Redshift
- Sentiment analysis with TextBlob
- Removing duplicate tweets
- And what is the most popular vegetable?
- Going beyond classification and regression
- Summary
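
For a taste of the datasource → model → evaluation pipeline that runs through these chapters, here is a minimal sketch of the programmatic workflow covered in "Command Line and SDK" (see "Boto3, the Python SDK" above), using the Boto3 `machinelearning` client. This is not the book's verbatim code: the bucket name, resource IDs, schema columns, and SGD parameter values are hypothetical placeholders.

```python
# Minimal sketch of an Amazon ML workflow with Boto3.
# All IDs, the S3 bucket, and the schema are hypothetical placeholders.
import json

import boto3

client = boto3.client('machinelearning')

# Hypothetical Amazon ML schema for a few Titanic columns.
schema = {
    "version": "1.0",
    "targetAttributeName": "survived",
    "dataFormat": "CSV",
    "dataFileContainsHeader": True,
    "attributes": [
        {"attributeName": "survived", "attributeType": "BINARY"},
        {"attributeName": "sex", "attributeType": "CATEGORICAL"},
        {"attributeName": "age", "attributeType": "NUMERIC"},
        {"attributeName": "fare", "attributeType": "NUMERIC"},
    ],
}

# 1. Declare a datasource from a CSV file already uploaded to S3.
client.create_data_source_from_s3(
    DataSourceId='ds-titanic-train',  # hypothetical ID
    DataSourceName='Titanic training data',
    DataSpec={
        'DataLocationS3': 's3://my-aml-bucket/titanic_train.csv',  # hypothetical
        'DataSchema': json.dumps(schema),
    },
    ComputeStatistics=True,  # statistics are required for training datasources
)

# 2. Train a binary classification model, tuning the SGD parameters
#    discussed in the "Parametrizing the model" section.
client.create_ml_model(
    MLModelId='ml-titanic',  # hypothetical ID
    MLModelName='Titanic survival model',
    MLModelType='BINARY',
    Parameters={
        'sgd.maxPasses': '10',
        'sgd.l2RegularizationAmount': '1e-6',
    },
    TrainingDataSourceId='ds-titanic-train',
)

# 3. Block until training completes, then inspect the model's status.
client.get_waiter('ml_model_available').wait(
    FilterVariable='Name', EQ='Titanic survival model'
)
print(client.get_ml_model(MLModelId='ml-titanic')['Status'])
```

An evaluation could then be attached with `create_evaluation` against a held-out datasource, mirroring the "Evaluating our model with create-evaluation" entry above.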