Effective Amazon Machine Learning
Latest chapter: Summary
This book is intended for data scientists and managers of predictive analytics projects; it will teach beginner- to advanced-level machine learning practitioners how to leverage Amazon Machine Learning and complement their existing Data Science toolbox. No substantive prior knowledge of Machine Learning, Data Science, statistics, or coding is required.
Brand: 中圖公司
Listed: 2021-07-02 19:01:05
Publisher: Packt Publishing
Last updated: 2021-07-03 00:18:24
Digital rights for this edition are provided by 中圖公司 and licensed to 上海閱文信息技術有限公司 for production and distribution.
- Cover
- Copyright
- Credits
- About the Author
- About the Reviewer
- www.PacktPub.com
- Customer Feedback
- Dedication
- Preface
- What this book covers
- What you need for this book
- Who this book is for
- Conventions
- Reader feedback
- Customer support
- Downloading the example code
- Errata
- Piracy
- Questions
- Introduction to Machine Learning and Predictive Analytics
- Introducing Amazon Machine Learning
- Machine Learning as a Service
- Leveraging full AWS integration
- Comparing performances
- Engineering data versus model variety
- Amazon's expertise and the gradient descent algorithm
- Pricing
- Understanding predictive analytics
- Building the simplest predictive analytics algorithm
- Regression versus classification
- Expanding regression to classification with logistic regression
- Extracting features to predict outcomes
- Diving further into linear modeling for prediction
- Validating the dataset
- Missing from Amazon ML
- The statistical approach versus the machine learning approach
- Summary
- Machine Learning Definitions and Concepts
- What's an algorithm? What's a model?
- Dealing with messy data
- Classic datasets versus real-world datasets
- Assumptions for multiclass linear models
- Missing values
- Normalization
- Imbalanced datasets
- Addressing multicollinearity
- Detecting outliers
- Accepting non-linear patterns
- Adding features?
- Preprocessing recapitulation
- The predictive analytics workflow
- Training and evaluation in Amazon ML
- Identifying and correcting poor performances
- Underfitting
- Overfitting
- Regularization on linear models
- L2 regularization and Ridge
- L1 regularization and Lasso
- Evaluating the performance of your model
- Summary
- Overview of an Amazon Machine Learning Workflow
- Opening an Amazon Web Services Account
- Security
- Setting up the account
- Creating a user
- Defining policies
- Creating login credentials
- Choosing a region
- Overview of a standard Amazon Machine Learning workflow
- The dataset
- Loading the data on S3
- Declaring a datasource
- Creating the datasource
- The model
- The evaluation of the model
- Comparing with a baseline
- Making batch predictions
- Summary
- Loading and Preparing the Dataset
- Working with datasets
- Finding open datasets
- Introducing the Titanic dataset
- Preparing the data
- Splitting the data
- Loading data on S3
- Creating a bucket
- Loading the data
- Granting permissions
- Formatting the data
- Creating the datasource
- Verifying the data schema
- Reusing the schema
- Examining data statistics
- Feature engineering with Athena
- Introducing Athena
- A brief tour of AWS Athena
- Creating a titanic database
- Using the wizard
- Creating the database and table directly in SQL
- Data munging in SQL
- Missing values
- Handling outliers in the fare
- Extracting the title from the name
- Inferring the deck from the cabin
- Calculating family size
- Wrapping up
- Creating an improved datasource
- Summary
- Model Creation
- Transforming data with recipes
- Managing variables
- Grouping variables
- Naming variables with assignments
- Specifying outputs
- Data processing through seven transformations
- Using simple transformations
- Text mining
- Coupling variables
- Binning numeric values
- Creating a model
- Editing the suggested recipe
- Applying recipes to the Titanic dataset
- Choosing between recipes and data preprocessing
- Parametrizing the model
- Setting model memory
- Setting the number of data passes
- Choosing regularization
- Creating an evaluation
- Evaluating the model
- Evaluating binary classification
- Exploring the model performances
- Evaluating linear regression
- Evaluating multiclass classification
- Analyzing the logs
- Optimizing the learning rate
- Visualizing convergence
- Impact of regularization
- Comparing different recipes on the Titanic dataset
- Keeping variables as numeric or applying quantile binning?
- Parsing the model logs
- Summary
- Predictions and Performances
- Making batch predictions
- Creating the batch prediction job
- Interpreting prediction outputs
- Reading the manifest file
- Reading the results file
- Assessing our predictions
- Evaluating the held-out dataset
- Finding out who will survive
- Multiplying trials
- Making real-time predictions
- Manually exploring variable influence
- Setting up real-time predictions
- AWS SDK
- Setting up AWS credentials
- AWS access keys
- Setting up AWS CLI
- Python SDK
- Summary
- Command Line and SDK
- Getting started and setting up
- Using the CLI versus SDK
- Installing AWS CLI
- Picking up CLI syntax
- Passing parameters using JSON files
- Introducing the Ames Housing dataset
- Splitting the dataset with shell commands
- A simple project using the CLI
- An overview of Amazon ML CLI commands
- Creating the datasource
- Creating the model
- Evaluating our model with create-evaluation
- What is cross-validation?
- Implementing Monte Carlo cross-validation
- Generating the shuffled datasets
- Generating the datasources template
- Generating the models template
- Generating the evaluations template
- The results
- Conclusion
- Boto3, the Python SDK
- Working with the Python SDK for Amazon Machine Learning
- Waiting on operation completion
- Wrapping up the Python-based workflow
- Implementing recursive feature selection with Boto3
- Managing schema and recipe
- Summary
- Creating Datasources from Redshift
- Choosing between RDS and Redshift
- Creating a Redshift instance
- Connecting through the command line
- Executing Redshift queries using Psql
- Creating our own non-linear dataset
- Uploading the nonlinear data to Redshift
- Introducing polynomial regression
- Establishing a baseline
- Polynomial regression in Amazon ML
- Driving the trials in Python
- Interpreting the results
- Summary
- Building a Streaming Data Analysis Pipeline
- Streaming Twitter sentiment analysis
- Popularity contest on Twitter
- The training dataset and the model
- Kinesis
- Kinesis Stream
- Kinesis Analytics
- Setting up Kinesis Firehose
- Producing tweets
- The Redshift database
- Adding Redshift to the Kinesis Firehose
- Setting up the roles and policies
- Dependencies and debugging
- Data format synchronization
- Debugging
- Preprocessing with Lambda
- Analyzing the results
- Download the dataset from Redshift
- Sentiment analysis with TextBlob
- Removing duplicate tweets
- And what is the most popular vegetable?
- Going beyond classification and regression
- Summary