Learning Data Mining with Python (Second Edition)

If you are a Python programmer who wants to get started with data mining, then this book is for you. If you are a data analyst who wants to leverage the power of Python to perform data mining efficiently, this book will also help you. No previous experience with data mining is expected.

- Cover
- Copyright Information
- Credits
- About the Author
- About the Reviewer
- www.PacktPub.com
- Customer Feedback
- Preface
- What this book covers
- What you need for this book
- Who this book is for
- Conventions
- Reader feedback
- Customer support
- Downloading the example code
- Errata
- Piracy
- Questions
- Getting Started with Data Mining
- Introducing data mining
- Using Python and the Jupyter Notebook
- Installing Python
- Installing Jupyter Notebook
- Installing scikit-learn
- A simple affinity analysis example
- What is affinity analysis?
- Product recommendations
- Loading the dataset with NumPy
- Downloading the example code
- Implementing a simple ranking of rules
- Ranking to find the best rules
- A simple classification example
- What is classification?
- Loading and preparing the dataset
- Implementing the OneR algorithm
- Testing the algorithm
- Summary
- Classifying with scikit-learn Estimators
- scikit-learn estimators
- Nearest neighbors
- Distance metrics
- Loading the dataset
- Moving towards a standard workflow
- Running the algorithm
- Setting parameters
- Preprocessing
- Standard pre-processing
- Putting it all together
- Pipelines
- Summary
- Predicting Sports Winners with Decision Trees
- Loading the dataset
- Collecting the data
- Using pandas to load the dataset
- Cleaning up the dataset
- Extracting new features
- Decision trees
- Parameters in decision trees
- Using decision trees
- Sports outcome prediction
- Putting it all together
- Random forests
- How do ensembles work?
- Setting parameters in Random Forests
- Applying random forests
- Engineering new features
- Summary
- Recommending Movies Using Affinity Analysis
- Affinity analysis
- Algorithms for affinity analysis
- Overall methodology
- Dealing with the movie recommendation problem
- Obtaining the dataset
- Loading with pandas
- Sparse data formats
- Understanding the Apriori algorithm and its implementation
- Looking into the basics of the Apriori algorithm
- Implementing the Apriori algorithm
- Extracting association rules
- Evaluating the association rules
- Summary
- Features and scikit-learn Transformers
- Feature extraction
- Representing reality in models
- Common feature patterns
- Creating good features
- Feature selection
- Selecting the best individual features
- Feature creation
- Principal Component Analysis
- Creating your own transformer
- The transformer API
- Implementing a Transformer
- Unit testing
- Putting it all together
- Summary
- Social Media Insight using Naive Bayes
- Disambiguation
- Downloading data from a social network
- Loading and classifying the dataset
- Creating a replicable dataset from Twitter
- Text transformers
- Bag-of-words models
- n-gram features
- Other text features
- Naive Bayes
- Understanding Bayes' theorem
- Naive Bayes algorithm
- How it works
- Applying Naive Bayes
- Extracting word counts
- Converting dictionaries to a matrix
- Putting it all together
- Evaluation using the F1-score
- Getting useful features from models
- Summary
- Follow Recommendations Using Graph Mining
- Loading the dataset
- Classifying with an existing model
- Getting follower information from Twitter
- Building the network
- Creating a graph
- Creating a similarity graph
- Finding subgraphs
- Connected components
- Optimizing criteria
- Summary
- Beating CAPTCHAs with Neural Networks
- Artificial neural networks
- An introduction to neural networks
- Creating the dataset
- Drawing basic CAPTCHAs
- Splitting the image into individual letters
- Creating a training dataset
- Training and classifying
- Back-propagation
- Predicting words
- Improving accuracy using a dictionary
- Ranking mechanisms for word similarity
- Putting it all together
- Summary
- Authorship Attribution
- Attributing documents to authors
- Applications and use cases
- Authorship attribution
- Getting the data
- Using function words
- Counting function words
- Classifying with function words
- Support Vector Machines
- Classifying with SVMs
- Kernels
- Character n-grams
- Extracting character n-grams
- The Enron dataset
- Accessing the Enron dataset
- Creating a dataset loader
- Putting it all together
- Evaluation
- Summary
- Clustering News Articles
- Trending topic discovery
- Using a web API to get data
- Reddit as a data source
- Getting the data
- Extracting text from arbitrary websites
- Finding the stories in arbitrary websites
- Extracting the content
- Grouping news articles
- The k-means algorithm
- Evaluating the results
- Extracting topic information from clusters
- Using clustering algorithms as transformers
- Clustering ensembles
- Evidence accumulation
- How it works
- Implementation
- Online learning
- Implementation
- Summary
- Object Detection in Images using Deep Neural Networks
- Object classification
- Use cases
- Application scenario
- Deep neural networks
- Intuition
- Implementing deep neural networks
- An Introduction to TensorFlow
- Using Keras
- Convolutional Neural Networks
- GPU optimization
- When to use GPUs for computation
- Running our code on a GPU
- Setting up the environment
- Application
- Getting the data
- Creating the neural network
- Putting it all together
- Summary
- Working with Big Data
- Big data
- Applications of big data
- MapReduce
- The intuition behind MapReduce
- A word count example
- Hadoop MapReduce
- Applying MapReduce
- Getting the data
- Naive Bayes prediction
- The mrjob package
- Extracting the blog posts
- Training Naive Bayes
- Putting it all together
- Training on Amazon's EMR infrastructure
- Summary
- Next Steps...
- Getting Started with Data Mining
- Scikit-learn tutorials
- Extending the Jupyter Notebook
- More datasets
- Other Evaluation Metrics
- More application ideas
- Classifying with scikit-learn Estimators
- Scalability with the nearest neighbor
- More complex pipelines
- Comparing classifiers
- Automated Learning
- Predicting Sports Winners with Decision Trees
- More complex features
- Dask
- Research
- Recommending Movies Using Affinity Analysis
- New datasets
- The Eclat algorithm
- Collaborative Filtering
- Extracting Features with Transformers
- Adding noise
- Vowpal Wabbit
- word2vec
- Social Media Insight Using Naive Bayes
- Spam detection
- Natural language processing and part-of-speech tagging
- Discovering Accounts to Follow Using Graph Mining
- More complex algorithms
- NetworkX
- Beating CAPTCHAs with Neural Networks
- Better (worse?) CAPTCHAs
- Deeper networks
- Reinforcement learning
- Authorship Attribution
- Increasing the sample size
- Blogs dataset
- Local n-grams
- Clustering News Articles
- Clustering Evaluation
- Temporal analysis
- Real-time clusterings
- Classifying Objects in Images Using Deep Learning
- Mahotas
- Magenta
- Working with Big Data
- Courses on Hadoop
- Pydoop
- Recommendation engine
- W.I.L.L
- More resources
- Kaggle competitions
- Coursera