Learning Data Mining with Python (Second Edition)
If you are a Python programmer who wants to get started with data mining, then this book is for you. If you are a data analyst who wants to leverage the power of Python to perform data mining efficiently, this book will also help you. No previous experience with data mining is expected.
Contents (268 sections)
- Cover
- Copyright information
- Credits
- About the Author
- About the Reviewer
- www.PacktPub.com
- Customer Feedback
- Preface
- What this book covers
- What you need for this book
- Who this book is for
- Conventions
- Reader feedback
- Customer support
- Downloading the example code
- Errata
- Piracy
- Questions
- Getting Started with Data Mining
- Introducing data mining
- Using Python and the Jupyter Notebook
- Installing Python
- Installing Jupyter Notebook
- Installing scikit-learn
- A simple affinity analysis example
- What is affinity analysis?
- Product recommendations
- Loading the dataset with NumPy
- Downloading the example code
- Implementing a simple ranking of rules
- Ranking to find the best rules
- A simple classification example
- What is classification?
- Loading and preparing the dataset
- Implementing the OneR algorithm
- Testing the algorithm
- Summary
- Classifying with scikit-learn Estimators
- scikit-learn estimators
- Nearest neighbors
- Distance metrics
- Loading the dataset
- Moving towards a standard workflow
- Running the algorithm
- Setting parameters
- Preprocessing
- Standard pre-processing
- Putting it all together
- Pipelines
- Summary
- Predicting Sports Winners with Decision Trees
- Loading the dataset
- Collecting the data
- Using pandas to load the dataset
- Cleaning up the dataset
- Extracting new features
- Decision trees
- Parameters in decision trees
- Using decision trees
- Sports outcome prediction
- Putting it all together
- Random forests
- How do ensembles work?
- Setting parameters in Random Forests
- Applying random forests
- Engineering new features
- Summary
- Recommending Movies Using Affinity Analysis
- Affinity analysis
- Algorithms for affinity analysis
- Overall methodology
- Dealing with the movie recommendation problem
- Obtaining the dataset
- Loading with pandas
- Sparse data formats
- Understanding the Apriori algorithm and its implementation
- Looking into the basics of the Apriori algorithm
- Implementing the Apriori algorithm
- Extracting association rules
- Evaluating the association rules
- Summary
- Features and scikit-learn Transformers
- Feature extraction
- Representing reality in models
- Common feature patterns
- Creating good features
- Feature selection
- Selecting the best individual features
- Feature creation
- Principal Component Analysis
- Creating your own transformer
- The transformer API
- Implementing a Transformer
- Unit testing
- Putting it all together
- Summary
- Social Media Insight using Naive Bayes
- Disambiguation
- Downloading data from a social network
- Loading and classifying the dataset
- Creating a replicable dataset from Twitter
- Text transformers
- Bag-of-words models
- n-gram features
- Other text features
- Naive Bayes
- Understanding Bayes' theorem
- Naive Bayes algorithm
- How it works
- Applying Naive Bayes
- Extracting word counts
- Converting dictionaries to a matrix
- Putting it all together
- Evaluation using the F1-score
- Getting useful features from models
- Summary
- Follow Recommendations Using Graph Mining
- Loading the dataset
- Classifying with an existing model
- Getting follower information from Twitter
- Building the network
- Creating a graph
- Creating a similarity graph
- Finding subgraphs
- Connected components
- Optimizing criteria
- Summary
- Beating CAPTCHAs with Neural Networks
- Artificial neural networks
- An introduction to neural networks
- Creating the dataset
- Drawing basic CAPTCHAs
- Splitting the image into individual letters
- Creating a training dataset
- Training and classifying
- Back-propagation
- Predicting words
- Improving accuracy using a dictionary
- Ranking mechanisms for word similarity
- Putting it all together
- Summary
- Authorship Attribution
- Attributing documents to authors
- Applications and use cases
- Authorship attribution
- Getting the data
- Using function words
- Counting function words
- Classifying with function words
- Support Vector Machines
- Classifying with SVMs
- Kernels
- Character n-grams
- Extracting character n-grams
- The Enron dataset
- Accessing the Enron dataset
- Creating a dataset loader
- Putting it all together
- Evaluation
- Summary
- Clustering News Articles
- Trending topic discovery
- Using a web API to get data
- Reddit as a data source
- Getting the data
- Extracting text from arbitrary websites
- Finding the stories in arbitrary websites
- Extracting the content
- Grouping news articles
- The k-means algorithm
- Evaluating the results
- Extracting topic information from clusters
- Using clustering algorithms as transformers
- Clustering ensembles
- Evidence accumulation
- How it works
- Implementation
- Online learning
- Implementation
- Summary
- Object Detection in Images using Deep Neural Networks
- Object classification
- Use cases
- Application scenario
- Deep neural networks
- Intuition
- Implementing deep neural networks
- An Introduction to TensorFlow
- Using Keras
- Convolutional Neural Networks
- GPU optimization
- When to use GPUs for computation
- Running our code on a GPU
- Setting up the environment
- Application
- Getting the data
- Creating the neural network
- Putting it all together
- Summary
- Working with Big Data
- Big data
- Applications of big data
- MapReduce
- The intuition behind MapReduce
- A word count example
- Hadoop MapReduce
- Applying MapReduce
- Getting the data
- Naive Bayes prediction
- The mrjob package
- Extracting the blog posts
- Training Naive Bayes
- Putting it all together
- Training on Amazon's EMR infrastructure
- Summary
- Next Steps...
- Getting Started with Data Mining
- Scikit-learn tutorials
- Extending the Jupyter Notebook
- More datasets
- Other Evaluation Metrics
- More application ideas
- Classifying with scikit-learn Estimators
- Scalability with the nearest neighbor
- More complex pipelines
- Comparing classifiers
- Automated Learning
- Predicting Sports Winners with Decision Trees
- More complex features
- Dask
- Research
- Recommending Movies Using Affinity Analysis
- New datasets
- The Eclat algorithm
- Collaborative Filtering
- Extracting Features with Transformers
- Adding noise
- Vowpal Wabbit
- word2vec
- Social Media Insight Using Naive Bayes
- Spam detection
- Natural language processing and part-of-speech tagging
- Discovering Accounts to Follow Using Graph Mining
- More complex algorithms
- NetworkX
- Beating CAPTCHAs with Neural Networks
- Better (worse?) CAPTCHAs
- Deeper networks
- Reinforcement learning
- Authorship Attribution
- Increasing the sample size
- Blogs dataset
- Local n-grams
- Clustering News Articles
- Clustering Evaluation
- Temporal analysis
- Real-time clusterings
- Classifying Objects in Images Using Deep Learning
- Mahotas
- Magenta
- Working with Big Data
- Courses on Hadoop
- Pydoop
- Recommendation engine
- W.I.L.L
- More resources
- Kaggle competitions
- Coursera

Updated: 2021-07-02 23:40:49