Table of Contents (113 chapters)
- Cover page
- Practical Data Analysis Cookbook
- Credits
- About the Author
- Acknowledgments
- About the Reviewers
- www.PacktPub.com
- Support files, eBooks, discount offers, and more
- Preface
- What this book covers
- What you need for this book
- Who this book is for
- Sections
- Conventions
- Reader feedback
- Customer support
- Chapter 1. Preparing the Data
- Introduction
- Reading and writing CSV/TSV files with Python
- Reading and writing JSON files with Python
- Reading and writing Excel files with Python
- Reading and writing XML files with Python
- Retrieving HTML pages with pandas
- Storing and retrieving from a relational database
- Storing and retrieving from MongoDB
- Opening and transforming data with OpenRefine
- Exploring the data with OpenRefine
- Removing duplicates
- Using regular expressions and GREL to clean up data
- Imputing missing observations
- Normalizing and standardizing the features
- Binning the observations
- Encoding categorical variables
- Chapter 2. Exploring the Data
- Introduction
- Producing descriptive statistics
- Exploring correlations between features
- Visualizing the interactions between features
- Producing histograms
- Creating multivariate charts
- Sampling the data
- Splitting the dataset into training, cross-validation, and testing
- Chapter 3. Classification Techniques
- Introduction
- Testing and comparing the models
- Classifying with Naïve Bayes
- Using logistic regression as a universal classifier
- Utilizing Support Vector Machines as a classification engine
- Classifying calls with decision trees
- Predicting subscribers with random tree forests
- Employing neural networks to classify calls
- Chapter 4. Clustering Techniques
- Introduction
- Assessing the performance of a clustering method
- Clustering data with k-means algorithm
- Finding an optimal number of clusters for k-means
- Discovering clusters with mean shift clustering model
- Building fuzzy clustering model with c-means
- Using hierarchical model to cluster your data
- Finding groups of potential subscribers with DBSCAN and BIRCH algorithms
- Chapter 5. Reducing Dimensions
- Introduction
- Creating three-dimensional scatter plots to present principal components
- Reducing the dimensions using the kernel version of PCA
- Using Principal Component Analysis to find things that matter
- Finding the principal components in your data using randomized PCA
- Extracting the useful dimensions using Linear Discriminant Analysis
- Using various dimension reduction techniques to classify calls using the k-Nearest Neighbors classification model
- Chapter 6. Regression Methods
- Introduction
- Identifying and tackling multicollinearity
- Building Linear Regression model
- Using OLS to forecast how much electricity can be produced
- Estimating the output of an electric plant using CART
- Employing the kNN model in a regression problem
- Applying the Random Forest model to a regression analysis
- Gauging the amount of electricity a plant can produce using SVMs
- Training a Neural Network to predict the output of a power plant
- Chapter 7. Time Series Techniques
- Introduction
- Handling date objects in Python
- Understanding time series data
- Smoothing and transforming the observations
- Filtering the time series data
- Removing trend and seasonality
- Forecasting the future with ARMA and ARIMA models
- Chapter 8. Graphs
- Introduction
- Handling graph objects in Python with NetworkX
- Using Gephi to visualize graphs
- Identifying people whose credit card details were stolen
- Identifying those responsible for stealing the credit cards
- Chapter 9. Natural Language Processing
- Introduction
- Reading raw text from the Web
- Tokenizing and normalizing text
- Identifying parts of speech, handling n-grams, and recognizing named entities
- Identifying the topic of an article
- Identifying the sentence structure
- Classifying movies based on their reviews
- Chapter 10. Discrete Choice Models
- Introduction
- Preparing a dataset to estimate discrete choice models
- Estimating the well-known Multinomial Logit model
- Testing for violations of the Independence from Irrelevant Alternatives
- Handling IIA violations with the Nested Logit model
- Managing sophisticated substitution patterns with the Mixed Logit model
- Chapter 11. Simulations
- Introduction
- Using SimPy to simulate the refueling process of a gas station
- Simulating out-of-energy occurrences for an electric car
- Determining if a population of sheep is in danger of extinction due to a wolf pack
- Index
Updated: 2021-07-16 11:14:22