Practical Data Analysis Cookbook

By Tomasz Drabas

Last updated: 2021-07-16 11:14:22

If you are a beginner or intermediate-level professional who is looking to solve your day-to-day analytical problems with Python, this book is for you. Even with no prior programming and data analytics experience, you will be able to finish each recipe and learn while doing so.

Table of contents (113 sections)

Cover page
Practical Data Analysis Cookbook
Credits
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Preface
What this book covers
What you need for this book
Who this book is for
Sections
Conventions
Reader feedback
Customer support
Chapter 1. Preparing the Data
Introduction
Reading and writing CSV/TSV files with Python
Reading and writing JSON files with Python
Reading and writing Excel files with Python
Reading and writing XML files with Python
Retrieving HTML pages with pandas
Storing and retrieving from a relational database
Storing and retrieving from MongoDB
Opening and transforming data with OpenRefine
Exploring the data with OpenRefine
Removing duplicates
Using regular expressions and GREL to clean up data
Imputing missing observations
Normalizing and standardizing the features
Binning the observations
Encoding categorical variables
Chapter 2. Exploring the Data
Introduction
Producing descriptive statistics
Exploring correlations between features
Visualizing the interactions between features
Producing histograms
Creating multivariate charts
Sampling the data
Splitting the dataset into training, cross-validation, and testing
Chapter 3. Classification Techniques
Introduction
Testing and comparing the models
Classifying with Naïve Bayes
Using logistic regression as a universal classifier
Utilizing Support Vector Machines as a classification engine
Classifying calls with decision trees
Predicting subscribers with random tree forests
Employing neural networks to classify calls
Chapter 4. Clustering Techniques
Introduction
Assessing the performance of a clustering method
Clustering data with k-means algorithm
Finding an optimal number of clusters for k-means
Discovering clusters with mean shift clustering model
Building fuzzy clustering model with c-means
Using hierarchical model to cluster your data
Finding groups of potential subscribers with DBSCAN and BIRCH algorithms
Chapter 5. Reducing Dimensions
Introduction
Creating three-dimensional scatter plots to present principal components
Reducing the dimensions using the kernel version of PCA
Using Principal Component Analysis to find things that matter
Finding the principal components in your data using randomized PCA
Extracting the useful dimensions using Linear Discriminant Analysis
Using various dimension reduction techniques to classify calls using the k-Nearest Neighbors classification model
Chapter 6. Regression Methods
Introduction
Identifying and tackling multicollinearity
Building Linear Regression model
Using OLS to forecast how much electricity can be produced
Estimating the output of an electric plant using CART
Employing the kNN model in a regression problem
Applying the Random Forest model to a regression analysis
Gauging the amount of electricity a plant can produce using SVMs
Training a Neural Network to predict the output of a power plant
Chapter 7. Time Series Techniques
Introduction
Handling date objects in Python
Understanding time series data
Smoothing and transforming the observations
Filtering the time series data
Removing trend and seasonality
Forecasting the future with ARMA and ARIMA models
Chapter 8. Graphs
Introduction
Handling graph objects in Python with NetworkX
Using Gephi to visualize graphs
Identifying people whose credit card details were stolen
Identifying those responsible for stealing the credit cards
Chapter 9. Natural Language Processing
Introduction
Reading raw text from the Web
Tokenizing and normalizing text
Identifying parts of speech, handling n-grams, and recognizing named entities
Identifying the topic of an article
Identifying the sentence structure
Classifying movies based on their reviews
Chapter 10. Discrete Choice Models
Introduction
Preparing a dataset to estimate discrete choice models
Estimating the well-known Multinomial Logit model
Testing for violations of the Independence from Irrelevant Alternatives
Handling IIA violations with the Nested Logit model
Managing sophisticated substitution patterns with the Mixed Logit model
Chapter 11. Simulations
Introduction
Using SimPy to simulate the refueling process of a gas station
Simulating out-of-energy occurrences for an electric car
Determining if a population of sheep is in danger of extinction due to a wolf pack
Index
