Hands-On Data Science with R
R is the most widely used programming language, and when used in association with data science, this powerful combination will solve the complexities involved with unstructured datasets in the real world. This book covers the entire data science ecosystem for aspiring data scientists, right from zero to a level where you are confident enough to get hands-on with real-world data science problems. The book starts with an introduction to data science and introduces readers to popular R libraries for executing data science routine tasks. This book covers all the important processes in data science such as data gathering, cleaning data, and then uncovering patterns from it. You will explore algorithms such as machine learning algorithms, predictive analytical models, and finally deep learning algorithms. You will learn to run the most powerful visualization packages available in R so as to ensure that you can easily derive insights from your data. Towards the end, you will also learn how to integrate R with Spark and Hadoop and perform large-scale data analytics without much complexity.
Contents (231 chapters)
- Cover page
- Title Page
- About Packt
- Why subscribe?
- Packt.com
- Contributors
- About the authors
- About the reviewer
- Packt is searching for authors like you
- Preface
- Who this book is for
- What this book covers
- To get the most out of this book
- Download the example code files
- Download the color images
- Conventions used
- Get in touch
- Reviews
- Getting Started with Data Science and R
- Introduction to data science
- Key components of data science
- Computer science
- Predictive analytics (machine learning)
- Domain knowledge
- Active domains of data science
- Finance
- Healthcare
- Pharmaceuticals
- Government
- Manufacturing and retail
- Web industry
- Other industries
- Solving problems with data science
- Using R for data science
- Key features of R
- Our first R program
- UN development index
- Summary
- Quiz
- Descriptive and Inferential Statistics
- Measures of central tendency and dispersion
- Measures of central tendency
- Calculating mean, median, and mode with base R
- Measures of dispersion
- Useful functions to draw automated summaries
- Statistical hypothesis testing
- Running t-tests with R
- Decision rule – a brief overview of the p-value approach
- Be careful
- Running z-tests with R
- Elaborating a little longer
- A/B testing – a brief introduction and a practical example with R
- Summary
- Quiz
- Data Wrangling with R
- Introduction to data wrangling with R
- Data types, formats, and sources
- Data extraction, transformation, and load
- Basic tools of data wrangling
- Using base R for data manipulation and analysis
- Applying families of functions
- Aggregation functions
- Merging DataFrames
- Using tibble and dplyr for data manipulation
- Basic dplyr usage
- Using select
- Filtering with filter
- Using arrange for sorting
- Summarise
- Sampling data
- The tidyr package
- Converting wide tables into long tables
- Converting long tables into wide tables
- Joining tables
- dbplyr – databases and dplyr
- Using data.table for data manipulation
- Grouping operations
- Adding a column
- Ordering columns
- What is the advantage of searching using key by?
- Creating new columns in data.table
- Deleting a column
- Pivots on data.table
- The melt functionality
- Reading and writing files with data.table
- A special note on dates and/or time
- Miscellaneous topics
- Checking data quality
- Reading other file formats – Excel, SAS, and other data sources
- On-disk formats
- Working with web data
- Web APIs
- Tutorial – looking at airline flight times data
- Summary
- Quiz
- KDD, Data Mining, and Text Mining
- Good practices of KDD and data mining
- Stages of KDD
- Scraping a dwarf name
- Retrieving text from the web
- Legality of web scraping
- Web scraping made easy with rvest
- Retrieving tweets from R community
- Creating your Twitter application
- Fetching the number of tweets
- Cleaning and transforming data
- Looking for patterns – peeking, visualizing, and clustering data
- Peeking data
- Visualizing data
- Cluster analysis
- Summary
- Quiz
- Data Analysis with R
- Preparing data for analysis
- Data categories
- Data types in R
- Reading data
- Managing data issues
- Mixed data types
- Missing data
- Handling strings and dates
- Handling dates using POSIXct or POSIXlt
- Handling strings in R
- Reading data
- Combining strings
- Simple pattern matching and replacement with R
- Printing results
- Data visualisation
- Types of charts – basic primer
- Histograms
- Line plots
- Scatter plots
- Boxplots
- Bar charts
- Heatmaps
- Summarizing data
- Saving analysis for future work
- Packrat
- Checkpoint
- Rocker
- Summary
- Quiz
- Machine Learning with R
- What is machine learning?
- Machine learning everywhere
- Machine learning vocabulary
- Generic problems solved by machine learning
- Linear regression with R
- Tricks for lm
- Tree models
- Strengths and weaknesses
- The Chilean plebiscite data
- Starting with decision trees
- Growing trees with tree and rpart
- Random forests – a collection of trees
- Support vector machines
- What about regressions?
- Hierarchical and k-means clustering
- Neural networks
- Introduction to feedforward neural networks with R
- Summary
- Quiz
- Forecasting and ML App with R
- The UI and server
- Forecasting machine learning application
- Application details
- Summary
- Quiz
- Neural Networks and Deep Learning
- Daily neural nets
- Overview – NNs and deep learning
- Neuroscience inspiration
- ANN nodes
- Activation functions
- Layers
- Training algorithms
- NNs with Keras
- Getting things ready for Keras
- Getting practical with Keras
- Further tips
- Summary
- Quiz
- Markovian in R
- Markovian-type models
- Markovian models – real-world applications
- The Markov chain
- Programming an HMM with R
- Summary
- Quiz
- Visualizing Data
- Retrieving and cleaning data
- Crafting visualizations
- Summary
- Quiz
- Going to Production with R
- What is R Shiny?
- How to build a Shiny app
- Building an application inside R
- The reactive and isolate functions
- The observeEvent and eventReactive functions
- Approach for creating a data product from statistical modeling and web UI
- Some advice about Shiny
- Summary
- Quiz
- Large Scale Data Analytics with Hadoop
- Installing the package and Spark
- Manipulating Spark data using both dplyr and SQL
- Filtering and aggregating Spark datasets
- Using Spark machine learning or H2O Sparkling Water
- Providing interfaces to Spark packages
- Spark DataFrames within the RStudio IDE
- Summary
- Quiz
- R on Cloud
- Cloud computing
- Cloud types
- Things to look for
- Why Azure?
- Azure registration
- Azure Machine Learning Studio
- How modules work
- Building an experiment that uses R
- Summary
- Quiz
- The Road Ahead
- Growing your skills
- Gathering data
- Content to stay tuned to
- Meeting Stack Overflow
- Other Books You May Enjoy
- Leave a review - let other readers know what you think
Updated: 2021-06-10 19:13:14