Hands-On Data Analysis with Scala
Efficient business decisions with an accurate sense of business data help in delivering better performance across products and services. This book helps you to leverage the popular Scala libraries and tools for performing core data analysis tasks with ease. The book begins with a quick overview of the building blocks of a standard data analysis process. You will learn to perform basic tasks like Extraction, Staging, Validation, Cleaning, and Shaping of datasets. You will later deep dive into the data exploration and visualization areas of the data analysis life cycle. You will make use of popular Scala libraries like Saddle, Breeze, Vegas, and PredictionIO for processing your datasets. You will learn statistical methods for deriving meaningful insights from data. You will also learn to create applications for Apache Spark 2.x on complex data analysis, in real-time. You will discover traditional machine learning techniques for doing data analysis. Furthermore, you will also be introduced to neural networks and deep learning from a data analysis standpoint. By the end of this book, you will be capable of handling large sets of structured and unstructured data, performing exploratory analysis, and building efficient Scala applications for discovering and delivering insights.
Contents (158 chapters)
- coverpage
- Title Page
- Copyright and Credits
- Hands-On Data Analysis with Scala
- Dedication
- About Packt
- Why subscribe?
- Packt.com
- Contributors
- About the author
- About the reviewer
- Packt is searching for authors like you
- Preface
- Who this book is for
- What this book covers
- To get the most out of this book
- Download the example code files
- Download the color images
- Conventions used
- Get in touch
- Reviews
- Section 1: Scala and Data Analysis Life Cycle
- Scala Overview
- Getting started with Scala
- Running Scala code online
- Scastie
- ScalaFiddle
- Installing Scala on your computer
- Installing command-line tools
- Installing IDE
- Overview of object-oriented and functional programming
- Object-oriented programming using Scala
- Functional programming using Scala
- Scala case classes and the collection API
- Scala case classes
- Scala collection API
- Array
- List
- Map
- Overview of Scala libraries for data analysis
- Apache Spark
- Breeze
- Breeze-viz
- DeepLearning
- Epic
- Saddle
- Scalalab
- Smile
- Vegas
- Summary
- Data Analysis Life Cycle
- Data journey
- Sourcing data
- Data formats
- XML
- JSON
- CSV
- Understanding data
- Using statistical methods for data exploration
- Using Scala
- Other Scala tools
- Using data visualization for data exploration
- Using the vegas-viz library for data visualization
- Other libraries for data visualization
- Using ML to learn from data
- Setting up Smile
- Running Smile
- Creating a data pipeline
- Summary
- Data Ingestion
- Data extraction
- Pull-oriented data extraction
- Push-oriented data delivery
- Data staging
- Why is the staging important?
- Cleaning and normalizing
- Enriching
- Organizing and storing
- Summary
- Data Exploration and Visualization
- Sampling data
- Selecting the sample
- Selecting samples using Saddle
- Performing ad hoc analysis
- Finding a relationship between data elements
- Visualizing data
- Vegas viz for data visualization
- Spark Notebook for data visualization
- Downloading and installing Spark Notebook
- Creating a Spark Notebook with simple visuals
- More charts with Spark Notebook
- Box plot
- Histogram
- Bubble chart
- Summary
- Applying Statistics and Hypothesis Testing
- Basics of statistics
- Summary level statistics
- Correlation statistics
- Vector level statistics
- Random data generation
- Pseudorandom numbers
- Random numbers with normal distribution
- Random numbers with Poisson distribution
- Hypothesis testing
- Summary
- Section 2: Advanced Data Analysis and Machine Learning
- Introduction to Spark for Distributed Data Analysis
- Spark setup and overview
- Spark core concepts
- Spark Datasets and DataFrames
- Sourcing data using Spark
- Parquet file format
- Avro file format
- Spark JDBC integration
- Using Spark to explore data
- Summary
- Traditional Machine Learning for Data Analysis
- ML overview
- Characteristics of ML
- Categories or types of ML
- Decision trees
- Implementing decision trees
- Decision tree algorithms
- Implementing decision tree algorithms in our example
- Evaluating the results
- Using our model with a decision tree
- Random forest
- Random forest algorithms
- Ridge and lasso regression
- Characteristics of ridge regression
- Characteristics of lasso regression
- k-means cluster analysis
- Natural language processing for data analysis
- Algorithm selections
- Summary
- Section 3: Real-Time Data Analysis and Scalability
- Near Real-Time Data Analysis Using Streaming
- Overview of streaming
- Spark Streaming overview
- Word count using pure Scala
- Word count using Scala and Spark
- Word count using Scala and Spark Streaming
- Deep dive into the Spark Streaming solution
- Streaming a k-means clustering algorithm using Spark
- Streaming linear regression using Spark
- Summary
- Working with Data at Scale
- Working with data at scale
- Cost considerations
- Data storage
- Data governance
- Reliability considerations
- Input data errors
- Processing failures
- Summary
- Another Book You May Enjoy
- Leave a review - let other readers know what you think

Updated: 2021-06-24 14:51:32