Brand: 中圖公司
Listed: 2021-07-16 11:13:14
Publisher: Packt Publishing
Digital rights to this book are provided by 中圖公司, which has authorized 上海閱文信息技術有限公司 to produce and distribute it.
- Cover
- Copyright information
- Credits
- About the Author
- About the Reviewers
- www.PacktPub.com
- Preface
- Chapter 1. Getting Started with Apache Spark
- Introduction
- Installing Spark from binaries
- Building the Spark source code with Maven
- Launching Spark on Amazon EC2
- Deploying on a cluster in standalone mode
- Deploying on a cluster with Mesos
- Deploying on a cluster with YARN
- Using Tachyon as an off-heap storage layer
- Chapter 2. Developing Applications with Spark
- Introduction
- Exploring the Spark shell
- Developing Spark applications in Eclipse with Maven
- Developing Spark applications in Eclipse with SBT
- Developing a Spark application in IntelliJ IDEA with Maven
- Developing a Spark application in IntelliJ IDEA with SBT
- Chapter 3. External Data Sources
- Introduction
- Loading data from the local filesystem
- Loading data from HDFS
- Loading data from HDFS using a custom InputFormat
- Loading data from Amazon S3
- Loading data from Apache Cassandra
- Loading data from relational databases
- Chapter 4. Spark SQL
- Introduction
- Understanding the Catalyst optimizer
- Creating HiveContext
- Inferring schema using case classes
- Programmatically specifying the schema
- Loading and saving data using the Parquet format
- Loading and saving data using the JSON format
- Loading and saving data from relational databases
- Loading and saving data from an arbitrary source
- Chapter 5. Spark Streaming
- Introduction
- Word count using Streaming
- Streaming Twitter data
- Streaming using Kafka
- Chapter 6. Getting Started with Machine Learning Using MLlib
- Introduction
- Creating vectors
- Creating a labeled point
- Creating matrices
- Calculating summary statistics
- Calculating correlation
- Doing hypothesis testing
- Creating machine learning pipelines using ML
- Chapter 7. Supervised Learning with MLlib – Regression
- Introduction
- Using linear regression
- Understanding cost function
- Doing linear regression with lasso
- Doing ridge regression
- Chapter 8. Supervised Learning with MLlib – Classification
- Introduction
- Doing classification using logistic regression
- Doing binary classification using SVM
- Doing classification using decision trees
- Doing classification using Random Forests
- Doing classification using Gradient Boosted Trees
- Doing classification with Naïve Bayes
- Chapter 9. Unsupervised Learning with MLlib
- Introduction
- Clustering using k-means
- Dimensionality reduction with principal component analysis
- Dimensionality reduction with singular value decomposition
- Chapter 10. Recommender Systems
- Introduction
- Collaborative filtering using explicit feedback
- Collaborative filtering using implicit feedback
- Chapter 11. Graph Processing Using GraphX
- Introduction
- Fundamental operations on graphs
- Using PageRank
- Finding connected components
- Performing neighborhood aggregation
- Chapter 12. Optimizations and Performance Tuning
- Introduction
- Optimizing memory
- Using compression to improve performance
- Using serialization to improve performance
- Optimizing garbage collection
- Optimizing the level of parallelism
- Understanding the future of optimization – project Tungsten
- Index (updated: 2021-07-16 13:44:17)