- Big Data Analytics
- Venkat Ankam
- 204 words
- 2021-08-20 10:32:19
Preface
Big Data Analytics aims to provide the fundamentals of Apache Spark and Hadoop and to show, in an accessible way, how they integrate with the most commonly used tools and techniques. All Spark components (Spark Core, Spark SQL, DataFrames, Datasets, conventional Streaming, Structured Streaming, MLlib, and GraphX) and the Hadoop core components (HDFS, MapReduce, and YARN) are explored in depth, with implementation examples on Spark + Hadoop clusters.
The Big Data Analytics industry is moving away from MapReduce toward Spark, so the advantages of Spark over MapReduce are explained in depth to help readers reap the benefits of in-memory speeds. The DataFrames API, the Data Sources API, and the new Dataset API are explained for building Big Data analytical applications. Real-time data analytics using Spark Streaming with Apache Kafka and HBase is covered to help in building streaming applications. The new Structured Streaming concept is explained with an Internet of Things (IoT) use case. Machine learning techniques are covered using MLlib, ML Pipelines, and SparkR; graph analytics is covered with the GraphX and GraphFrames components of Spark.
This book also introduces web-based notebooks such as Jupyter and Apache Zeppelin, and the data flow tool Apache NiFi, to analyze and visualize data, and shows how to offer Spark as a service using Livy Server.