- Big Data Analytics
- Venkat Ankam
- 2021-08-20 10:32:24
Summary
Apache Hadoop provides a reliable and scalable framework (HDFS) for Big Data storage and a powerful cluster resource management framework (YARN) for running and managing multiple Big Data applications. Apache Spark provides in-memory performance for Big Data processing, along with libraries and APIs for interactive exploratory analytics, real-time analytics, machine learning, and graph analytics. While MapReduce (MR) was the primary processing engine on top of Hadoop, it had multiple drawbacks, such as poor performance and inflexibility in designing applications. Apache Spark is a replacement for MR. MR-based tools such as Hive, Pig, Mahout, and Crunch have already started offering Apache Spark as an additional execution engine alongside MR.
Nowadays, Big Data projects are being implemented in many businesses, from large Fortune 500 companies to small start-ups. Organizations gain an edge if they can go from raw data to decisions quickly, with easy-to-use tools for developing applications and exploring data. Apache Spark brings this speed and sophistication to Hadoop clusters.
In the next chapter, let's dive deep into Spark.