- Apache Hive Essentials
- Dayong Du
- 2021-07-23 20:25:28
Overview of the Hadoop ecosystem
Apache released Hadoop 1.0.0 in 2011, and that version contained only HDFS and MapReduce. From the very beginning, Hadoop was designed as both a computing (MapReduce) and a storage (HDFS) platform. As the need for big data analysis grew, Hadoop attracted many other software projects that address big data problems alongside it, and together they formed a Hadoop-centric big data ecosystem. The following diagram gives a brief introduction to the Hadoop ecosystem and its core software components:

The Hadoop ecosystem
In the current Hadoop ecosystem, HDFS is still the major storage option. On top of it, Snappy compression and file formats such as RCFile, Parquet, and ORCFile can be used for storage optimization. Hadoop 2.0 introduced YARN, which separates cluster resource management from MapReduce, for better performance and scalability. Spark and Tez, as solutions for faster, near-real-time processing, can run on YARN and work closely with Hadoop. HBase is a leading NoSQL database, especially when a NoSQL database is needed on an already deployed Hadoop cluster. Sqoop is still one of the leading, mature tools for exchanging data between Hadoop and relational databases. Flume is a mature, distributed, and reliable log-collecting tool for moving or collecting data into HDFS. Impala and Presto query the data on HDFS directly for better performance. Meanwhile, Hortonworks has focused on the Stinger initiative to make Hive 100 times faster. In addition, Hive over Spark and Hive over Tez let users run Hive on computing frameworks other than MapReduce. As a result, Hive plays a more important role in the ecosystem than ever.