- Mastering Hadoop
- Sandeep Karanth
- 2021-08-06 19:52:58
Chapter 1. Hadoop 2.X
"There's nothing that cannot be found through some search engine or on the Internet somewhere."
--Eric Schmidt, Executive Chairman, Google
Hadoop is the de facto open source framework used in industry for large-scale, massively parallel, distributed data processing. It provides a computation layer for parallel and distributed processing. Closely associated with the computation layer is a highly fault-tolerant data storage layer, the Hadoop Distributed File System (HDFS). Both the computation and data layers run on commodity hardware, which is inexpensive, easily available, and compatible with other similar hardware.
In this chapter, we will look at the journey of Hadoop, with a focus on the features that make it enterprise-ready. With six years of development and deployment behind it, Hadoop has moved from a framework that supports the MapReduce paradigm exclusively to a more generic cluster-computing framework. This chapter covers the following topics:
- An outline of Hadoop's code evolution, with major milestones highlighted
- An introduction to the changes that Hadoop has undergone as it has moved from 1.X releases to 2.X releases, and how it is evolving into a generic cluster-computing framework
- An introduction to the options available for enterprise-grade Hadoop, and the parameters for their evaluation
- An overview of a few popular enterprise-ready Hadoop distributions