官术网_书友最值得收藏!

Chapter 1. Hadoop 2.X

 

"There's nothing that cannot be found through some search engine or on the Internet somewhere."

 
  --Eric Schmidt, Executive Chairman, Google

Hadoop is the de facto open source framework used in the industry for large scale, massively parallel, and distributed data processing. It provides a computation layer for parallel and distributed computation processing. Closely associated with the computation layer is a highly fault-tolerant data storage layer, the Hadoop Distributed File System (HDFS). Both the computation and data layers run on commodity hardware, which is inexpensive, easily available, and compatible with other similar hardware.

In this chapter, we will look at the journey of Hadoop, with a focus on the features that make it enterprise-ready. Hadoop, with 6 years of development and deployment under its belt, has moved from a framework that supports the MapReduce paradigm exclusively to a more generic cluster-computing framework. This chapter covers the following topics:

  • An outline of Hadoop's code evolution, with major milestones highlighted
  • An introduction to the changes that Hadoop has undergone as it has moved from 1.X releases to 2.X releases, and how it is evolving into a generic cluster-computing framework
  • An introduction to the options available for enterprise-grade Hadoop, and the parameters for their evaluation
  • An overview of a few popular enterprise-ready Hadoop distributions
主站蜘蛛池模板: 沅陵县| 桂东县| 商水县| 樟树市| 潜江市| 云梦县| 鹤峰县| 凤山市| 年辖:市辖区| 高碑店市| 许昌市| 平利县| 昆明市| 轮台县| 洛宁县| 黔西| 锦屏县| 灵川县| 花莲市| 郧西县| 永靖县| 普陀区| 普宁市| 苗栗市| 杨浦区| 浦江县| 开鲁县| 六盘水市| 杭锦后旗| 新津县| 海原县| 鸡泽县| 海门市| 黄浦区| 雅安市| 萝北县| 仙游县| 长葛市| 平陆县| 天镇县| 呼伦贝尔市|