
The fundamentals of Hadoop

In 2006, Doug Cutting, the creator of Hadoop, was working at Yahoo!. He was actively engaged in an open source project called Nutch, which involved the development of a large-scale web crawler. At a high level, a web crawler is software that browses and indexes web pages on the internet, generally in an automated manner. Intuitively, this requires efficient management of, and computation across, large volumes of data. In late January of 2006, Doug formally announced the start of Hadoop. The first line of the request, still available on the internet at https://issues.apache.org/jira/browse/INFRA-700, was "The Lucene PMC has voted to split part of Nutch into a new subproject named Hadoop." And thus, Hadoop was born.

At the outset, Hadoop had two core components: Hadoop Distributed File System (HDFS) and MapReduce. This was the first iteration of Hadoop, now also known as Hadoop 1. Later, in 2012, a third component known as YARN (Yet Another Resource Negotiator) was added, which decoupled resource management and job scheduling from MapReduce. Before we delve into the core components in more detail, it would help to get an understanding of the fundamental premises of Hadoop:

Doug Cutting's post at https://issues.apache.org/jira/browse/NUTCH-193 announced his intent to separate the Nutch Distributed FS (NDFS) and MapReduce into a new subproject called Hadoop.
