
Summary

We have covered a lot of ground in this chapter, and we now have the foundation to explore MapReduce in more detail. Specifically, we learned that key/value pairs are a broadly applicable data model that is well suited to MapReduce processing. We also learned how to write mapper and reducer implementations using version 0.20 and above of the Java API.

We then saw how a MapReduce job is processed and how the map and reduce methods are tied together by significant coordination and task-scheduling machinery. We also saw how certain MapReduce jobs require specialization in the form of a custom partitioner or combiner.

We also learned how Hadoop reads data from and writes data to the filesystem. It uses InputFormat and OutputFormat to handle the file as a whole, and RecordReader and RecordWriter to translate between the file format and key/value pairs.

With this knowledge, we will now move on to a case study in the next chapter, which demonstrates the ongoing development and enhancement of a MapReduce application that processes a large data set.
