
Summary

We have covered a lot of ground in this chapter and we now have the foundation to explore MapReduce in more detail. Specifically, we learned that key/value pairs are a broadly applicable data model that is well suited to MapReduce processing. We also learned how to write mapper and reducer implementations using version 0.20 and later of the Java API.

We then saw how a MapReduce job is processed and how the map and reduce methods are tied together by significant coordination and task-scheduling machinery. We also saw that certain MapReduce jobs require specialization in the form of a custom partitioner or combiner.

We also learned how Hadoop reads data from and writes data to the filesystem. It uses the concepts of InputFormat and OutputFormat to handle the file as a whole, and RecordReader and RecordWriter to translate the format to and from key/value pairs.

With this knowledge, we will now move on to a case study in the next chapter, which demonstrates the ongoing development and enhancement of a MapReduce application that processes a large data set.
