Summary

We have covered a lot of ground in this chapter and now have the foundation to explore MapReduce in more detail. Specifically, we learned that key/value pairs are a broadly applicable data model that is well suited to MapReduce processing. We also learned how to write mapper and reducer implementations using version 0.20 and later of the Java API.

We then moved on and saw how a MapReduce job is processed and how the map and reduce methods are tied together by significant coordination and task-scheduling machinery. We also saw how certain MapReduce jobs require specialization in the form of a custom partitioner or combiner.
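
As a reminder of how these pieces fit together, the following is a minimal sketch in plain Java, with no Hadoop dependency, of a word count flowing through a map step, a hash-based partitioning step, and a reduce step. The class name `PartitionSketch` and all of its methods are hypothetical names chosen for illustration; they are not part of the Hadoop API, and the real framework adds sorting, scheduling, and distribution that are omitted here.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, in-memory illustration only; not Hadoop code.
public class PartitionSketch {

    // Partitioning: clear the sign bit of the hash, then take the remainder,
    // so every occurrence of a key is routed to the same reducer index.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Map step: emit a (word, 1) pair for each whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleImmutableEntry<>(word, 1));
            }
        }
        return out;
    }

    // Reduce step: sum the values seen for each key in one partition.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    // Ties the steps together: map each line, route each pair to a
    // partition, then reduce every partition independently.
    static List<Map<String, Integer>> run(List<String> lines, int numReducers) {
        List<List<Map.Entry<String, Integer>>> buckets = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                buckets.get(partitionFor(pair.getKey(), numReducers)).add(pair);
            }
        }
        List<Map<String, Integer>> results = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> bucket : buckets) {
            results.add(reduce(bucket));
        }
        return results;
    }

    public static void main(String[] args) {
        // Prints one map of word counts per simulated reducer.
        System.out.println(run(Arrays.asList("a b a", "b c"), 2));
    }
}
```

Because the partition is a function of the key alone, each word's total appears in exactly one reducer's output. A custom partitioner changes only that routing function; a combiner would apply a reduce-like step to each mapper's output before routing.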

We also learned how Hadoop reads data from and writes data to the filesystem. It uses InputFormat and OutputFormat to handle the file as a whole, and RecordReader and RecordWriter to translate between the file format and key/value pairs.

With this knowledge, we will now move on to a case study in the next chapter, which demonstrates the ongoing development and enhancement of a MapReduce application that processes a large data set.
