官术网_书友最值得收藏!

What this book covers

This book comprises of three main parts: chapters 1 through 5, which cover the core of Hadoop and how it works, chapters 6 and 7, which cover the more operational aspects of Hadoop, and chapters 8 through 11, which look at the use of Hadoop alongside other products and technologies.

Chapter 1, What It's All About, gives an overview of the trends that have made Hadoop and cloud computing such important technologies today.

Chapter 2, Getting Hadoop Up and Running, walks you through the initial setup of a local Hadoop cluster and the running of some demo jobs. For comparison, the same work is also executed on the hosted Hadoop Amazon service.

Chapter 3, Understanding MapReduce, goes inside the workings of Hadoop to show how MapReduce jobs are executed and shows how to write applications using the Java API.

Chapter 4, Developing MapReduce Programs, takes a case study of a moderately sized data set to demonstrate techniques to help when deciding how to approach the processing and analysis of a new data source.

Chapter 5, Advanced MapReduce Techniques, looks at a few more sophisticated ways of applying MapReduce to problems that don't necessarily seem immediately applicable to the Hadoop processing model.

Chapter 6, When Things Break, examines Hadoop's much-vaunted high availability and fault tolerance in some detail and sees just how good it is by intentionally causing havoc through killing processes and intentionally using corrupt data.

Chapter 7, Keeping Things Running, takes a more operational view of Hadoop and will be of most use for those who need to administer a Hadoop cluster. Along with demonstrating some best practice, it describes how to prepare for the worst operational disasters so you can sleep at night.

Chapter 8, A Relational View On Data With Hive, introduces Apache Hive, which allows Hadoop data to be queried with a SQL-like syntax.

Chapter 9, Working With Relational Databases, explores how Hadoop can be integrated with existing databases, and in particular, how to move data from one to the other.

Chapter 10, Data Collection with Flume, shows how Apache Flume can be used to gather data from multiple sources and deliver it to destinations such as Hadoop.

Chapter 11, Where To Go Next, wraps up the book with an overview of the broader Hadoop ecosystem, highlighting other products and technologies of potential interest. In addition, it gives some ideas on how to get involved with the Hadoop community and to get help.

主站蜘蛛池模板: 东至县| 罗定市| 曲松县| 铜梁县| 乌苏市| 南昌县| 金山区| 长垣县| 潼关县| 德惠市| 民县| 惠来县| 皮山县| 津南区| 鹤岗市| 泊头市| 滨州市| 常德市| 天等县| 桑植县| 宁波市| 云霄县| 余姚市| 青浦区| 香港| 梨树县| 邵武市| 高清| 双流县| 嘉善县| 弋阳县| 沙田区| 汝城县| 台南市| 许昌市| 高邑县| 南城县| 黑龙江省| 台湾省| 桐乡市| 志丹县|