官术网_书友最值得收藏!

  • Hadoop Beginner's Guide
  • Garry Turkington
  • 411字
  • 2021-07-29 16:51:32

What this book covers

This book comprises of three main parts: chapters 1 through 5, which cover the core of Hadoop and how it works, chapters 6 and 7, which cover the more operational aspects of Hadoop, and chapters 8 through 11, which look at the use of Hadoop alongside other products and technologies.

Chapter 1, What It's All About, gives an overview of the trends that have made Hadoop and cloud computing such important technologies today.

Chapter 2, Getting Hadoop Up and Running, walks you through the initial setup of a local Hadoop cluster and the running of some demo jobs. For comparison, the same work is also executed on the hosted Hadoop Amazon service.

Chapter 3, Understanding MapReduce, goes inside the workings of Hadoop to show how MapReduce jobs are executed and shows how to write applications using the Java API.

Chapter 4, Developing MapReduce Programs, takes a case study of a moderately sized data set to demonstrate techniques to help when deciding how to approach the processing and analysis of a new data source.

Chapter 5, Advanced MapReduce Techniques, looks at a few more sophisticated ways of applying MapReduce to problems that don't necessarily seem immediately applicable to the Hadoop processing model.

Chapter 6, When Things Break, examines Hadoop's much-vaunted high availability and fault tolerance in some detail and sees just how good it is by intentionally causing havoc through killing processes and intentionally using corrupt data.

Chapter 7, Keeping Things Running, takes a more operational view of Hadoop and will be of most use for those who need to administer a Hadoop cluster. Along with demonstrating some best practice, it describes how to prepare for the worst operational disasters so you can sleep at night.

Chapter 8, A Relational View On Data With Hive, introduces Apache Hive, which allows Hadoop data to be queried with a SQL-like syntax.

Chapter 9, Working With Relational Databases, explores how Hadoop can be integrated with existing databases, and in particular, how to move data from one to the other.

Chapter 10, Data Collection with Flume, shows how Apache Flume can be used to gather data from multiple sources and deliver it to destinations such as Hadoop.

Chapter 11, Where To Go Next, wraps up the book with an overview of the broader Hadoop ecosystem, highlighting other products and technologies of potential interest. In addition, it gives some ideas on how to get involved with the Hadoop community and to get help.

主站蜘蛛池模板: 华亭县| 云阳县| 屯昌县| 云阳县| 扶绥县| 耒阳市| 鸡泽县| 临汾市| 澎湖县| 申扎县| 四平市| 静安区| 苏尼特左旗| 通州市| 民乐县| 奈曼旗| 大安市| 荥经县| 通化县| 定结县| 梅州市| 尚义县| 清涧县| 阜康市| 柞水县| 虹口区| 洛宁县| 商南县| 巩义市| 临猗县| 丰镇市| 扎兰屯市| 公安县| 巴彦县| 承德市| 庆安县| 宁安市| 潢川县| 南川市| 泾阳县| 成安县|