- Hadoop Beginner's Guide
- Garry Turkington
- 411字
- 2021-07-29 16:51:32
What this book covers
This book comprises of three main parts: chapters 1 through 5, which cover the core of Hadoop and how it works, chapters 6 and 7, which cover the more operational aspects of Hadoop, and chapters 8 through 11, which look at the use of Hadoop alongside other products and technologies.
Chapter 1, What It's All About, gives an overview of the trends that have made Hadoop and cloud computing such important technologies today.
Chapter 2, Getting Hadoop Up and Running, walks you through the initial setup of a local Hadoop cluster and the running of some demo jobs. For comparison, the same work is also executed on the hosted Hadoop Amazon service.
Chapter 3, Understanding MapReduce, goes inside the workings of Hadoop to show how MapReduce jobs are executed and shows how to write applications using the Java API.
Chapter 4, Developing MapReduce Programs, takes a case study of a moderately sized data set to demonstrate techniques to help when deciding how to approach the processing and analysis of a new data source.
Chapter 5, Advanced MapReduce Techniques, looks at a few more sophisticated ways of applying MapReduce to problems that don't necessarily seem immediately applicable to the Hadoop processing model.
Chapter 6, When Things Break, examines Hadoop's much-vaunted high availability and fault tolerance in some detail and sees just how good it is by intentionally causing havoc through killing processes and intentionally using corrupt data.
Chapter 7, Keeping Things Running, takes a more operational view of Hadoop and will be of most use for those who need to administer a Hadoop cluster. Along with demonstrating some best practice, it describes how to prepare for the worst operational disasters so you can sleep at night.
Chapter 8, A Relational View On Data With Hive, introduces Apache Hive, which allows Hadoop data to be queried with a SQL-like syntax.
Chapter 9, Working With Relational Databases, explores how Hadoop can be integrated with existing databases, and in particular, how to move data from one to the other.
Chapter 10, Data Collection with Flume, shows how Apache Flume can be used to gather data from multiple sources and deliver it to destinations such as Hadoop.
Chapter 11, Where To Go Next, wraps up the book with an overview of the broader Hadoop ecosystem, highlighting other products and technologies of potential interest. In addition, it gives some ideas on how to get involved with the Hadoop community and to get help.
- 計(jì)算機(jī)應(yīng)用
- Getting Started with MariaDB
- 機(jī)器人智能運(yùn)動(dòng)規(guī)劃技術(shù)
- 城市道路交通主動(dòng)控制技術(shù)
- 基于ARM 32位高速嵌入式微控制器
- 大數(shù)據(jù)技術(shù)與應(yīng)用
- 軟件工程及實(shí)踐
- Hands-On Dashboard Development with QlikView
- Ansible 2 Cloud Automation Cookbook
- Java組件設(shè)計(jì)
- 樂高創(chuàng)意機(jī)器人教程(中級(jí) 上冊(cè) 10~16歲) (青少年iCAN+創(chuàng)新創(chuàng)意實(shí)踐指導(dǎo)叢書)
- 網(wǎng)絡(luò)信息安全項(xiàng)目教程
- 菜鳥起飛五筆打字高手
- ROS Robotics By Example(Second Edition)
- 工廠電氣控制設(shè)備