Preface

Apache Hadoop is an open source distributed computing technology that helps users process large volumes of data with relative ease and draw insights from that data. Cloudera, with its open source distribution of Hadoop, has made Big Data analytics accessible to anyone interested.

This book fully prepares you to be a Hadoop administrator, with special emphasis on Cloudera. It provides step-by-step instructions on setting up and managing a robust Hadoop cluster running Cloudera's Distribution Including Apache Hadoop (CDH).

This book starts out by giving you a brief introduction to Apache Hadoop and Cloudera. You will then move on to learn about all the tools and techniques needed to set up and manage a production-standard Hadoop cluster using CDH and Cloudera Manager.

In this book, you will learn the Hadoop architecture by understanding the different features of HDFS and walking through the entire flow of a MapReduce process. With this understanding, you will start exploring the different applications packaged into CDH and will follow a step-by-step guide to set up HDFS High Availability (HA) and HDFS Federation.

You will learn to use Cloudera Manager, Cloudera's cluster management application. Using Cloudera Manager, you will walk through the steps to configure security using Kerberos, learn about events and alerts, and configure backups.
