Planning and Setting Up Hadoop Clusters

In the last chapter, we looked at big data problems and the history of Hadoop, along with an overview of the Hadoop architecture and its commercial offerings. This chapter focuses on hands-on, practical knowledge of how to set up Hadoop in different configurations. Apache Hadoop can be set up in the following three configurations:

  • Developer mode: Developer mode runs programs in a standalone manner. This arrangement does not require any Hadoop daemons; jars run directly as a single Java process. This mode is useful for developers who wish to debug their MapReduce code.
  • Pseudo cluster (single node Hadoop): A pseudo cluster is a single-node cluster with capabilities similar to those of a standard cluster; it is used to develop and test programs before they are deployed on a production cluster. A pseudo cluster gives each developer an independent environment for coding and testing.
  • Cluster mode: This mode is the real Hadoop cluster, where you set up Hadoop across multiple nodes in your production environment. It is the configuration to use for production big data workloads.
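
One rough way to tell the three configurations apart is to look at where the default filesystem points. The sketch below is illustrative only: `fs.defaultFS` is a real Hadoop property (it defaults to the local filesystem, `file:///`), but the function name `guess_hadoop_mode`, the `worker_hosts` argument, and the classification rule itself are assumptions made for this example, not something Hadoop does internally.

```python
from urllib.parse import urlparse


def guess_hadoop_mode(core_site: dict, worker_hosts: list) -> str:
    """Guess the setup mode from core-site properties (hypothetical helper).

    core_site    -- property name -> value, as read from core-site.xml
    worker_hosts -- host names listed for worker nodes (assumed input)
    """
    # With no override, Hadoop uses the local filesystem as the default.
    default_fs = core_site.get("fs.defaultFS", "file:///")
    parsed = urlparse(default_fs)

    # Local filesystem and no daemons: standalone (developer) mode.
    if parsed.scheme == "file":
        return "developer"

    # A distributed filesystem served from this machine only: pseudo cluster.
    if parsed.hostname in ("localhost", "127.0.0.1") and len(worker_hosts) <= 1:
        return "pseudo"

    # Anything else is treated as a real multi-node cluster.
    return "cluster"


if __name__ == "__main__":
    print(guess_hadoop_mode({}, []))
    print(guess_hadoop_mode({"fs.defaultFS": "hdfs://localhost:9000"}, ["localhost"]))
    print(guess_hadoop_mode({"fs.defaultFS": "hdfs://namenode.example.com:8020"},
                            ["worker1", "worker2"]))
```

The host name `namenode.example.com` and the worker names are placeholders; a real deployment would read these values from its own configuration files.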

This chapter will focus on setting up a new Hadoop cluster. The standard cluster is the one used in production as well as staging environments. It can also be scaled down and used for development in many cases, to ensure that programs can run across multiple nodes, handle failover, and so on. In this chapter, we will cover the following topics:

  • Prerequisites for Hadoop
  • Running Hadoop in development mode
  • Setting up a pseudo Hadoop cluster
  • Sizing the cluster
  • Setting up Hadoop in cluster mode
  • Diagnosing the Hadoop cluster