
Planning and Setting Up Hadoop Clusters

In the last chapter, we looked at big data problems and the history of Hadoop, along with an overview of the Hadoop architecture and its commercial offerings. This chapter focuses on hands-on, practical knowledge of how to set up Hadoop in different configurations. Apache Hadoop can be set up in the following three configurations:

  • Developer mode: Developer mode runs programs in a standalone manner. This arrangement does not require any Hadoop daemons; jars can be run directly. This mode is useful when developers wish to debug their MapReduce code.
  • Pseudo cluster (single node Hadoop): A pseudo cluster is a single-node cluster with capabilities similar to those of a standard cluster. It is used for developing and testing programs before they are deployed on a production cluster, and it gives each developer an independent environment for coding and testing.
  • Cluster mode: This mode is the real Hadoop cluster, in which you set up multiple Hadoop nodes across your production environment. It is the configuration intended for production big data workloads.
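
As a rough illustration of how the three modes differ in practice, the following Python sketch lists a few core settings that typically distinguish them. The property keys (fs.defaultFS, dfs.replication, mapreduce.framework.name) are standard Hadoop configuration keys, but the mode labels, the helper functions, and the host name namenode.example.com are assumptions made for this example. In a real installation these properties are split across core-site.xml, hdfs-site.xml, and mapred-site.xml; the sketch lumps them into one fragment for brevity.

```python
import xml.etree.ElementTree as ET

# Typical values per mode. Mode labels and the cluster host name are
# hypothetical placeholders chosen for this sketch.
MODE_SETTINGS = {
    "developer": {
        "fs.defaultFS": "file:///",            # local filesystem, no HDFS
        "mapreduce.framework.name": "local",   # jobs run in-process
    },
    "pseudo": {
        "fs.defaultFS": "hdfs://localhost:9000",
        "dfs.replication": "1",                # only one DataNode exists
        "mapreduce.framework.name": "yarn",
    },
    "cluster": {
        "fs.defaultFS": "hdfs://namenode.example.com:9000",
        "dfs.replication": "3",
        "mapreduce.framework.name": "yarn",
    },
}


def needs_daemons(mode):
    """Return True when the mode relies on HDFS, and so on running daemons."""
    return not MODE_SETTINGS[mode]["fs.defaultFS"].startswith("file:")


def to_site_xml(properties):
    """Render properties in the <configuration> layout used by Hadoop *-site.xml files."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for mode in MODE_SETTINGS:
        print(mode, "needs daemons:", needs_daemons(mode))
    print(to_site_xml(MODE_SETTINGS["pseudo"]))
```

The point of the sketch is that the same Hadoop installation can serve all three modes: what changes is mainly where the default filesystem points and how many replicas and nodes are available.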

This chapter will focus on setting up a new Hadoop cluster. The standard cluster is the one used in both production and staging environments. In many cases, it can also be scaled down and used for development, to ensure that programs can run across a cluster, handle failover, and so on. In this chapter, we will cover the following topics:

  • Prerequisites for Hadoop
  • Running Hadoop in developer mode
  • Setting up a pseudo Hadoop cluster
  • Sizing the cluster
  • Setting up Hadoop in cluster mode
  • Diagnosing the Hadoop cluster