Planning and Setting Up Hadoop Clusters

In the last chapter, we looked at big data problems and the history of Hadoop, along with an overview of Hadoop architecture and commercial offerings. This chapter focuses on hands-on, practical knowledge of how to set up Hadoop in different configurations. Apache Hadoop can be set up in the following three configurations:

  • Developer mode: Developer mode runs programs in a standalone manner. This arrangement does not require any Hadoop daemons, and JAR files can be run directly. This mode is useful when developers wish to debug their MapReduce code.
  • Pseudo cluster (single node Hadoop): A pseudo cluster is a single-node cluster with capabilities similar to those of a standard cluster; it is used for developing and testing programs before they are deployed on a production cluster. Pseudo clusters give each developer an independent environment for coding and testing.
  • Cluster mode: This mode is the real Hadoop cluster, in which you set up multiple Hadoop nodes across your production environment. This is the mode to use for production big data workloads.
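
One practical way to tell the three configurations apart is the filesystem URI that Hadoop is pointed at. The following is a minimal sketch, not part of Hadoop itself: `guess_hadoop_mode` is a hypothetical helper, and the host name `namenode.example.com` and port 9000 are placeholders. It assumes that developer mode keeps the default local filesystem (`file:///`), that a pseudo cluster points `fs.defaultFS` at an HDFS NameNode on the local machine, and that cluster mode points it at a NameNode on another host.

```python
from urllib.parse import urlparse

# Host names treated as "this machine" (an assumption for this sketch).
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def guess_hadoop_mode(fs_default_fs: str = "file:///") -> str:
    """Guess the setup mode from an fs.defaultFS value (heuristic only)."""
    parsed = urlparse(fs_default_fs)
    if parsed.scheme == "file":
        # Local filesystem, no daemons needed: developer (standalone) mode.
        return "developer"
    if parsed.scheme == "hdfs" and parsed.hostname in LOCAL_HOSTS:
        # HDFS served from the same machine: single-node pseudo cluster.
        return "pseudo cluster"
    if parsed.scheme == "hdfs":
        # HDFS served from a separate NameNode host: multi-node cluster.
        return "cluster"
    return "unknown"


if __name__ == "__main__":
    for uri in ("file:///", "hdfs://localhost:9000",
                "hdfs://namenode.example.com:9000"):
        print(uri, "->", guess_hadoop_mode(uri))
```

This is only a heuristic: a real deployment is defined by which daemons run on which hosts, not by a single property.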

This chapter will focus on setting up a new Hadoop cluster. The standard cluster is the one used in production and staging environments. In many cases it can also be scaled down and used for development, to ensure that programs run correctly across a cluster, handle failover, and so on. In this chapter, we will cover the following topics:

  • Prerequisites for Hadoop
  • Running Hadoop in development mode
  • Setting up a pseudo Hadoop cluster
  • Sizing the cluster
  • Setting up Hadoop in cluster mode
  • Diagnosing the Hadoop cluster