
Configuring HDFS replication

For redundancy, it is important to have multiple copies of data. In HDFS, this is achieved by placing copies of blocks on different nodes. By default, the replication factor is 3, which means that for each block written to HDFS, there will be three copies in total on the nodes in the cluster.
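
As a rough illustration of what the replication factor means for capacity, the following sketch counts the blocks, block replicas, and raw storage used by a single file. The 128 MB block size and the helper name replica_footprint are assumptions made for this example; they are not values taken from this recipe, so check dfs.blocksize on your own cluster.

```python
import math


def replica_footprint(file_size_bytes, block_size_bytes=128 * 1024 * 1024, replication=3):
    """Return (blocks, block_replicas, raw_bytes) for one file.

    block_size_bytes defaults to 128 MB, which is an assumed value.
    replication defaults to 3, the default replication factor.
    """
    if file_size_bytes < 0 or block_size_bytes <= 0 or replication < 1:
        raise ValueError("sizes must be non-negative and replication at least 1")
    # A file is split into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_bytes / block_size_bytes)
    # Every block is stored 'replication' times in total.
    block_replicas = blocks * replication
    # A partial last block only occupies its actual size on disk.
    raw_bytes = file_size_bytes * replication
    return blocks, block_replicas, raw_bytes
```

Under these assumptions, a 300 MB file is split into three blocks, stored as nine block replicas, and occupies 900 MB of raw disk space across the cluster.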

Before changing the configuration, make sure that the cluster is healthy and that the user can perform file operations on it.

Getting ready

Log in to any of the nodes in the cluster. It is best to use the edge node, as stated in Chapter 1, and switch to the user hadoop.

Create a simple text file named file1.txt using your favorite text editor, and write some content in it.

How to do it...

  1. ssh to the Namenode, which in this case is nn1.cluster1.com, and switch to user hadoop.
  2. Navigate to the /opt/cluster/hadoop/etc/hadoop directory. This is the configuration directory of the Hadoop installation from Chapter 1, Hadoop Architecture and Deployment. If you installed Hadoop at a different location, navigate to the corresponding directory instead.
  3. Configure the dfs.replication parameter in the hdfs-site.xml file in this directory.
  4. See the following screenshot for this configuration:
    [Screenshot: the dfs.replication property in hdfs-site.xml]
  5. Once the changes are made, save the file and apply the same change on all nodes in the cluster.
  6. Restart the Namenode and Datanode daemons across the cluster. The easiest way of doing this is using the stop-dfs.sh and start-dfs.sh commands.
  7. See the following screenshot, which shows the way to restart the daemons:
    [Screenshot: restarting the daemons with stop-dfs.sh and start-dfs.sh]
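
The edit in step 3 can also be scripted. The following is a minimal sketch, not the book's method: it sets a property in a Hadoop *-site.xml file using Python's standard library. The function name set_hadoop_property is hypothetical, and the sketch assumes the file has the usual configuration/property/name/value layout.

```python
import xml.etree.ElementTree as ET


def set_hadoop_property(site_file, name, value):
    """Set property 'name' to 'value' in a Hadoop *-site.xml file.

    The property is added if it is not present yet.
    Note: ElementTree drops XML comments and the xml-stylesheet
    processing instruction when it rewrites the file.
    """
    tree = ET.parse(site_file)
    root = tree.getroot()  # expected to be the <configuration> element
    for prop in root.findall("property"):
        if prop.findtext("name", default="").strip() == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = str(value)
            break
    else:
        # No matching property was found, so append a new one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    tree.write(site_file, encoding="UTF-8", xml_declaration=True)
```

For example, set_hadoop_property("/opt/cluster/hadoop/etc/hadoop/hdfs-site.xml", "dfs.replication", 2) would set the factor to 2, where the value 2 is purely illustrative. The file still has to be copied to the other nodes and the daemons restarted with stop-dfs.sh and start-dfs.sh, as described in the steps above.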

How it works...

The dfs.replication parameter is usually set to the same value across the cluster, but it can be set differently on individual nodes. The replication factor of a file is determined by the node from which the file is written, that is, by the client performing the copy operation. For example, if an edge node has dfs.replication set to 2, then each block of a file written from that node will have two copies in total, irrespective of the value on the Namenode.
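
To see this client-side behavior with the file1.txt created earlier, the replication factor can be overridden for a single copy operation and then read back. The sketch below only builds the hdfs dfs command lines; the helper names and the target path /tmp/file1.txt are hypothetical, and run assumes it is called on a node where the Hadoop client is installed and on the PATH.

```python
import subprocess


def put_with_replication(local_path, hdfs_path, replication):
    """Build an 'hdfs dfs -put' command with a client-side dfs.replication override."""
    return ["hdfs", "dfs", "-D", f"dfs.replication={int(replication)}",
            "-put", local_path, hdfs_path]


def stat_replication(hdfs_path):
    """Build the command that prints a file's replication factor (%r)."""
    return ["hdfs", "dfs", "-stat", "%r", hdfs_path]


def run(cmd):
    """Run a command and return its standard output without surrounding whitespace."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()
```

Calling run(put_with_replication("file1.txt", "/tmp/file1.txt", 2)) followed by run(stat_replication("/tmp/file1.txt")) should report the factor that the client requested, regardless of the dfs.replication value in the Namenode's own hdfs-site.xml.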

See also

  • The Configuring HDFS block size recipe