
Decommissioning DataNodes

The Hadoop framework lets us remove nodes from the cluster when they are no longer needed. We cannot simply shut down the nodes that need to be removed, as we might lose some of our data; they need to be decommissioned properly. In this recipe, we are going to learn how to decommission nodes from a Hadoop cluster.

Getting ready

To perform this recipe, you should have a Hadoop cluster, and you should have decided which node to decommission.

How to do it...

To decommission a node from the HDFS cluster, we need to perform the following steps:

  1. Create a dfs.exclude file in a folder, say /usr/local/hadoop/etc/hadoop, and add the hostname of the node you wish to decommission.
  2. Edit hdfs-site.xml on the NameNode to add the following property:
        <property>
            <name>dfs.hosts.exclude</name>
            <value>/usr/local/hadoop/etc/hadoop/dfs.exclude</value>
        </property>
  3. Next, we need to execute the refreshNodes command so that the NameNode rereads the HDFS configuration and starts the decommissioning:
    hdfs dfsadmin -refreshNodes
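
Step 1 can be scripted so that a hostname is never listed twice in the exclude file. The following is a minimal sketch, not part of Hadoop: the function name add_to_exclude and the example hostname are assumptions made for illustration, and the file path is the one used in the steps above.

```shell
# Append a hostname to an exclude file unless it is already listed.
# add_to_exclude is a hypothetical helper name, not a Hadoop command.
add_to_exclude() {
    exclude_file="$1"
    host="$2"
    # Create the file if it does not exist yet.
    touch "$exclude_file"
    # -x matches whole lines, -F treats the hostname as a fixed string,
    # -q suppresses output; append only when no matching line exists.
    if ! grep -qxF -e "$host" "$exclude_file"; then
        printf '%s\n' "$host" >> "$exclude_file"
    fi
}

# Usage on the NameNode (the hostname is a placeholder):
#   add_to_exclude /usr/local/hadoop/etc/hadoop/dfs.exclude datanode3.example.com
#   hdfs dfsadmin -refreshNodes
```

Because the helper only edits the file, the refreshNodes command from step 3 still has to be run afterwards.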
    

The refreshNodes command starts the decommissioning. Once it has finished, the dfsadmin report command shows that the node's status has changed from Normal to Decommissioned:

hdfs dfsadmin -report
Name: 172.31.18.55:50010 (ip-172-31-18-55.us-west-2.compute.internal)
Hostname: ip-172-31-18-55.us-west-2.compute.internal
Decommission Status : Decommissioned
Configured Capacity: 8309932032 (7.74 GB)
DFS Used: 1179648 (1.13 MB)
Non DFS Used: 2371989504 (2.21 GB)
DFS Remaining: 5936762880 (5.53 GB)
DFS Used%: 0.01%
DFS Remaining%: 71.44%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Oct 08 10:56:49 UTC 2015

Decommissioning generally takes time because the node's blocks have to be replicated to other nodes. Once the decommissioning is complete, the node is added to the decommissioned nodes list.
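
On a cluster with many DataNodes the report is long, so it can help to pull out the status of a single node. The sketch below assumes the report layout shown above, where each node's block has a Hostname line followed by a Decommission Status line; the function name decommission_status is hypothetical.

```shell
# Print the "Decommission Status" value of one DataNode, reading the
# text of `hdfs dfsadmin -report` from standard input.
# decommission_status is a hypothetical helper name, not a Hadoop command.
decommission_status() {
    awk -v host="$1" '
        # Remember which node the following lines belong to.
        /^Hostname:/ { current = $2 }
        # On the status line of the requested node, strip the label
        # up to and including the colon, print the value, and stop.
        /^Decommission Status/ && current == host {
            sub(/^[^:]*:[ \t]*/, "")
            print
            exit
        }
    '
}

# Usage (hostname taken from the sample report above):
#   hdfs dfsadmin -report | decommission_status ip-172-31-18-55.us-west-2.compute.internal
```

If the hostname does not appear in the report, the function prints nothing.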

How it works...

The NameNode reads its configuration from hdfs-site.xml. You list the nodes to decommission in a file, point the dfs.hosts.exclude property at it, and execute the refreshNodes command, which makes the NameNode reread that configuration. It then starts re-replicating the blocks held by the listed nodes to other available DataNodes. The time this takes depends on how much data the DataNode being decommissioned holds. Until the decommissioning is complete, it is advisable not to touch the DataNode.
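
Since the node should be left alone until re-replication finishes, a small polling loop can tell you when it is safe to proceed. This is a sketch under assumptions: it assumes that a node still being drained is reported with the status text "Decommission in progress" (matched case-insensitively here), and the function names count_in_progress and wait_for_decommissioning are hypothetical.

```shell
# Count the nodes in a dfsadmin report (read from standard input) whose
# status line says decommissioning is still in progress.
# The status text is an assumption about the report format.
count_in_progress() {
    awk 'tolower($0) ~ /^decommission status *: *decommission in progress/ { n++ }
         END { print n + 0 }'
}

# Poll the report until no node is listed as in progress.
# $1 is the number of seconds to sleep between polls (default 60).
# Note: an empty or failed report also counts as zero, so check the
# final status in the report before shutting the node down.
wait_for_decommissioning() {
    interval="${1:-60}"
    while [ "$(hdfs dfsadmin -report | count_in_progress)" -gt 0 ]; do
        sleep "$interval"
    done
}

# Usage, after running refreshNodes:
#   wait_for_decommissioning 30
```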
