Hadoop 2.x Administration Cookbook, by Gurmukh Singh
Configuring rack awareness
There will always be failures in clusters, such as hardware issues with servers, racks, switches, power supplies, and so on.
To make sure that there is no single point of failure across the entire Hadoop infrastructure, and to ensure that load is spread across racks rather than concentrated in one place, rack awareness plays an important role. Rack awareness is a concept in which the Namenode is made aware of the physical layout of the servers in a cluster, so that it can make intelligent decisions about block placement.
Getting ready
For the following steps, we assume that a cluster is up and running, with the Datanodes in a healthy state. We will log in to the Namenode and make the changes there.
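As a quick sanity check (not part of the original recipe, and assuming the `hdfs` command is on the path of the `hadoop` user), confirm that the Datanodes are live and the filesystem is healthy before making changes; the exact report wording varies slightly between releases:

```bash
# Summarize Datanode status; all nodes should be listed as live
$ hdfs dfsadmin -report | grep -E 'Live datanodes|Dead datanodes'

# Check the filesystem for missing or under-replicated blocks
$ hdfs fsck /
```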
How to do it...
1. ssh to Namenode and edit the `hdfs-site.xml` file to add the following property to it:

   ```xml
   <property>
       <name>topology.script.file.name</name>
       <value>/opt/cluster/topology.sh</value>
   </property>
   ```
2. Make sure that the `topology.sh` file is readable by the user `hadoop`.

3. Create two files, `topology.sh` and `topology.data`, and add the contents as shown in the following screenshot (a minimal sketch of both files follows this list).

4. Restart the `namenode` daemon for the property to take effect:

   ```bash
   $ hadoop-daemon.sh stop namenode
   $ hadoop-daemon.sh start namenode
   ```
5. Once the changes are made, the user will start seeing the rack field in the output of the `hdfs dfsadmin -report` command, as shown in the following screenshot.

6. We can have multiple levels in the topology by specifying them in `topology.data`:

   ```bash
   $ cat topology.data
   10.0.0.37    /sw1/rack1
   10.0.0.38    /sw1/rack2
   10.0.0.39    /sw2/rack3
   ```
   Here, `sw1` and `sw2` are rack switches, so a failure of `sw1` will cause an outage of both `rack1` and `rack2`. The Namenode will make sure that all the copies of a block are not placed only on `rack1` and `rack2`, since they share the same switch. Then refresh the nodes:

   ```bash
   $ hadoop dfsadmin -refreshNodes
   ```
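The exact contents of `topology.sh` and `topology.data` are only shown as a screenshot in the book; the following is a minimal sketch of what such a script typically looks like, not the author's exact version. It looks up each address that the Namenode passes in the `topology.data` file from step 6 and prints one rack path per argument, falling back to a default rack of the same depth:

```bash
#!/bin/bash
# /opt/cluster/topology.sh -- illustrative sketch, not the book's exact script.
# The Namenode calls this script with one or more Datanode IPs or hostnames
# and expects exactly one rack path per argument on standard output.

DATA_FILE=/opt/cluster/topology.data
# The default rack should have the same depth as the real racks (for example /sw1/rack1).
DEFAULT_RACK=/default/rack

for node in "$@"; do
    # topology.data holds two columns: <ip-or-hostname> <rack-path>
    rack=$(awk -v host="$node" '$1 == host {print $2}' "$DATA_FILE")
    if [ -z "$rack" ]; then
        echo "$DEFAULT_RACK"
    else
        echo "$rack"
    fi
done
```

Remember to make the script executable, for example with `chmod 755 /opt/cluster/topology.sh`, so that the Namenode can run it.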
How it works...
Let's have a look at what we did throughout this recipe.
In steps 1 through 4, we added the new property to the `hdfs-site.xml` file and then restarted the Namenode to make it aware of the changes. Once the property is in place, the Namenode knows about the `topology.sh` script and executes it to find the rack layout of the Datanodes in the cluster.
When the Datanodes register with the Namenode, the Namenode resolves their IPs or hostnames and places them in a rack map accordingly. This map is dynamic in nature and is never persisted to disk.
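To inspect the rack map that the Namenode has built in memory, we can print the topology. The command below exists in Hadoop 2; the hostnames and the exact output layout shown are illustrative:

```bash
# Print the rack-to-Datanode mapping currently held by the Namenode
# (sample output; your hostnames and ports will differ)
$ hdfs dfsadmin -printTopology
Rack: /sw1/rack1
   10.0.0.37:50010 (dn1.cluster.com)

Rack: /sw1/rack2
   10.0.0.38:50010 (dn2.cluster.com)

Rack: /sw2/rack3
   10.0.0.39:50010 (dn3.cluster.com)
```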
In Hadoop 2, there are multiple mapping implementations, such as the script-based and table-based mappings, that can be used to resolve hosts to racks in the rack awareness algorithm. The user can use any scripting language or Java to implement this. We do not need to do anything further if we are using a script, as shown in the preceding method, but for Java implementations and other tabular formats we need to set the `topology.node.switch.mapping.impl` property, as sketched below.
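As a sketch of the table-based alternative, the mapping implementation can be switched in the configuration. The snippet below uses the Hadoop 2 property names `net.topology.node.switch.mapping.impl` and `net.topology.table.file.name` with the built-in `org.apache.hadoop.net.TableMapping` class; the file path is the one used earlier in this recipe:

```xml
<!-- Sketch: use the built-in table-based resolver instead of a script -->
<property>
    <name>net.topology.node.switch.mapping.impl</name>
    <value>org.apache.hadoop.net.TableMapping</value>
</property>

<!-- Two-column file: IP or hostname, then rack path, one entry per line -->
<property>
    <name>net.topology.table.file.name</name>
    <value>/opt/cluster/topology.data</value>
</property>
```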
To troubleshoot this, there are some common things to look at, such as the permissions on the script file and the path configured for it. Any errors from the script will show up in the Namenode logs.
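A few quick checks for the common problems above; the log path is an assumption and depends on where HADOOP_LOG_DIR points in your installation:

```bash
# Confirm the script and data file exist and are readable (and the script is executable)
$ ls -l /opt/cluster/topology.sh /opt/cluster/topology.data

# Run the script by hand with a Datanode IP; it should print a rack path such as /sw1/rack1
$ /opt/cluster/topology.sh 10.0.0.37

# Look for topology or script-related errors in the Namenode log
# (log location is an assumption; adjust to your HADOOP_LOG_DIR)
$ grep -i topology /var/log/hadoop/hadoop-hadoop-namenode-*.log
```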