
Setting the HDFS block size for all the files in a cluster

In this recipe, we are going to take a look at how to set a block size at the cluster level.

Getting ready

To perform this recipe, you should already have a running Hadoop cluster.

How to do it...

The HDFS block size can be configured for all the files in the cluster or for a single file. To change it for a single file, you can override the property at write time, as shown in the sketch below; to change it at the cluster level, we need to modify the hdfs-site.xml file.
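Since this recipe focuses on the cluster-wide setting, here is just a quick sketch of the per-file alternative: the block size can be overridden at write time with a generic -D option. The local and HDFS paths used here are hypothetical:

# Write a single file with a 64MB block size, leaving the cluster default untouched
hdfs dfs -D dfs.blocksize=67108864 -put sample.txt /data/sample.txt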

By default, the HDFS block size is 128MB. If we want to change this, we need to update this property, as shown in the following code. This property changes the default block size to 64MB (67108864 bytes). Note that in Hadoop 2.X the property is named dfs.blocksize; the older name dfs.block.size is deprecated but still accepted:

<property>
    <name>dfs.blocksize</name>
    <value>67108864</value>
    <description>HDFS block size</description>
</property>

If you have a multi-node Hadoop cluster, you should update this file on all the nodes, that is, the NameNode and the DataNodes. Make sure you save these changes and restart the HDFS daemons:

/usr/local/hadoop/sbin/stop-dfs.sh
/usr/local/hadoop/sbin/start-dfs.sh
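Once the daemons are back up, you can confirm that the new default is in effect. A quick check, assuming the hdfs command is on your PATH (hdfs getconf reads the client-side configuration):

# Print the configured default block size in bytes; 67108864 is expected here
hdfs getconf -confKey dfs.blocksize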

This will set the block size for files that are added to the HDFS cluster from this point onward. Note that this does not change the block size of files that are already present in HDFS; the block size of an existing file can only be changed by rewriting the file.
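For illustration, one way to rewrite an existing file so that it picks up the new block size is to copy it within HDFS and then replace the original. The paths below are hypothetical:

# Copying rewrites the file, so the new blocks use the client's block size
hdfs dfs -D dfs.blocksize=67108864 -cp /data/sample.txt /data/sample_64mb.txt
# Remove the original and put the rewritten copy in its place
hdfs dfs -rm /data/sample.txt
hdfs dfs -mv /data/sample_64mb.txt /data/sample.txt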

How it works...

By default, the HDFS block size is 128MB for Hadoop 2.X. Sometimes, we may want to change this default block size for optimization purposes. Once this configuration is successfully updated, all new files will be saved in blocks of this size. Note that these changes do not affect files that are already present in HDFS; their block size was fixed at the time they were written.
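To see the block size that a particular file was actually written with, you can query its metadata. A minimal example, assuming a file at the hypothetical path /data/sample.txt:

# %o prints the block size of the file in bytes
hdfs dfs -stat "%o" /data/sample.txt
# For a more detailed view of the file's individual blocks:
hdfs fsck /data/sample.txt -files -blocks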
