- Hadoop Real-World Solutions Cookbook(Second Edition)
- Tanmay Deshpande
- 2021-07-09 20:02:50
Changing the replication factor of an existing file in HDFS
In this recipe, we are going to take a look at how to change the replication factor of a file in HDFS. The default replication factor is 3.
Getting ready
To perform this recipe, you should already have a running Hadoop cluster.
How to do it...
Sometimes, there might be a need to increase or decrease the replication factor of a specific file in HDFS. In this case, we'll use the setrep command.
This is how you can use the command:
hadoop fs -setrep [-R] [-w] <noOfReplicas> <path> ...
In this command, a path can be either a file or a directory; if it's a directory, the command recursively sets the replication factor for all the files under it.
- The -w option makes the command wait until the replication is complete
- The -R option is accepted for backward compatibility; it has no effect
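As a rough illustration of the syntax above, the following sketch assembles a setrep invocation and prints it instead of running it, so the effect of each option can be inspected before touching the cluster. The helper name `build_setrep_cmd` and its argument order are hypothetical and not part of Hadoop; only `hadoop fs -setrep` and its `-w` option come from the recipe.

```shell
# Hypothetical dry-run helper (not part of Hadoop): prints the setrep
# command for a given replication factor and path without executing it.
# Usage: build_setrep_cmd <noOfReplicas> <path> [wait]
build_setrep_cmd() {
  bsc_replicas="$1"
  bsc_path="$2"
  bsc_wait="${3:-}"

  # Reject anything that is not made up of digits only.
  case "$bsc_replicas" in
    ''|*[!0-9]*)
      echo "replication factor must be a positive integer" >&2
      return 1
      ;;
  esac
  # Reject zero as well.
  if [ "$bsc_replicas" -lt 1 ]; then
    echo "replication factor must be a positive integer" >&2
    return 1
  fi
  if [ -z "$bsc_path" ]; then
    echo "a path is required" >&2
    return 1
  fi

  if [ "$bsc_wait" = "wait" ]; then
    echo "hadoop fs -setrep -w $bsc_replicas $bsc_path"
  else
    echo "hadoop fs -setrep $bsc_replicas $bsc_path"
  fi
}

# A single file, waiting for the change to complete:
build_setrep_cmd 2 /mydir1/LICENSE.txt wait
# A directory: every file under it is affected, so -R is not needed:
build_setrep_cmd 2 /mydir1
```

Printing the command first is a deliberate choice here: a replication change on a large directory can move a lot of data, so it is worth reviewing the exact invocation before running it.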
First, let's check the replication factor of the file we copied to HDFS in the previous recipe:
hadoop fs -ls /mydir1/LICENSE.txt
-rw-r--r-- 3 ubuntu supergroup 15429 2015-10-29 03:04 /mydir1/LICENSE.txt
Once you list the file, the output shows the read/write permissions on this file, and the very next field is the replication factor. The replication factor is set to 3 for our cluster, hence the number shown is 3.
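Because the replication factor is simply the second whitespace-separated field of a file's listing line, it can be pulled out with awk. The sketch below does this on the sample listing line from this recipe, so it runs without a cluster; the function name `replication_of_line` is made up for illustration. On a live cluster, `hadoop fs -stat %r <path>` should report the same value directly.

```shell
# Hypothetical helper: reads one line of "hadoop fs -ls" output on stdin
# and prints its second field, which for a file is the replication factor.
replication_of_line() {
  awk '{ print $2 }'
}

# Sample listing line copied from the recipe (no cluster required).
listing='-rw-r--r-- 3 ubuntu supergroup 15429 2015-10-29 03:04 /mydir1/LICENSE.txt'
printf '%s\n' "$listing" | replication_of_line   # prints 3
```

Note that this only makes sense for files; for a directory entry the listing shows a dash in that column rather than a number.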
Let's change it to 2 using this command:
hadoop fs -setrep -w 2 /mydir1/LICENSE.txt
The command waits until the replication is adjusted. Once done, you can verify the change by running the ls command again:
hadoop fs -ls /mydir1/LICENSE.txt
-rw-r--r-- 2 ubuntu supergroup 15429 2015-10-29 03:04 /mydir1/LICENSE.txt
How it works...
Once the setrep command is executed, the NameNode is notified, and it then decides whether replicas need to be added to or removed from certain DataNodes. When you use the -w option, this process may take a long time if the file is large.
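If waiting indefinitely is not acceptable, one possible approach is to cap the wait with the GNU coreutils `timeout` utility. This is only a sketch: the wrapper name `with_deadline` is hypothetical, and it assumes `timeout` is installed on the client machine. Interrupting the waiting client should not undo the request, since the new replication factor has already been handed to the NameNode, but that behavior is worth confirming on your own cluster.

```shell
# Hypothetical wrapper: runs a command and gives up after the given
# number of seconds. Relies on GNU coreutils "timeout", which exits
# with status 124 when the time limit is reached.
with_deadline() {
  wd_seconds="$1"
  shift
  wd_status=0
  timeout "$wd_seconds" "$@" || wd_status=$?
  if [ "$wd_status" -eq 124 ]; then
    echo "gave up waiting after ${wd_seconds}s" >&2
  fi
  return "$wd_status"
}

# Intended use on a cluster (commented out so the sketch runs anywhere):
# with_deadline 600 hadoop fs -setrep -w 2 /mydir1/LICENSE.txt
```

The 600-second limit is an arbitrary placeholder; a sensible value depends on the file size and on how busy the cluster is.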