- Hadoop Real-World Solutions Cookbook(Second Edition)
- Tanmay Deshpande
Recycling deleted data from trash to HDFS
In this recipe, we are going to see how to recover deleted data from the trash to HDFS.
Getting ready
To perform this recipe, you should already have a running Hadoop cluster.
How to do it...
To recover accidentally deleted data from HDFS, we first need to enable the trash feature, which is disabled by default. This can be achieved by adding the following property to core-site.xml:
<property>
    <name>fs.trash.interval</name>
    <value>120</value>
</property>
Then, restart the HDFS daemons:
/usr/local/hadoop/sbin/stop-dfs.sh
/usr/local/hadoop/sbin/start-dfs.sh
This will set the deleted file retention to 120 minutes.
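The trash emptier's checkpoint frequency can also be tuned. The standard Hadoop property fs.trash.checkpoint.interval (shown here with an illustrative value of 60 minutes) controls how often checkpoints are created; it must not be larger than fs.trash.interval, and a value of 0 makes it default to the deletion interval:

```xml
<property>
    <name>fs.trash.checkpoint.interval</name>
    <value>60</value>
</property>
```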
Now, let's try to delete a file from HDFS:
hadoop fs -rmr /LICENSE.txt
15/10/30 10:26:26 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 120 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://localhost:9000/LICENSE.txt' to trash at: hdfs://localhost:9000/user/ubuntu/.Trash/Current
We have 120 minutes to recover this file before it is permanently deleted from HDFS. To restore the file to its original location, we can execute the following commands.
First, let's confirm whether the file exists:
hadoop fs -ls /user/ubuntu/.Trash/Current
Found 1 items
-rw-r--r--   1 ubuntu supergroup      15429 2015-10-30 10:26 /user/ubuntu/.Trash/Current/LICENSE.txt
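As the listing shows, a trashed file keeps its original path, appended under the per-user trash root /user/&lt;user&gt;/.Trash/Current. As a rough sketch (not part of the recipe itself), this mapping can be expressed as:

```python
def trash_path(user: str, original_path: str) -> str:
    """Map an absolute HDFS path to its expected location in the
    user's trash, following the convention seen in the listing:
    the original path is appended to /user/<user>/.Trash/Current."""
    return f"/user/{user}/.Trash/Current{original_path}"

print(trash_path("ubuntu", "/LICENSE.txt"))
# -> /user/ubuntu/.Trash/Current/LICENSE.txt
```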
Now, restore the deleted file or folder; it's better to use the distcp command instead of copying each file one by one:
hadoop distcp hdfs://localhost:9000/user/ubuntu/.Trash/Current/LICENSE.txt hdfs://localhost:9000/
This will start a MapReduce job to restore the data from the trash to its original HDFS location. Check the HDFS path; the deleted file should be back in place.
How it works...
Enabling trash enforces a file retention policy for the specified amount of time. When trash is enabled, HDFS does not delete or move any blocks immediately; it only updates the file's metadata so that the file appears under the trash location. This way, accidentally deleted files can be recovered from HDFS; make sure that trash is enabled before experimenting with this recipe.
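The metadata-only behaviour described above can be illustrated with a toy model (all names here are hypothetical; real HDFS keeps this mapping in the NameNode's namespace, not a Python dict). A trash-enabled delete simply re-keys the entry under the trash path while the block list is untouched, so a later restore is just the inverse rename:

```python
# Toy namespace model: path -> list of block IDs (hypothetical data).
namespace = {"/LICENSE.txt": ["blk_1001", "blk_1002"]}

TRASH_ROOT = "/user/ubuntu/.Trash/Current"

def move_to_trash(ns: dict, path: str) -> str:
    """Simulate a trash-enabled delete: only the key (metadata)
    changes; the block list (the data) is not touched."""
    trashed = TRASH_ROOT + path
    ns[trashed] = ns.pop(path)
    return trashed

def restore(ns: dict, trashed: str) -> str:
    """Simulate recovery: rename the entry back to its original path."""
    original = trashed[len(TRASH_ROOT):]
    ns[original] = ns.pop(trashed)
    return original

trashed = move_to_trash(namespace, "/LICENSE.txt")
restore(namespace, trashed)
print(sorted(namespace))  # -> ['/LICENSE.txt']
```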