High availability and fault tolerance
One of the major advantages of Hadoop is the high availability of a cluster. However, high availability also adds processing nodes to your requirements, which impacts sizing. The raw storage an HDFS cluster must provision is directly proportional to its Data Replication Factor (DRF); for example, if you have 200 GB of usable data and you need a high replication factor of 5 (meaning each data block is stored five times in the cluster), then you need to size for 200 GB x 5, which equals 1 TB (a quick back-of-the-envelope sketch of this rule follows the list below). The default value of the DRF in Hadoop is 3. A replication factor of 3 works well because:
- If one copy is corrupted, you can still recover from either of the remaining two copies
- Even if a second copy fails during the recovery period, you still have one copy of your data left to recover from
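
As a quick back-of-the-envelope check on the sizing rule above, the Python sketch below simply multiplies usable data by the replication factor; the function name and the figures are illustrative and restate the 200 GB x 5 example, not values from the book:

```python
# Rough HDFS sizing sketch: raw capacity to provision = usable data x DRF.
def raw_storage_gb(usable_data_gb: float, replication_factor: int) -> float:
    """Return the raw storage (in GB) needed for the given replication factor."""
    return usable_data_gb * replication_factor


if __name__ == "__main__":
    print(raw_storage_gb(200, 5))  # 1000 GB, roughly the 1 TB quoted above
    print(raw_storage_gb(200, 3))  # 600 GB with the Hadoop default DRF of 3
```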
When determining the replication factor, you need to consider the following parameters:
- The network reliability of your Hadoop cluster
- The probability of failure of a node in a given network
- The cost of increasing the replication factor by one
- The number of nodes or VMs that will make up your cluster
If you are building a Hadoop cluster with three nodes, a replication factor of 4 does not make sense, because there is no fourth node to hold the extra copy. Similarly, if the network is not reliable, a higher replication factor lets the NameNode serve a copy from a nearby available node. For systems with a higher probability of node failure, the risk of losing data is greater, because the chance of a second node failing before recovery completes also increases.
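
To tie these parameters together, here is a minimal Python sketch of that decision: it flags a replication factor larger than the node count and, under a naive assumption of independent node failures, estimates the chance that every replica of a block is lost. The function name, the 2% failure probability, and the independence assumption are illustrative, not figures from the book:

```python
def all_replicas_lost_probability(num_nodes: int, replication_factor: int,
                                  node_failure_prob: float) -> float:
    """Estimate the chance that every replica of a block is lost,
    assuming independent node failures (an illustrative simplification)."""
    if replication_factor > num_nodes:
        raise ValueError("replication factor cannot exceed the number of nodes")
    # All copies are lost only if every node holding a replica fails.
    return node_failure_prob ** replication_factor


if __name__ == "__main__":
    # Three-node cluster, assumed 2% chance a node fails before re-replication.
    print(all_replicas_lost_probability(3, 3, 0.02))  # ~8e-06
    # all_replicas_lost_probability(3, 4, 0.02) would raise ValueError,
    # matching the point that a factor of 4 makes no sense on three nodes.
```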