What do you mean by the High Availability of a NameNode in Hadoop HDFS?
In Hadoop 1.0, the NameNode is a single point of failure (SPOF): if the NameNode fails, all clients, including MapReduce jobs, are unable to read, write, or list files. In that event, the whole Hadoop system is out of service until a new NameNode is brought online.
Hadoop 2.0 overcomes this single point of failure by supporting more than one NameNode. The high availability feature adds a second NameNode to the architecture, a hot standby that can be configured for automatic failover. If the active NameNode fails, the standby NameNode takes over the responsibilities of the active node and the cluster continues to work.
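To make the failover idea concrete, here is a toy model in Python. It is only an illustrative sketch, not Hadoop code: the `NameNode` and `HACluster` classes and the `fail` method are hypothetical names invented for this example, and a real cluster relies on separate coordination components rather than a loop like this.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NameNode:
    # Hypothetical stand-in for a NameNode; state is "active", "standby", or "failed".
    name: str
    state: str = "standby"


@dataclass
class HACluster:
    # Hypothetical model of an HA pair (or group) of NameNodes.
    namenodes: List[NameNode]

    def active(self) -> Optional[NameNode]:
        """Return the currently active NameNode, or None if there is none."""
        for nn in self.namenodes:
            if nn.state == "active":
                return nn
        return None

    def fail(self, name: str) -> None:
        """Mark a NameNode as failed; if it was active, promote the first standby."""
        for nn in self.namenodes:
            if nn.name == name:
                was_active = nn.state == "active"
                nn.state = "failed"
                if was_active:
                    self._promote_standby()
                return
        raise KeyError(name)

    def _promote_standby(self) -> None:
        # Promote the first available standby; if none is left, the
        # cluster has no active NameNode and is out of service.
        for nn in self.namenodes:
            if nn.state == "standby":
                nn.state = "active"
                return


cluster = HACluster([NameNode("nn1", "active"), NameNode("nn2")])
cluster.fail("nn1")
print(cluster.active().name)
```

With one active and one standby node, losing the active node leaves the cluster serving requests through the promoted standby; losing both leaves no active NameNode, which is the Hadoop 1.0 situation again.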
The initial implementation of HDFS NameNode high availability provided for a single active NameNode and a single standby NameNode. However, some deployments require a higher degree of fault tolerance. Hadoop 3.0 enables this by allowing the user to run multiple standby NameNodes. For instance, by configuring three NameNodes and five JournalNodes, the cluster can tolerate the failure of two nodes rather than one.
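The "two nodes rather than one" figure can be checked with a little arithmetic. The sketch below assumes that JournalNodes need a majority to stay available, so N of them tolerate (N - 1) / 2 failures (rounded down), and that the NameNode group keeps working as long as one NameNode survives. The function names are made up for this illustration.

```python
def journal_node_tolerance(n_journal_nodes: int) -> int:
    # Assumption: edits must reach a majority of JournalNodes,
    # so at most (N - 1) // 2 of them may fail.
    return (n_journal_nodes - 1) // 2


def namenode_tolerance(n_namenodes: int) -> int:
    # Assumption: one surviving NameNode is enough to serve as active.
    return n_namenodes - 1


def cluster_tolerance(n_namenodes: int, n_journal_nodes: int) -> int:
    # The cluster tolerates only as many failures as its weakest group.
    return min(namenode_tolerance(n_namenodes),
               journal_node_tolerance(n_journal_nodes))


original_ha = cluster_tolerance(2, 3)   # one active + one standby, three JournalNodes
hadoop3_ha = cluster_tolerance(3, 5)    # three NameNodes, five JournalNodes
print(original_ha, hadoop3_ha)
```

Under these assumptions the original two-NameNode layout with three JournalNodes tolerates one failure, while three NameNodes with five JournalNodes tolerate two, which matches the example above.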
🔗Source: Hadoop Interview Questions and Answers