Explain NameNode and DataNode in HDFS?
I. NameNode – It is also known as the master node. The NameNode stores metadata, i.e. the number of blocks, their locations, their replicas, and other details. It keeps this metadata in memory for faster retrieval. The NameNode maintains and manages the slave nodes (DataNodes) and instructs them which blocks to create, replicate, or delete. It should be deployed on reliable hardware because it is the centerpiece of HDFS.
Tasks of NameNode
Manages the file system namespace.
Regulates clients' access to files.
Executes file system namespace operations such as opening, closing, and renaming files and directories.
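The NameNode tasks above can be illustrated with a small toy model. This is only a sketch, not Hadoop code: the class ToyNameNode, its method names, and the round-robin placement are all hypothetical, chosen to show that the NameNode holds only metadata (which blocks make up a file and which DataNodes hold them) and that a rename touches the namespace without moving any block data.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Hypothetical in-memory metadata store; not the real Hadoop class."""
    replication: int = 3
    # file path -> ordered list of block IDs
    namespace: dict = field(default_factory=dict)
    # block ID -> list of DataNode names holding a replica
    block_locations: dict = field(default_factory=dict)
    next_block_id: int = 0

    def create_file(self, path, num_blocks, datanodes):
        """Register a file and choose replica locations for each block."""
        if path in self.namespace:
            raise FileExistsError(path)
        if len(datanodes) < self.replication:
            raise ValueError("not enough DataNodes for the replication factor")
        blocks = []
        for _ in range(num_blocks):
            block_id = self.next_block_id
            self.next_block_id += 1
            # Simple round-robin placement (an assumption for this sketch);
            # real HDFS uses a rack-aware placement policy.
            targets = [datanodes[(block_id + i) % len(datanodes)]
                       for i in range(self.replication)]
            self.block_locations[block_id] = targets
            blocks.append(block_id)
        self.namespace[path] = blocks
        return blocks

    def rename(self, old_path, new_path):
        """Namespace-only operation: no block data moves."""
        if new_path in self.namespace:
            raise FileExistsError(new_path)
        self.namespace[new_path] = self.namespace.pop(old_path)

    def get_block_locations(self, path):
        """What a client asks before reading: which DataNodes hold each block."""
        return [(b, self.block_locations[b]) for b in self.namespace[path]]


nn = ToyNameNode(replication=2)
nn.create_file("/logs/a.txt", num_blocks=2, datanodes=["dn1", "dn2", "dn3"])
nn.rename("/logs/a.txt", "/logs/b.txt")
locations = nn.get_block_locations("/logs/b.txt")
print(locations)
```

After the rename, the block IDs and their locations are unchanged; only the path that maps to them differs. This is why such operations are cheap on the master.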
II. DataNode – It is also known as the slave node. In HDFS, the DataNode is responsible for storing the actual data. It serves read and write requests from clients. DataNodes can be deployed on commodity hardware.
Tasks of DataNode
Performs block creation, deletion, and replication according to instructions from the NameNode.
Manages the data storage of the system.
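The DataNode tasks above can be sketched in the same toy style. The class ToyDataNode and its command names "replicate" and "delete" are hypothetical and only mimic the idea. In real HDFS, DataNodes send heartbeats and block reports to the NameNode and receive such instructions in reply; none of that wire protocol is modeled here.

```python
class ToyDataNode:
    """Hypothetical DataNode: stores block bytes and obeys NameNode commands."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block ID -> raw bytes of that block

    def write_block(self, block_id, data):
        """Store a block, as on a client write or an incoming replica."""
        self.blocks[block_id] = data

    def read_block(self, block_id):
        """Serve a client read request for one block."""
        return self.blocks[block_id]

    def block_report(self):
        """Sorted list of block IDs held, as would be reported to the NameNode."""
        return sorted(self.blocks)

    def execute(self, command, block_id, target=None):
        """Apply a NameNode instruction: 'replicate' or 'delete'."""
        if command == "replicate":
            # Copy the block to another DataNode to restore the replica count.
            target.write_block(block_id, self.blocks[block_id])
        elif command == "delete":
            self.blocks.pop(block_id, None)
        else:
            raise ValueError(f"unknown command: {command}")


dn1, dn2 = ToyDataNode("dn1"), ToyDataNode("dn2")
dn1.write_block(7, b"hello hdfs")
dn1.execute("replicate", 7, target=dn2)
dn1.execute("delete", 7)
print(dn1.block_report(), dn2.block_report(), dn2.read_block(7))
```

The DataNode never decides on its own where replicas go. It holds bytes and carries out instructions, which is why it can run on commodity hardware while the metadata stays on the master.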