Explain the architecture of HDFS.

1 Answer


The architecture of HDFS is as shown:

[Figure: HDFS architecture]

An HDFS service consists of a NameNode, the master process that runs on one of the machines, and DataNodes, which are the slave nodes.

NameNode

The NameNode is the master service that keeps the file system metadata both on disk and in RAM. It holds information about the various DataNodes, their locations, the size of each block, and so on.

DataNode

DataNodes hold the actual data blocks and send periodic block reports to the NameNode. The default block report interval is six hours (dfs.blockreport.intervalMsec), not ten seconds; the much more frequent signal is the heartbeat, described further down. A DataNode stores and retrieves blocks when asked, serves the client’s read and write requests, and performs block creation, deletion, and replication based on instructions from the NameNode.
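To make the role of block reports concrete, here is a minimal Python sketch of how a master could turn per-DataNode reports into a block-to-location map and spot blocks with too few replicas. It is an illustration only, not how the NameNode is implemented: the function names, the report format (a dict of node name to block IDs), and the block IDs are all hypothetical.

```python
from collections import defaultdict


def build_block_map(block_reports):
    """Invert {datanode: [block ids]} reports into {block id: {datanodes}}."""
    block_map = defaultdict(set)
    for datanode, block_ids in block_reports.items():
        for block_id in block_ids:
            block_map[block_id].add(datanode)
    return dict(block_map)


def under_replicated(block_map, replication=3):
    """Return the block ids that have fewer live replicas than required."""
    return sorted(b for b, nodes in block_map.items() if len(nodes) < replication)


# Hypothetical reports from three DataNodes.
reports = {
    "dn1": ["blk_1", "blk_2"],
    "dn2": ["blk_1"],
    "dn3": ["blk_1", "blk_2"],
}
block_map = build_block_map(reports)
needs_copy = under_replicated(block_map)
```

In this example `blk_2` has only two replicas, so it is the block a master would schedule for re-replication.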

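Data written to HDFS is split into fixed-size blocks (128 MB by default in Hadoop 2.x and later). A file smaller than the block size occupies a single block, and the last block of a larger file holds only the remaining bytes. The blocks are spread across the DataNodes. With the auto-replication feature, each block is replicated on multiple machines (three by default), with the condition that no two replicas of the same block sit on the same machine.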
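The splitting and replication rule above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: a 128 MB block size and a replication factor of 3 (the usual defaults), purely random placement instead of HDFS's rack-aware policy, and hypothetical function and node names.

```python
import random

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size: 128 MB
REPLICATION = 3                 # assumed replication factor


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes (in bytes) of the blocks a file is split into."""
    full, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full
    if remainder:
        sizes.append(remainder)  # the last block holds only the leftover bytes
    return sizes


def place_replicas(num_blocks, datanodes, replication=REPLICATION, rng=random):
    """Pick `replication` distinct DataNodes for each block (not rack-aware)."""
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the requested replication")
    # sample() never repeats an element, so a node never holds two
    # replicas of the same block.
    return {b: rng.sample(datanodes, replication) for b in range(num_blocks)}


sizes = split_into_blocks(300 * 1024 * 1024)           # a 300 MB file
placement = place_replicas(len(sizes), ["dn1", "dn2", "dn3", "dn4"])
```

A 300 MB file yields two full 128 MB blocks plus one 44 MB block, and each block lands on three different nodes.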

As soon as the cluster comes up, the DataNodes start sending heartbeats to the NameNode every three seconds. The NameNode records this information; in other words, it starts building metadata in RAM about which DataNodes are available. Metadata is maintained in RAM as well as on disk.
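As a rough illustration of heartbeat tracking, the Python sketch below records the last heartbeat time per DataNode and treats a node as dead once it has been silent for longer than a timeout. The class, its method names, and the 630-second default timeout are assumptions made for this example; timestamps are passed in explicitly so the behavior is deterministic.

```python
class HeartbeatTracker:
    """Hypothetical in-memory record of when each DataNode last reported in."""

    def __init__(self, timeout=630.0):
        self.timeout = timeout   # seconds of silence before a node counts as dead
        self.last_seen = {}      # datanode name -> time of last heartbeat

    def heartbeat(self, datanode, now):
        """Record a heartbeat received from `datanode` at time `now`."""
        self.last_seen[datanode] = now

    def live_nodes(self, now):
        """Nodes whose last heartbeat is within the timeout."""
        return sorted(n for n, t in self.last_seen.items()
                      if now - t <= self.timeout)

    def dead_nodes(self, now):
        """Nodes that have been silent for longer than the timeout."""
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)


tracker = HeartbeatTracker(timeout=10.0)  # short timeout for the example
tracker.heartbeat("dn1", now=0.0)
tracker.heartbeat("dn2", now=0.0)
tracker.heartbeat("dn1", now=9.0)         # dn2 stops reporting after t=0
live = tracker.live_nodes(now=12.0)
dead = tracker.dead_nodes(now=12.0)
```

Here `dn1` is still live at t=12 because it reported at t=9, while `dn2` has exceeded the 10-second timeout.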
