What is a rack awareness algorithm and why is it used in Hadoop?

1 Answer


The Rack Awareness algorithm in Hadoop ensures that the replicas of a block are not all stored on the same rack. With a replication factor of 3, the algorithm places the first replica on the local rack and the other two replicas together on one different (remote) rack, each on a different DataNode within that remote rack. There are two reasons for using Rack Awareness:

To improve network performance: Network bandwidth between machines in the same rack is generally greater than between machines in different racks. Because two of the three replicas sit on the same rack, Rack Awareness reduces write traffic between racks and thus provides better write performance.

To prevent data loss: The data stays safe even if an entire rack fails because of a switch failure or power failure, since at least one replica always lives on another rack. As the saying goes, never put all your eggs in one basket.
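
To make the placement rule concrete, here is a minimal Python sketch of it for a replication factor of 3. It is only an illustration of the rule described above, not Hadoop's actual code: the function name `place_replicas`, the `topology` dictionary, and the rack and DataNode names are all hypothetical, and the sketch assumes the writer runs on a DataNode. Real HDFS placement also weighs things like node availability and load, which are left out here.

```python
import random


def place_replicas(topology, writer_node, rng=None):
    """Pick 3 DataNodes for a block (hypothetical helper, not a Hadoop API).

    topology    -- dict mapping rack name -> list of DataNode names
    writer_node -- the DataNode the client is writing from
    rng         -- optional random.Random, so results can be reproduced
    """
    if rng is None:
        rng = random.Random()

    # Reverse lookup: which rack does each DataNode belong to?
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    local_rack = rack_of[writer_node]

    # Replica 1: stays on the local rack (here, on the writer's own node).
    first = writer_node

    # Replicas 2 and 3: one remote rack that has at least two DataNodes.
    candidates = sorted(
        rack
        for rack, nodes in topology.items()
        if rack != local_rack and len(nodes) >= 2
    )
    if not candidates:
        raise ValueError("no remote rack with two DataNodes available")
    remote_rack = rng.choice(candidates)

    # Two different DataNodes within that same remote rack.
    second, third = rng.sample(topology[remote_rack], 2)
    return [first, second, third]
```

Calling `place_replicas({"/rack1": ["dn1", "dn2"], "/rack2": ["dn3", "dn4", "dn5"]}, "dn1")` returns `dn1` plus two distinct nodes from `/rack2`. Only one copy of the block crosses between racks during the write, yet the block survives the loss of either rack.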
