HDFS runs on commodity hardware (machines with average configurations), where any individual node can fail at any time. To make the overall system fault-tolerant, HDFS splits files into blocks and stores copies of each block on different nodes. By default, each block is stored in 3 different locations; this number is the replication factor, and it is configurable (the dfs.replication setting). So even if one replica is corrupted and another is temporarily unavailable, the data can still be read from the third. Data is lost only if every replica of a block is lost before HDFS can re-replicate it, which makes data loss very unlikely. Replication is what gives Hadoop its fault tolerance.
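As a rough illustration of why replication keeps data readable, here is a minimal toy model in Python. It is not HDFS code: the function names `place_replicas` and `read_block`, the node names, and the random placement are all assumptions made for this sketch (real HDFS uses a rack-aware placement policy).

```python
import random

REPLICATION_FACTOR = 3  # HDFS default; configurable in a real cluster


def place_replicas(block_id, nodes, replication=REPLICATION_FACTOR, seed=0):
    """Pick distinct nodes to hold a block's replicas.

    Toy placement only: real HDFS placement is rack-aware, not random.
    """
    rng = random.Random(f"{seed}-{block_id}")
    return rng.sample(nodes, replication)


def read_block(block_id, placement, healthy_nodes):
    """Return the first node holding a usable replica, or None if none is left."""
    for node in placement[block_id]:
        if node in healthy_nodes:
            return node
    return None


nodes = ["dn1", "dn2", "dn3", "dn4", "dn5"]  # hypothetical DataNode names
placement = {"blk_1": place_replicas("blk_1", nodes)}

# Take down two of the three nodes that hold blk_1: the read still succeeds.
two_failed = set(nodes) - set(placement["blk_1"][:2])
print(read_block("blk_1", placement, two_failed))

# Take down all three holders: the block can no longer be read.
all_failed = set(nodes) - set(placement["blk_1"])
print(read_block("blk_1", placement, all_failed))
```

The first read is served by the one surviving replica; the second returns `None`, which corresponds to the only case in which data is actually unavailable.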

Since the data is replicated thrice in HDFS, does it mean that any calculation done on one node will also be replicated on the other two?

No. The calculation runs on only one copy of the data. All replicas are equivalent, so there is no special "original"; one replica is simply chosen for the task. The master node knows exactly which nodes hold that data. If the chosen node stops responding, it is assumed to have failed, and only then is the calculation rerun on a node holding another replica.
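The "run once, fall back only on failure" behaviour described above can be sketched as follows. This is a simplified model, not Hadoop's scheduler: `run_on_one_replica`, `is_responsive`, and the node names are hypothetical names introduced for the example.

```python
def run_on_one_replica(replica_nodes, is_responsive, task):
    """Run `task` on the first responsive node that holds a replica.

    Returns (node, result). Later replica holders are tried only when
    earlier ones do not respond, so the task executes exactly once.
    """
    for node in replica_nodes:
        if is_responsive(node):
            return node, task(node)
    raise RuntimeError("no responsive node holds a replica of this data")


executions = []  # records every node the task actually ran on


def count_words(node):
    executions.append(node)
    return f"word count computed on {node}"


replica_nodes = ["dn1", "dn2", "dn3"]  # hypothetical nodes holding the block
failed = {"dn1"}                       # dn1 is not responding

node, result = run_on_one_replica(
    replica_nodes,
    is_responsive=lambda n: n not in failed,
    task=count_words,
)
print(node, result)
print(executions)
```

With `dn1` marked as failed, the task runs once on `dn2` and never touches `dn3`, so the work is not duplicated across replicas.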

