Jan 11 in Big Data | Hadoop
Q: What is the meaning of term Data Locality in Hadoop?

1 Answer

Jan 11

In a Big Data system, the size of the data is huge, so it is expensive to move it across the network. Instead, Hadoop tries to move the computation closer to the data.

So the data remains local to the location where it was stored, and the computation tasks are moved to the DataNodes that hold that data locally.


Hadoop follows these rules for Data Locality optimization:

1. Hadoop first tries to schedule the task on a node that has the HDFS block on its local disk.

2. If that cannot be done, Hadoop tries to schedule the task on a node in the same rack as a node that holds the data.

3. If this also cannot be done, Hadoop schedules the task on a node in a different rack, and the data is copied over the network.

This approach works well with Hadoop's default replication factor of 3, since replicas are spread across nodes and racks, giving the scheduler several candidate locations.
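The preference order above can be sketched as a small classifier that, given where the replicas live and the cluster's rack topology, reports the locality level a candidate node would achieve. This is an illustrative sketch only: the node names, rack names, and the `localityLevel` helper are assumptions for the example, not real Hadoop APIs.

```java
import java.util.Map;
import java.util.Set;

public class LocalityDemo {
    // Hypothetical helper mirroring Hadoop's locality preference:
    // NODE_LOCAL (rule 1) > RACK_LOCAL (rule 2) > OFF_RACK (rule 3).
    static String localityLevel(String candidate,
                                Set<String> replicaNodes,
                                Map<String, String> rackOf) {
        if (replicaNodes.contains(candidate)) {
            return "NODE_LOCAL";     // rule 1: a replica is on this node's disk
        }
        String rack = rackOf.get(candidate);
        for (String node : replicaNodes) {
            if (rack.equals(rackOf.get(node))) {
                return "RACK_LOCAL"; // rule 2: a replica is in the same rack
            }
        }
        return "OFF_RACK";           // rule 3: data must cross racks
    }

    public static void main(String[] args) {
        // Illustrative topology: two racks, two nodes each.
        Map<String, String> rackOf = Map.of(
            "node1", "rack1", "node2", "rack1",
            "node3", "rack2", "node4", "rack2");
        // Replicas of the block live on node1 and node3.
        Set<String> replicas = Set.of("node1", "node3");

        System.out.println(localityLevel("node1", replicas, rackOf)); // NODE_LOCAL
        System.out.println(localityLevel("node2", replicas, rackOf)); // RACK_LOCAL
    }
}
```

With replication factor 3 (two replicas in one rack, the third in another), most tasks can be placed at `NODE_LOCAL` or `RACK_LOCAL`, and `OFF_RACK` becomes the rare fallback.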

