
Explain the partitioning, shuffle and sort phases in Hadoop MapReduce

1 Answer


Shuffle Phase - Once the first map tasks have completed, the nodes may still be running other map tasks while they start transferring the finished intermediate outputs to the reducers that need them. This transfer of intermediate map output to the reducers is referred to as shuffling.
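
As a rough illustration of the shuffle, the sketch below simulates it in plain Python rather than with the Hadoop API. The function name `shuffle`, the sample word-count outputs, and the use of `zlib.crc32` as a stand-in hash are all assumptions made for this example.

```python
import zlib
from collections import defaultdict

def shuffle(map_outputs, num_reducers):
    """Route every (key, value) pair from every map task to one reducer."""
    reducer_inputs = defaultdict(list)
    for task_output in map_outputs:
        for key, value in task_output:
            # Stand-in for the partitioner: a deterministic hash of the key.
            reducer = zlib.crc32(key.encode("utf-8")) % num_reducers
            reducer_inputs[reducer].append((key, value))
    return dict(reducer_inputs)

# Hypothetical outputs of two map tasks in a word-count job.
map_outputs = [
    [("apple", 1), ("pear", 1)],
    [("apple", 1), ("kiwi", 1)],
]
reducer_inputs = shuffle(map_outputs, num_reducers=2)
```

Both `("apple", 1)` pairs end up in the same reducer's input even though they came from different map tasks, which is the point of the shuffle.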

 

Sort Phase - Hadoop MapReduce automatically sorts the intermediate keys on each node before they are passed to the reducer, so every reducer receives its input ordered by key.
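
The effect of the sort on one reducer's input can be sketched in plain Python. This is only a simulation, not Hadoop code; the helper name `sort_and_group` and the sample pairs are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def sort_and_group(pairs):
    """Sort pairs by key, then collect the values of each key together."""
    ordered = sorted(pairs, key=itemgetter(0))
    return [
        (key, [value for _, value in group])
        for key, group in groupby(ordered, key=itemgetter(0))
    ]

# Hypothetical pairs fetched by one reducer, in arrival order.
fetched = [("pear", 1), ("apple", 1), ("kiwi", 1), ("apple", 1)]
grouped = sort_and_group(fetched)
# grouped == [("apple", [1, 1]), ("kiwi", [1]), ("pear", [1])]
```

Because equal keys sit next to each other after sorting, the reducer can process one key and all of its values at a time.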

 

Partitioning Phase - Partitioning is the process that determines which reducer instance receives each intermediate key and its values. The destination partition for a given key is the same regardless of which mapper instance generated it.
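
Hadoop's default `HashPartitioner` derives the partition from the key's hash code modulo the number of reduce tasks. A minimal Python sketch of the same idea follows; the function name `get_partition` is hypothetical, and `zlib.crc32` is an assumed stand-in for Java's `hashCode()`.

```python
import zlib

def get_partition(key, num_reduce_tasks):
    """Return the reducer index for a key, in the range [0, num_reduce_tasks)."""
    # zlib.crc32 stands in for Java's key.hashCode(); unlike Python's
    # built-in hash() for strings, it is stable across processes.
    return zlib.crc32(key.encode("utf-8")) % num_reduce_tasks

# The same key emitted by two different mappers maps to the same partition.
from_mapper_a = get_partition("apple", 4)
from_mapper_b = get_partition("apple", 4)
```

Since the result depends only on the key and the number of reduce tasks, every mapper makes the same routing decision without any coordination.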

 
