Q:
What do you mean by shuffling and sorting in MapReduce?

1 Answer

Shuffling and sorting take place on the output of a map task once that task has completed. In Hadoop the shuffle and sort phases occur simultaneously: map outputs are merged while they are being fetched.

Shuffling: the process of transferring data from the mappers to the reducers, i.e., the process by which the system sorts the key-value output of the map tasks and transfers it to the reducers.

The shuffle phase is therefore necessary for the reducers; without it they would have no input. Shuffling can start before the whole map phase has finished, as soon as individual map tasks complete, which saves time and lets the job finish sooner.
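
To make the routing step concrete, here is a minimal plain-Python sketch of a shuffle. It is not Hadoop code: the function names partition and shuffle, the sample data, and the use of zlib.crc32 as a stable hash are all assumptions made for illustration. The only idea it demonstrates is that every pair with the same key ends up at the same reducer.

```python
import zlib
from collections import defaultdict


def partition(key: str, num_reducers: int) -> int:
    # Hypothetical stand-in for a hash partitioner:
    # a stable hash of the key, modulo the number of reducers.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(map_outputs, num_reducers):
    """Route every (key, value) pair from every map task to one reducer."""
    reducer_inputs = defaultdict(list)
    for task_output in map_outputs:          # one list per finished map task
        for key, value in task_output:
            reducer_inputs[partition(key, num_reducers)].append((key, value))
    return dict(reducer_inputs)


# Output of two hypothetical map tasks in a word-count job.
map_outputs = [
    [("apple", 1), ("pear", 1)],
    [("apple", 1), ("fig", 1)],
]
reducer_inputs = shuffle(map_outputs, num_reducers=2)
```

Because the reducer index depends only on the key, both ("apple", 1) pairs reach the same reducer even though they came from different map tasks.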

Sorting: the mappers generate intermediate key-value pairs. Before a reducer starts, the MapReduce framework sorts these key-value pairs by key.

Sorting helps the reducer easily distinguish when a new reduce call should begin: a new call starts whenever the next key in the sorted input differs from the previous one. This saves time for the reducer.
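
The effect of sorting on the reducer can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the framework's implementation: sort_and_reduce and sum_values are hypothetical names, and the whole input is sorted in memory for simplicity.

```python
from itertools import groupby
from operator import itemgetter


def sort_and_reduce(pairs, reduce_fn):
    """Sort pairs by key, then call reduce_fn once per run of equal keys."""
    ordered = sorted(pairs, key=itemgetter(0))
    results = []
    # groupby starts a new group exactly when the key changes, which only
    # yields one group per key because the input is already sorted.
    for key, group in groupby(ordered, key=itemgetter(0)):
        values = [value for _, value in group]
        results.append((key, reduce_fn(key, values)))
    return results


def sum_values(key, values):
    # Hypothetical word-count style reducer.
    return sum(values)


pairs = [("pear", 1), ("apple", 1), ("fig", 1), ("apple", 1)]
counts = sort_and_reduce(pairs, sum_values)
# counts == [("apple", 2), ("fig", 1), ("pear", 1)]
```

Without the sort, the two "apple" pairs would not be adjacent and a single pass could not tell where one key's values end and the next key's begin.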

Shuffling and sorting are not performed at all if you specify zero reducers (setNumReduceTasks(0)). In that case the map output is written out directly as the final result of the job.
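
The difference between a map-only job and a job with reducers can be sketched as below. This is a plain-Python simulation with hypothetical names (run_job, tokenize, sum_values), not the Hadoop API, and it collapses all reducers into one for simplicity.

```python
from itertools import groupby
from operator import itemgetter


def run_job(records, map_fn, reduce_fn, num_reducers):
    mapped = [pair for record in records for pair in map_fn(record)]
    if num_reducers == 0:
        # Map-only job: no shuffle and no sort; map output is the final output.
        return mapped
    # Simplification (assumption): treat all reducers as a single reducer.
    ordered = sorted(mapped, key=itemgetter(0))
    return [
        (key, reduce_fn(key, [value for _, value in group]))
        for key, group in groupby(ordered, key=itemgetter(0))
    ]


def tokenize(line):
    # Hypothetical mapper: emit (word, 1) for each word in the line.
    return [(word, 1) for word in line.split()]


def sum_values(key, values):
    return sum(values)


records = ["b a", "a"]
map_only = run_job(records, tokenize, sum_values, num_reducers=0)
with_reduce = run_job(records, tokenize, sum_values, num_reducers=1)
```

In the map-only run the pairs come back unsorted and ungrouped, in the order the mapper produced them; in the second run they are sorted by key and aggregated.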
