How many Reducers run for a MapReduce job in Hadoop?

1 Answer



The Reducer takes the set of intermediate key-value pairs produced by the mappers as its input and runs a reduce function on each key's group of values to generate the output. The output of the reducer is the final output of the job, and it is stored in HDFS. Usually the reducer performs an aggregation or summation style of computation.
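As a minimal sketch of such a summing reducer (the class name SumReducer and the Text/IntWritable types are assumptions for illustration, not part of the question):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Aggregate all intermediate values that share the same key.
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        result.set(sum);
        // The reducer's output is the job's final output, written to HDFS.
        context.write(key, result);
    }
}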

The user sets the number of reducers for the job with Job.setNumReduceTasks(int). The right number of reducers is given by the formula:

0.95 or 1.75 multiplied by (<no. of nodes> * <no. of maximum containers per node>).

With 0.95, all the reducers can launch immediately and start transferring map outputs as the maps finish.

With 1.75, the faster nodes finish their first round of reduces and then launch a second wave of reduces.
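A rough sketch of how the 0.95 heuristic could be applied in a job driver is shown below; the node and container counts are assumed example values, not figures from the answer:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ReducerCountExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "reducer-count-example");

        int nodes = 10;             // assumed cluster size
        int containersPerNode = 8;  // assumed maximum containers per node

        // Apply the 0.95 heuristic so all reducers can launch in one wave.
        int numReducers = (int) (0.95 * nodes * containersPerNode);
        job.setNumReduceTasks(numReducers);
    }
}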

Increasing the number of reducers:

Increases framework overhead

Improves load balancing

Lowers the cost of failures
