What is Hadoop MapReduce?

1 Answer


MapReduce is the data processing layer of Hadoop. It is a framework for writing applications that process large volumes of data stored in HDFS. It processes that data in parallel by dividing a job into a set of independent tasks (sub-jobs). In Hadoop, MapReduce breaks the processing into two phases: Map and Reduce.

Map: This is the first phase of processing, and it is where we usually put the complex logic, business rules, and costly per-record code. The map function takes a set of input data and converts it into another set of data, breaking individual elements into tuples (key-value pairs).

Reduce: This is the second phase of processing, and it is where we usually specify lightweight processing such as aggregation or summation. Reduce takes the output of the map phase as its input, combines the tuples (key-value pairs) that share the same key, and produces a new value for each key.
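
To make the two phases concrete, here is a minimal word-count sketch in plain Python. It does not use the Hadoop API and runs in a single process; it only imitates the data flow described above. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this illustration, and the `shuffle` step stands in for the grouping by key that the framework performs between map and reduce.

```python
from collections import defaultdict


def map_phase(line):
    """Map: break one input line into (word, 1) key-value pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values by key (done by the framework in a real job)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single value."""
    return key, sum(values)


def word_count(lines):
    # Each line can be mapped independently, which is what allows
    # a real cluster to run the map tasks in parallel.
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["Hadoop stores data", "MapReduce processes data"])
print(counts)
# {'hadoop': 1, 'stores': 1, 'data': 2, 'mapreduce': 1, 'processes': 1}
```

In a real Hadoop job the same logic would normally be written as Mapper and Reducer classes, and the input lines would come from files in HDFS rather than from an in-memory list.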
