0 votes
in Hadoop by
What do you know about MapReduce and how does it work?

1 Answer

0 votes
by

Hadoop MapReduce is a framework that is used to process the large data sets in parallel in a Hadoop cluster. Data analysis makes use of a two-step map and reduces the process. During the map phase in MapReduce, it will count the number of words in each document, while in the reduce phase it aggregates the data based on the document spanning the entire collection. During the map phase, the input data will be divided into splits for the purpose of analysis by map tasks that will be running in parallel across the Hadoop framework.

...