Jan 8 in Big Data | Hadoop

What is Mapper in Hadoop MapReduce?

1 Answer


A Mapper in Hadoop takes each record produced by the RecordReader as input, processes it, and emits zero or more key-value pairs. The output pairs can differ entirely from the input pair, in both key type and value type. The mapper output is called intermediate output and is written to the local disk of the node running the map task. It is not stored on HDFS because it is temporary data, and writing it to HDFS would replicate it unnecessarily.
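As a rough illustration of the record-to-pairs step described above, here is a minimal word-count sketch in plain Java. It deliberately avoids the Hadoop classes: `MapperSketch` and its `map` method are hypothetical names, and the (byte offset, line text) input pair is an assumption modelled on what a line-oriented RecordReader typically supplies.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for a word-count mapper; no Hadoop classes are used.
class MapperSketch {
    // Input pair:   (byte offset of the line, line text).
    // Output pairs: (word, 1), a different key and value type from the input.
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {          // a blank line emits nothing
                out.add(new SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // One input record can produce several intermediate pairs.
        System.out.println(map(0L, "big data big"));
    }
}
```

The offset key is ignored here, which is common in word count: the mapper is free to discard the input key and emit pairs of a completely different shape.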

Before the mapper output is written to the local disk, it is partitioned on the basis of the key and then sorted by key within each partition. Partitioning ensures that all values for a given key go to the same reducer. A Mapper only understands key-value pairs, so the data must be converted into key-value pairs before it is passed to the mapper. The InputSplit defines the chunk of input assigned to a mapper, and the RecordReader converts that chunk into key-value pairs.
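The partition-then-sort step can be sketched in plain Java as follows. The `partitionFor` arithmetic mirrors Hadoop's default HashPartitioner (mask the sign bit of the hash, then take the modulus by the number of reducers). Everything else is an assumption for illustration: `PartitionSketch` and `partitionAndSort` are hypothetical names, `String.hashCode()` stands in for the hash of a Hadoop key type, and collecting values into a list per key is a simplification of how sorted records are later presented to the reducer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Plain-Java sketch of partitioning and sorting mapper output; no Hadoop classes are used.
class PartitionSketch {
    // Non-negative hash modulo the reducer count, so equal keys always share a partition.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Routes each (key, 1) pair to its partition; TreeMap keeps keys sorted within a partition.
    static List<TreeMap<String, List<Integer>>> partitionAndSort(String[] keys, int numReducers) {
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            partitions.add(new TreeMap<>());
        }
        for (String key : keys) {
            partitions.get(partitionFor(key, numReducers))
                      .computeIfAbsent(key, k -> new ArrayList<>())
                      .add(1);
        }
        return partitions;
    }

    public static void main(String[] args) {
        String[] keys = {"data", "big", "hadoop", "big"};
        List<TreeMap<String, List<Integer>>> parts = partitionAndSort(keys, 2);
        for (int i = 0; i < parts.size(); i++) {
            System.out.println("partition " + i + ": " + parts.get(i));
        }
    }
}
```

Because the partition depends only on the key, both occurrences of `big` land in the same partition, which is what lets a single reducer see every value for that key.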

