Q:
How does partitioning work in Hadoop?

1 Answer


Partitioning is the phase between the Map phase and the Reduce phase in a Hadoop MapReduce job. Because each partition is sent to exactly one Reducer, the number of partitions is the same as the number of Reducers.

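The Partitioner splits the output of the Map phase into distinct partitions by applying a condition to each key-value pair. This condition can be user-defined.

Partitions can be thought of as hash-based buckets: Hadoop's default HashPartitioner assigns each key to a partition from the key's hash code modulo the number of Reducers.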
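To make the hash-bucket idea concrete, here is a minimal sketch in plain Java, with no Hadoop dependency. The class name `HashPartitionSketch`, the method `partitionFor`, and the sample keys are assumptions made for illustration; only the formula is taken from Hadoop's default `HashPartitioner`.

```java
// Plain-Java sketch of hash partitioning; it does not depend on Hadoop.
// Class and method names are hypothetical.
final class HashPartitionSketch {

    // Same formula as Hadoop's default HashPartitioner: clear the sign bit
    // so the value is non-negative, then take the remainder.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3; // assumed number of Reducers
        for (String key : new String[] {"maths", "physics", "chemistry", "maths"}) {
            System.out.println(key + " -> partition " + partitionFor(key, numReduceTasks));
        }
    }
}
```

The same key always lands in the same partition, which is why all values for a key reach the same Reducer.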

 

For example, suppose we have to find the student with the maximum marks for each gender in each subject. The Map function first emits each record keyed by gender. Once mapping is done, the result is passed to the Partitioner, which assigns each record to a partition on the basis of its subject, so that each subject is handled by a different Reducer. Each Reducer takes its input from one partition and finds the student with the highest marks for each gender in that subject.
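The flow in this example can be simulated in plain Java. This is only a sketch under stated assumptions: the class `SubjectPartitionSketch`, the `Score` type, and the sample records are all made up for illustration, and the code runs in memory rather than on Hadoop. In a real job the partition logic would go into a subclass of `org.apache.hadoop.mapreduce.Partitioner` that overrides `getPartition(key, value, numPartitions)`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of the example above; every name here is hypothetical.
final class SubjectPartitionSketch {

    // One mapped record: the map key is the gender, the rest is the value.
    static final class Score {
        final String gender;
        final String subject;
        final String student;
        final int marks;

        Score(String gender, String subject, String student, int marks) {
            this.gender = gender;
            this.subject = subject;
            this.student = student;
            this.marks = marks;
        }
    }

    // Partition step: pick the Reducer index from the subject, not from the gender key.
    static int getPartition(Score record, int numReduceTasks) {
        return (record.subject.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Shuffle step: route every mapped record to the list for its partition.
    static List<List<Score>> partition(List<Score> mapped, int numReduceTasks) {
        List<List<Score>> partitions = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            partitions.add(new ArrayList<>());
        }
        for (Score record : mapped) {
            partitions.get(getPartition(record, numReduceTasks)).add(record);
        }
        return partitions;
    }

    // Reduce step for one partition: keep the top scorer per subject and gender.
    // Grouping by subject as well keeps the result correct if two subjects
    // hash to the same partition.
    static Map<String, Score> reduce(List<Score> partition) {
        Map<String, Score> best = new HashMap<>();
        for (Score record : partition) {
            String group = record.subject + "/" + record.gender;
            Score current = best.get(group);
            if (current == null || record.marks > current.marks) {
                best.put(group, record);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Score> mapped = new ArrayList<>(); // made-up sample records
        mapped.add(new Score("F", "maths", "Asha", 91));
        mapped.add(new Score("M", "maths", "Ravi", 84));
        mapped.add(new Score("F", "physics", "Meena", 77));
        mapped.add(new Score("F", "maths", "Priya", 88));

        for (List<Score> part : partition(mapped, 2)) {
            for (Map.Entry<String, Score> e : reduce(part).entrySet()) {
                System.out.println(e.getKey() + " -> " + e.getValue().student);
            }
        }
    }
}
```

Hashing the subject guarantees that all records for one subject reach the same Reducer, but it does not guarantee that two subjects never share a Reducer. A job that needs exactly one Reducer per subject would have to map each subject to its own index explicitly.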

