What is a Shuffle operation in Spark?

1 Answer

A shuffle operation in Spark re-distributes data across partitions.

It is a costly and complex operation because it involves disk I/O, data serialization, and network I/O.

In general, a single task in Spark operates on elements in one partition. To execute a shuffle, Spark must run operations on all elements of all partitions, which is why it is also called an all-to-all operation. Transformations such as repartition, groupByKey, reduceByKey, and join can trigger a shuffle.
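The all-to-all nature of a shuffle can be illustrated with a minimal sketch in plain Python (no Spark; the function names here are hypothetical, not Spark's API). Each input partition hashes the key of every record and routes it to the output partition that owns that hash, so every input partition potentially sends data to every output partition:

```python
def hash_partition(key, num_partitions):
    # Hash partitioner: decides which output partition owns a key.
    return hash(key) % num_partitions

def shuffle(input_partitions, num_output_partitions):
    # All-to-all exchange: every input partition sends each of its
    # (key, value) records to the output partition owning that key.
    output = [[] for _ in range(num_output_partitions)]
    for partition in input_partitions:
        for key, value in partition:
            target = hash_partition(key, num_output_partitions)
            output[target].append((key, value))
    return output

# Two input partitions; after the shuffle, all records sharing a key
# land in the same output partition.
parts = shuffle([[(1, 'a'), (2, 'b')], [(1, 'c'), (3, 'd')]], 2)
```

In real Spark the routed records are serialized, spilled to disk, and pulled over the network by the tasks of the next stage, which is what makes the operation expensive.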
