What is a Shuffle operation in Spark?

1 Answer

A shuffle operation in Spark re-distributes data across partitions.

It is a costly and complex operation because it involves disk I/O, data serialization, and network I/O.

In general, a single task in Spark operates on elements in one partition. To execute a shuffle, Spark must run operations on all elements of all partitions, which is why it is also called an all-to-all operation. Transformations such as repartition, groupByKey, reduceByKey, and join can trigger a shuffle.
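The all-to-all nature of a shuffle can be illustrated with a minimal sketch in plain Python (no Spark; the function names here are hypothetical, not Spark's API). Each input partition hashes the key of every record and routes it to the output partition that owns that hash, so every input partition potentially sends data to every output partition:

```python
def hash_partition(key, num_partitions):
    # Hash partitioner: decides which output partition owns a key.
    return hash(key) % num_partitions

def shuffle(input_partitions, num_output_partitions):
    # All-to-all exchange: every input partition sends each of its
    # (key, value) records to the output partition owning that key.
    output = [[] for _ in range(num_output_partitions)]
    for partition in input_partitions:
        for key, value in partition:
            target = hash_partition(key, num_output_partitions)
            output[target].append((key, value))
    return output

# Two input partitions; after the shuffle, all records sharing a key
# land in the same output partition.
parts = shuffle([[(1, 'a'), (2, 'b')], [(1, 'c'), (3, 'd')]], 2)
```

In real Spark the routed records are serialized, spilled to disk, and pulled over the network by the tasks of the next stage, which is what makes the operation expensive.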
