
Jan 13 in Big Data | Hadoop

Q: What is a Shuffle operation in Spark?

1 Answer

Jan 13
A shuffle operation is used in Spark to re-distribute data across multiple partitions.

It is a costly and complex operation, because it typically involves serializing data and moving it across executors and machines over the network.

In general, a single task in Spark operates on the elements of one partition. To execute a shuffle, Spark must run an operation on all elements of all partitions, which is why it is also called an all-to-all operation.
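As a rough illustration of the idea (plain Python, not the actual Spark API), a hash-partitioned shuffle can be sketched like this: every input partition sends each of its key/value pairs to an output partition chosen by hashing the key, so every input partition may contribute to every output partition. The function name and data below are made up for the example.

```python
from collections import defaultdict

def hash_partition_shuffle(partitions, num_output_partitions):
    """Redistribute (key, value) pairs from the input partitions into
    num_output_partitions output partitions by hashing each key.
    Every input partition can feed every output partition (all-to-all)."""
    output = [defaultdict(list) for _ in range(num_output_partitions)]
    for part in partitions:                      # read every input partition
        for key, value in part:                  # route every element
            dest = hash(key) % num_output_partitions
            output[dest][key].append(value)
    return [dict(p) for p in output]

# Two input partitions with integer keys (integer hashes are deterministic)
parts = [[(0, "x"), (1, "y")], [(2, "z"), (1, "w")]]
shuffled = hash_partition_shuffle(parts, 2)
# After the shuffle, all values for a given key live in one output
# partition, e.g. both values for key 1 end up together.
```

Note that after the shuffle, all elements with the same key are co-located in a single output partition, which is exactly what key-based operations such as `reduceByKey` or `groupByKey` rely on, and why they trigger a shuffle in Spark.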
