Explain partitions in Apache Spark

Question

What are partitions in Apache Spark, and what are they used for?

1 Answer

SakshiSharma · Answer 1 · 2020-03-14T09:01:03+0000

As the name indicates, a partition is a smaller, logical division of a dataset, similar to a 'split' in MapReduce. Dividing the data into logical units lets Spark work on the pieces in parallel, which speeds up processing. Spark manages partitions to minimize network traffic between executors, and it tries to read data into an RDD from nodes close to where that data is stored. A partition can be thought of as one chunk of a large distributed dataset, and controlling how the data is partitioned helps optimize operations that work on those chunks. Every RDD in Spark is divided into partitions, and each partition is processed by one task.
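To make the idea concrete, here is a minimal plain-Python sketch of hash partitioning. It is only an illustration, not Spark's actual implementation: the helper names `hash_partition` and `sum_by_key` and the sample records are made up for this example. It shows that records with the same key land in the same chunk, so each chunk can be processed on its own without exchanging data with the others.

```python
from collections import defaultdict

def hash_partition(records, num_partitions):
    """Hypothetical helper: assign each (key, value) record to a partition by hashing its key."""
    partitions = defaultdict(list)
    for key, value in records:
        # The same key always maps to the same partition index.
        index = hash(key) % num_partitions
        partitions[index].append((key, value))
    # Return one list per partition, including empty ones.
    return [partitions[i] for i in range(num_partitions)]

def sum_by_key(partition):
    """Hypothetical helper: aggregate values per key inside a single partition."""
    totals = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return dict(totals)

# Made-up sample data: (key, value) pairs with integer keys.
records = [(1, 10), (2, 20), (3, 30), (1, 5), (2, 1), (4, 7)]

partitions = hash_partition(records, num_partitions=2)

# Each partition is processed independently, the way one Spark task handles one partition.
per_partition_totals = [sum_by_key(p) for p in partitions]

print(partitions)
print(per_partition_totals)
```

In PySpark itself, the comparable calls on an RDD are `getNumPartitions()` to count partitions, `glom()` to look at the contents of each partition, and `partitionBy(n)` to hash-partition a key-value RDD into `n` partitions.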
