As the name suggests, a partition is a smaller, logical division of data, similar to a 'split' in MapReduce. Partitioning divides the data into logical units that can be processed in parallel, which speeds up processing. Spark manages partitions to minimize network traffic between executors, and it reads data into an RDD from nodes close to where that data is stored. A partition can also be viewed as one chunk of a large distributed data set, and the way data is partitioned can be tuned to optimize the operations that run on those chunks. Every RDD in Spark is partitioned, and every operation on an RDD is carried out on its partitions.
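To make the idea concrete, here is a minimal sketch in plain Python (not Spark code) of how key-based partitioning groups records into chunks that can then be processed independently. The function names `partition_index`, `hash_partition` and `sum_by_key_per_partition` are hypothetical, and using `zlib.crc32` as the hash is an assumption made only to keep the output deterministic; Spark's own partitioners use a different hash. In PySpark the corresponding real calls would be roughly `rdd.partitionBy(n)` on a pair RDD, `rdd.mapPartitions(f)`, and `rdd.getNumPartitions()`.

```python
import zlib
from collections import defaultdict


def partition_index(key, num_partitions):
    # Stand-in for a hash partitioner: hash the key, then take it
    # modulo the number of partitions. (crc32 is an assumption chosen
    # for determinism; it is not the hash Spark uses.)
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def hash_partition(records, num_partitions):
    # Place each (key, value) record into exactly one partition.
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[partition_index(key, num_partitions)].append((key, value))
    return partitions


def sum_by_key_per_partition(partition):
    # Work done on a single partition, with no data from other partitions.
    totals = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
partitions = hash_partition(records, 3)

# Each partition is processed on its own; in a cluster this step
# would run in parallel on different executors.
per_partition_totals = [sum_by_key_per_partition(p) for p in partitions]

# Because all records for a key share a partition, merging the
# per-partition results needs no further combining per key.
merged = {}
for totals in per_partition_totals:
    merged.update(totals)
```

Since every record with the same key lands in the same partition, the per-key sums can be computed without moving data between partitions, which is the reason a good partitioning scheme reduces network traffic.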
© Copyright 2018-2020 www.madanswer.com. All rights reserved. Developed by Madanswer.