Spark provides a coalesce method to reduce the number of partitions in a DataFrame or RDD.
Suppose you want to read data from a CSV file into an RDD with four partitions.
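A minimal PySpark sketch of this step, assuming a local session and a hypothetical numbers.csv file containing one integer per line:

```python
from pyspark.sql import SparkSession

# Assumed setup: a local SparkSession; "numbers.csv" is a hypothetical file
# with one integer per line.
spark = SparkSession.builder.master("local[4]").appName("coalesce-demo").getOrCreate()
sc = spark.sparkContext

# textFile takes a minimum partition count; for a small file this typically
# yields exactly four partitions.
rdd = sc.textFile("numbers.csv", minPartitions=4).map(lambda line: int(line.strip()))
print(rdd.getNumPartitions())
```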
This is how a filter operation is performed to remove all the multiples of 10 from the data.
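Continuing the sketch above (variable names carry over), the filter keeps only the values that are not multiples of 10:

```python
# Drop every element that is a multiple of 10; partitions that held only
# such values end up empty.
filtered = rdd.filter(lambda x: x % 10 != 0)

# glom() returns the contents of each partition as a list, making empty
# partitions easy to spot.
print(filtered.glom().collect())
```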
After the filter, the RDD has some empty partitions, so it makes sense to reduce the number of partitions, which can be achieved with coalesce.
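Assuming two partitions are enough for the remaining data, the reduction itself is a single call:

```python
# coalesce merges existing partitions without a full shuffle (unlike
# repartition), so it is the cheaper way to shrink the partition count.
coalesced = filtered.coalesce(2)
print(coalesced.getNumPartitions())
```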
This is how the resultant RDD would look after applying coalesce.
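To inspect how the remaining elements are laid out across the coalesced partitions, glom() can be used again:

```python
# Each inner list is one partition of the coalesced RDD; the previously
# empty partitions have been merged away.
print(coalesced.glom().collect())
```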