What is the best way to minimize data transfers when working with Spark?

1 Answer


To write a fast and reliable Spark program, minimize data transfers and avoid unnecessary shuffles. Common ways to do this in Apache Spark include:

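  • Using broadcast variables: a broadcast variable sends a read-only copy of a small dataset to each executor once, so a join between a small and a large RDD can be done without shuffling the large one.
  • Using accumulators: accumulators let tasks running in parallel update a shared value (such as a counter or a sum), which the driver reads after the job has run.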
...