What do you know about PySpark serializers?

1 Answer

In PySpark, serialization is used for performance tuning. Because data is continuously sent or received over the network and written to disk or memory, it has to be serialized efficiently, and PySpark lets you choose the serializer. PySpark supports two types of serializers (a usage sketch follows the list):

  1. PickleSerializer: Serializes objects using Python's pickle protocol (class pyspark.serializers.PickleSerializer). It supports almost every Python object, but is comparatively slow.
  2. MarshalSerializer: Serializes objects using Python's marshal module (class pyspark.serializers.MarshalSerializer). It is faster than PickleSerializer, but it supports only a limited set of built-in types.
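
A minimal sketch of switching serializers, assuming a local Spark installation: the SparkContext constructor accepts a serializer argument, so a job that only handles built-in types can use MarshalSerializer. The app name "SerializerDemo" and the sample data are illustrative only.

  from pyspark import SparkContext
  from pyspark.serializers import MarshalSerializer

  # Use MarshalSerializer instead of the default PickleSerializer
  sc = SparkContext("local", "SerializerDemo", serializer=MarshalSerializer())

  # MarshalSerializer handles built-in types such as ints, strings, and lists
  print(sc.parallelize(range(10)).map(lambda x: x * 2).collect())
  sc.stop()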
