0 votes
in PySpark by
What do you understand by RDD Lineage?

1 Answer

0 votes
by

The RDD lineage is a procedure that is used to reconstruct the lost data partitions. The Spark does not hold up data replication in the memory. If any data is lost, we have to rebuild it using RDD lineage. This is the best use case as RDD always remembers how to construct from other datasets.

Related questions

0 votes
asked Mar 13, 2022 in PySpark by rajeshsharma
0 votes
asked Mar 13, 2022 in PySpark by rajeshsharma
...