What is the significance of Resilient Distributed Datasets in Spark?

1 Answer

sharadyadav1986 · Answer 1 · 2022-08-25T03:40:58+0000

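Resilient Distributed Datasets (RDDs) are the fundamental data structure of Apache Spark and are part of Spark Core. RDDs are immutable, fault-tolerant, distributed collections of objects that can be operated on in parallel. Each RDD is split into partitions, which can be processed on different nodes of a cluster. The fault tolerance comes from lineage: Spark records the transformations used to build an RDD, so a lost partition can be recomputed from them.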

RDDs are created either by transforming existing RDDs or by loading an external dataset from stable storage such as HDFS or HBase.
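To make these properties concrete, here is a minimal pure-Python sketch of the idea. It is a toy model, not the Spark API: `ToyRDD`, `from_collection`, and `compute_partition` are hypothetical names invented for illustration, and an in-memory list stands in for an external dataset such as a file on HDFS. It only aims to show partitioning, immutability, and how a partition can be rebuilt from lineage.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)  # frozen: instances cannot be modified after creation
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, NOT the Spark API)."""
    source: Optional[Tuple[tuple, ...]] = None     # partitions of a base dataset
    parent: Optional["ToyRDD"] = None              # lineage: the RDD this one came from
    fn: Optional[Callable[[tuple], tuple]] = None  # function applied to each parent partition

    @staticmethod
    def from_collection(data, num_partitions):
        """Stands in for loading an external dataset: split it into partitions."""
        items = list(data)
        size = -(-len(items) // num_partitions)  # ceiling division
        parts = tuple(tuple(items[i * size:(i + 1) * size])
                      for i in range(num_partitions))
        return ToyRDD(source=parts)

    @property
    def num_partitions(self):
        if self.parent is None:
            return len(self.source)
        return self.parent.num_partitions

    def map(self, f):
        """Transformation: returns a NEW ToyRDD and leaves this one unchanged."""
        return ToyRDD(parent=self, fn=lambda part: tuple(f(x) for x in part))

    def filter(self, pred):
        """Transformation: keeps only the elements for which pred is true."""
        return ToyRDD(parent=self, fn=lambda part: tuple(x for x in part if pred(x)))

    def compute_partition(self, i):
        """Build partition i by walking the lineage back to the base data.

        Partitions are independent, so a single lost partition could be
        rebuilt this way without touching the others.
        """
        if self.parent is None:
            return self.source[i]
        return self.fn(self.parent.compute_partition(i))

    def collect(self):
        """Gather all partitions into one list (a real cluster would do this in parallel)."""
        out = []
        for i in range(self.num_partitions):
            out.extend(self.compute_partition(i))
        return out


# Create a base dataset, then derive new datasets by transformation.
base = ToyRDD.from_collection(range(1, 9), num_partitions=4)
squares = base.map(lambda x: x * x)
even_squares = squares.filter(lambda x: x % 2 == 0)

result = even_squares.collect()                 # [4, 16, 36, 64]
rebuilt = even_squares.compute_partition(2)     # (36,), rebuilt from lineage alone
unchanged = base.collect()                      # [1, 2, 3, 4, 5, 6, 7, 8]
```

Note that `map` and `filter` never touch `base`; each call returns a new object that only remembers its parent and the function to apply. In this sketch that remembered chain is what makes recomputing one partition possible, which is the rough intuition behind the "resilient" in the name.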

Here is what the RDD architecture looks like:

[Figure: create-rdd]
