Answer: There are two ways to create an RDD:
Parallelizing an existing collection in the driver program, using SparkContext's 'parallelize' method:
    val data = Array(3, 6, 9)
    val rdd = sc.parallelize(data)
Loading a dataset from an external storage system such as HDFS or HBase.
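As a rough illustration of the second method, the sketch below reads a text file into an RDD with SparkContext's textFile. The application name, the local master setting, and the HDFS path are all hypothetical placeholders, not values from the question. Reading from HBase usually goes through a Hadoop InputFormat (for example via sc.newAPIHadoopRDD) and is not shown here.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddFromExternalStorage {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master and this app name are used only for illustration.
    val conf = new SparkConf()
      .setAppName("rdd-from-external-storage")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Hypothetical path: replace with a file that exists in your cluster.
    // textFile returns an RDD[String] with one element per line of the file.
    val lines = sc.textFile("hdfs://namenode:8020/data/input.txt")

    // count() is an action, so this is where the file is actually read.
    println(s"Number of lines: ${lines.count()}")

    sc.stop()
  }
}
```

In spark-shell, sc already exists, so only the textFile and count lines are needed.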