We can create an RDD in Spark in the following two ways:
1. Internal: We can parallelize an existing collection of data within our Spark driver program and create an RDD out of it.
2. External: We can also create an RDD by referencing a dataset in an external data source such as AWS S3, HDFS, or HBase.