What is the difference between Apache Spark and Apache Hadoop MapReduce?

Question

What is the difference between Apache Spark and Apache Hadoop MapReduce?

1 Answer

SakshiSharma · Answer 1 · 2020-01-13T09:43:05+0000

Some of the main differences between Apache Spark and Hadoop MapReduce are as follows:

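1. Speed: Apache Spark can be roughly 10x to 100x faster than Hadoop MapReduce for some workloads, mainly because it processes data in memory. The actual speedup depends on the job and on how much of the data fits in memory.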

2. Memory: Apache Spark keeps intermediate data in memory where possible, whereas Hadoop MapReduce writes intermediate results to disk between stages.

3. RDD: Spark uses Resilient Distributed Datasets (RDDs), which provide fault tolerance by tracking how each dataset was derived, so a lost partition can be recomputed. Apache Hadoop, in contrast, achieves fault tolerance by replicating data in multiple copies.

4. Streaming: Apache Spark supports stream processing with very little administrative overhead. This makes it much easier to use than Hadoop MapReduce for real-time stream processing.

5. API: Spark provides a versatile API that works with multiple data sources and with several languages (Scala, Java, Python, and R). It is more extensible than the API provided by Hadoop MapReduce.
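
To make point 3 a bit more concrete, here is a minimal sketch of the idea behind lineage-based recovery. It is a toy, single-process model in plain Python, not Spark's actual API: the `ToyRDD` class, `from_chunks`, and `lose_partition` are hypothetical names made up for this illustration. Each dataset remembers how to recompute a partition from its parent, so a lost partition can be rebuilt without keeping extra copies of the data.

```python
# Toy model of lineage-based recovery. This is NOT Spark's API;
# ToyRDD, from_chunks and lose_partition are hypothetical names.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyRDD:
    """A single-process stand-in for a partitioned dataset."""
    num_partitions: int
    # Lineage: a function that (re)computes one partition by index.
    compute: Callable[[int], List[int]]
    # In-memory cache of partitions that were already computed.
    cache: Dict[int, List[int]] = field(default_factory=dict)

    def map(self, fn: Callable[[int], int]) -> "ToyRDD":
        # The child only records *how* to derive its data from the parent.
        parent = self
        return ToyRDD(
            self.num_partitions,
            lambda i: [fn(x) for x in parent.partition(i)],
        )

    def partition(self, i: int) -> List[int]:
        # Serve from memory if present, otherwise recompute via lineage.
        if i not in self.cache:
            self.cache[i] = self.compute(i)
        return self.cache[i]

    def collect(self) -> List[int]:
        return [x for i in range(self.num_partitions) for x in self.partition(i)]

    def lose_partition(self, i: int) -> None:
        # Simulate losing the in-memory copy of one partition.
        self.cache.pop(i, None)


def from_chunks(chunks: List[List[int]]) -> ToyRDD:
    """Build a source dataset from already partitioned data."""
    return ToyRDD(len(chunks), lambda i: list(chunks[i]))


source = from_chunks([[1, 2], [3, 4]])
squared = source.map(lambda x: x * x)

before = squared.collect()   # computes and caches both partitions
squared.lose_partition(1)    # drop the cached copy of partition 1
after = squared.collect()    # partition 1 is rebuilt from its lineage
```

In this sketch `before` and `after` hold the same values even though one partition was dropped in between. Replication, as used by Hadoop, would instead keep several stored copies of the data and read from a surviving copy.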
