Hadoop is an open-source framework for big data.
It stores and processes large datasets across clusters of commodity machines. The important components of Hadoop are:
- Hadoop Distributed File System (HDFS)
- Yet Another Resource Negotiator (YARN)
- Hadoop Common
HDFS
- It is a distributed file system that stores data as blocks spread across the nodes of the cluster.
- It is the primary storage layer in Hadoop.
- It scales horizontally: capacity grows by adding more nodes.
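
A minimal sketch of how a client writes to and reads from HDFS through the Java FileSystem API that ships with the Hadoop client libraries. The NameNode address (`hdfs://namenode:8020`) and the path `/user/demo/hello.txt` are placeholder assumptions; substitute your cluster's values.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode address -- replace with your cluster's fs.defaultFS
        conf.set("fs.defaultFS", "hdfs://namenode:8020");

        try (FileSystem fs = FileSystem.get(conf)) {
            Path path = new Path("/user/demo/hello.txt"); // hypothetical path

            // Write a small file into HDFS (overwrite if it already exists)
            try (FSDataOutputStream out = fs.create(path, true)) {
                out.write("Hello, HDFS".getBytes(StandardCharsets.UTF_8));
            }

            // Read the same file back
            try (FSDataInputStream in = fs.open(path);
                 BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
                System.out.println(reader.readLine());
            }
        }
    }
}
```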
YARN
- Responsible for managing and allocating the cluster's computational resources (CPU and memory) to applications.
- Comprises the ResourceManager, NodeManagers, and a per-application ApplicationMaster.
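
As a rough illustration of the ResourceManager's role, the sketch below uses the YarnClient API to ask it which NodeManagers are currently running and what resources each one offers. It assumes a reachable cluster whose yarn-site.xml is on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.NodeState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

import java.util.List;

public class YarnNodesExample {
    public static void main(String[] args) throws Exception {
        // YarnConfiguration picks up yarn-site.xml from the classpath
        Configuration conf = new YarnConfiguration();

        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(conf);
        yarnClient.start();

        // Ask the ResourceManager for all NodeManagers in RUNNING state
        List<NodeReport> nodes = yarnClient.getNodeReports(NodeState.RUNNING);
        for (NodeReport node : nodes) {
            System.out.println(node.getNodeId() + " capability=" + node.getCapability());
        }

        yarnClient.stop();
    }
}
```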
Hadoop Common
- It is a collection of shared Java libraries and utilities that the other Hadoop modules depend on.
Note: When a client such as Hive accesses HDFS, it uses the Java JAR files provided by Hadoop Common.
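
For instance, a small program that uses only classes shipped in hadoop-common, such as VersionInfo below, runs as long as the Common JARs are on the classpath; this is roughly the same dependency that tools like Hive pull in before they can talk to HDFS.

```java
import org.apache.hadoop.util.VersionInfo;

public class CommonVersion {
    public static void main(String[] args) {
        // VersionInfo ships with hadoop-common and reports the library's version
        System.out.println("Hadoop Common version: " + VersionInfo.getVersion());
    }
}
```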