What are the differences between an HDFS block and an InputSplit?

1 Answer


HDFS block: An HDFS block is the physical division of data. HDFS splits each file into fixed-size blocks and stores those blocks on the DataNodes.

InputSplit: An InputSplit in MapReduce is the logical division of the input files. It describes the portion of the input that a single mapper processes.

The InputSplit also controls the number of mappers, because one map task runs per split, and the split size can be set by the user. The HDFS block size, by contrast, comes from the cluster configuration: the default is 64 MB in Hadoop 1.x (128 MB in Hadoop 2.x and later). With 64 MB blocks, 1 GB of data occupies 1 GB / 64 MB = 16 blocks. If the user does not define the input split size, it defaults to the HDFS block size, so the same 1 GB input also yields 16 splits.
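The arithmetic above can be checked with a small sketch. It is plain Python, not Hadoop code: the helper names `compute_split_size` and `count_units` are made up for illustration. The formula mirrors the way Hadoop's `FileInputFormat` picks a split size, max(minSize, min(maxSize, blockSize)), and it assumes a single 1 GB file with 64 MB blocks; it ignores details such as the small slack Hadoop allows on the last split.

```python
MB = 1024 * 1024
GB = 1024 * MB

def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Pick a split size: the block size, clamped between min_size and max_size.

    With the defaults (no user-defined limits) this returns block_size.
    """
    return max(min_size, min(max_size, block_size))

def count_units(total_bytes, unit_size):
    """Number of blocks or splits needed to cover total_bytes (ceiling division)."""
    return (total_bytes + unit_size - 1) // unit_size

file_size = 1 * GB
block_size = 64 * MB  # assumed block size, as in the answer's example

# Physical view: how many HDFS blocks the file occupies.
blocks = count_units(file_size, block_size)

# Logical view, no user setting: split size falls back to the block size.
default_splits = count_units(file_size, compute_split_size(block_size))

# Logical view, user caps the split size at 32 MB: more splits, more mappers.
small_splits = count_units(
    file_size, compute_split_size(block_size, max_size=32 * MB))

# Logical view, user raises the minimum split size to 128 MB: fewer mappers.
large_splits = count_units(
    file_size, compute_split_size(block_size, min_size=128 * MB))

print(blocks, default_splits, small_splits, large_splits)
```

The number of blocks stays the same in every case; only the number of splits, and therefore the number of mappers, changes with the user-defined limits.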

...