Why is block size set to 128 MB in Hadoop HDFS?

1 Answer


A block is a contiguous location on the disk where data is stored. In general, a file system stores data as a collection of blocks. HDFS stores each file as a set of blocks and distributes them across the Hadoop cluster. In HDFS, the default block size is 128 MB, which we can configure as per our requirement. The block size is set to 128 MB for the following reasons:

To reduce disk seeks (I/O). The larger the block size, the fewer blocks a file has, so fewer disk seeks are needed, and each block can still be transferred within a reasonable time, with the blocks being processed in parallel.
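As a rough illustration of this trade-off, the sketch below estimates the block size at which seek time stays around 1% of transfer time. The figures used (a ~10 ms seek time and a ~100 MB/s transfer rate) are assumptions for illustration, not values from the answer above:

```java
// Back-of-the-envelope estimate: pick a block size large enough that the
// disk seek is a small fraction of the time spent actually transferring data.
public class BlockSizeEstimate {
    public static void main(String[] args) {
        double seekTimeMs = 10.0;          // assumed average disk seek time
        double transferRateMBs = 100.0;    // assumed sustained transfer rate
        double targetSeekOverhead = 0.01;  // keep seek time at ~1% of transfer time

        // Transfer time must be seekTime / overhead = 10 ms / 0.01 = 1 second.
        double transferTimeSec = (seekTimeMs / 1000.0) / targetSeekOverhead;
        double blockSizeMB = transferRateMBs * transferTimeSec;

        System.out.printf("Suggested block size: ~%.0f MB%n", blockSizeMB); // ~100 MB
    }
}
```

With these assumed numbers the estimate lands around 100 MB, which is why block sizes in the 64 MB to 128 MB range are a natural fit.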

HDFS holds huge data sets, i.e. terabytes and petabytes of data. If we used a 4 KB block size in HDFS, like the Linux file system does, we would end up with far too many blocks and therefore far too much metadata. Managing this huge number of blocks and their metadata would create enormous overhead and traffic, which is something we don't want. So the block size is set to 128 MB.
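A small sketch of the metadata argument, assuming a 1 TB file and the commonly cited rough figure of about 150 bytes of NameNode memory per block object (both are assumptions used only for illustration):

```java
// Compare how many blocks a 1 TB file needs at 4 KB vs. 128 MB block sizes,
// and roughly how much NameNode metadata those blocks would cost.
public class BlockCountEstimate {
    public static void main(String[] args) {
        long fileSizeBytes = 1L << 40;          // 1 TB file
        long smallBlock = 4L * 1024;            // 4 KB, like a local Linux file system
        long hdfsBlock = 128L * 1024 * 1024;    // 128 MB HDFS default
        long bytesPerBlockMeta = 150;           // assumed rough NameNode cost per block

        long blocksSmall = fileSizeBytes / smallBlock;   // 268,435,456 blocks
        long blocksHdfs = fileSizeBytes / hdfsBlock;     // 8,192 blocks

        System.out.println("4 KB blocks  : " + blocksSmall + " blocks, ~"
                + blocksSmall * bytesPerBlockMeta / (1024 * 1024 * 1024) + " GB of metadata");
        System.out.println("128 MB blocks: " + blocksHdfs + " blocks, ~"
                + blocksHdfs * bytesPerBlockMeta / 1024 + " KB of metadata");
    }
}
```

A single 1 TB file would need hundreds of millions of 4 KB blocks but only a few thousand 128 MB blocks, which is exactly the metadata explosion the answer describes.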

On the other hand, the block size can't be so large that the system ends up waiting a very long time for the last unit of data processing to finish its work.
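For completeness, here is a minimal sketch of how a Java client could override the block size at file-creation time via the standard dfs.blocksize property. The 256 MB value and the /tmp/example.txt path are just for illustration:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CustomBlockSize {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Override the default 128 MB block size; the value is in bytes.
        conf.setLong("dfs.blocksize", 256L * 1024 * 1024);   // 256 MB

        FileSystem fs = FileSystem.get(conf);
        // Files created by this client will use the overridden block size.
        try (FSDataOutputStream out = fs.create(new Path("/tmp/example.txt"))) {
            out.writeUTF("data written with a 256 MB block size");
        }
    }
}
```

The same dfs.blocksize property can also be set cluster-wide in hdfs-site.xml, so the 128 MB default is a starting point rather than a hard limit.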
