How is the splitting of files invoked in the Hadoop framework?
Input files store the data for a Hadoop MapReduce job, and these files typically reside in HDFS. InputFormat defines how the input files are split and read. It is responsible for creating InputSplits, each of which is a logical representation of a chunk of the data (a split references the data rather than containing it). InputFormat also supplies the RecordReader that divides each split into records, and the mapper then processes each record as a key-value pair. The Hadoop framework invokes the splitting of the files by calling the getSplits() method. This method belongs to the InputFormat class (for example, FileInputFormat or a subclass of it) that the user configures for the job.
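To make the splitting step concrete, here is a minimal sketch in plain Java of the kind of arithmetic a file-based getSplits() performs. It is a simplified model, not Hadoop's code: the class SplitSketch and its method signatures are hypothetical and use no Hadoop classes. The split-size formula and the 1.1 slop factor follow what FileInputFormat does as far as I understand it, so treat them as assumptions and check them against your Hadoop version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Hadoop-free model of how a file is cut into logical splits.
class SplitSketch {
    // Assumed slop factor: the last split may be up to 10% larger than
    // splitSize instead of producing a tiny trailing split.
    static final double SPLIT_SLOP = 1.1;

    // Assumed formula: block size, clamped between the min and max split size.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Returns {start offset, length} pairs. A split only describes a byte
    // range of the file; it does not hold the data itself.
    static List<long[]> getSplits(long fileLength, long splitSize) {
        List<long[]> splits = new ArrayList<>();
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(new long[] {fileLength - remaining, splitSize});
            remaining -= splitSize;
        }
        if (remaining != 0) {
            // Whatever is left becomes the final split.
            splits.add(new long[] {fileLength - remaining, remaining});
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Example values (assumptions): 128 MB block, no min/max restriction.
        long splitSize = computeSplitSize(128 * mb, 1, Long.MAX_VALUE);
        for (long[] s : getSplits(300 * mb, splitSize)) {
            System.out.println("start=" + s[0] + " length=" + s[1]);
        }
    }
}
```

With these example values, a 300 MB file yields three splits: two of 128 MB and a final one of 44 MB. In a real job the framework calls getSplits() on the configured InputFormat before any mapper starts, and each returned split is then assigned to one map task.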