Hadoop will make 5 splits, as follows (see the sketch after this list) -
- 1 split for the 64K file
- 2 splits for the 65MB file
- 2 splits for the 127MB file
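The counts follow from dividing each file along block boundaries. A minimal sketch of that arithmetic is below; it is illustrative only (not Hadoop source code) and assumes the default 64 MB block size implied by the question.

```java
// Illustrative sketch: split count per file is ceil(fileSize / blockSize),
// assuming the default 64 MB HDFS block size.
public class SplitCount {
    static final long BLOCK_SIZE = 64L * 1024 * 1024; // 64 MB

    // Number of input splits for one file (at least 1 for a non-empty file).
    static long splitsFor(long fileSizeBytes) {
        if (fileSizeBytes == 0) return 0;
        return (fileSizeBytes + BLOCK_SIZE - 1) / BLOCK_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(splitsFor(64L * 1024));         // 64 KB  -> 1 split
        System.out.println(splitsFor(65L * 1024 * 1024));  // 65 MB  -> 2 splits
        System.out.println(splitsFor(127L * 1024 * 1024)); // 127 MB -> 2 splits
        // Total: 1 + 2 + 2 = 5 splits
    }
}
```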
Suppose Hadoop spawned 100 tasks for a job and one of the tasks failed. What will Hadoop do?
It will restart the task on another TaskTracker, and only if the task fails more than four times (four is the default and can be changed) will Hadoop kill the job.
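A minimal sketch of how that limit can be changed, assuming the classic (MapReduce 1) JobConf API that matches the TaskTracker terminology above; the attempt counts chosen here are just examples:

```java
import org.apache.hadoop.mapred.JobConf;

public class AttemptConfig {
    public static void main(String[] args) {
        JobConf conf = new JobConf(AttemptConfig.class);
        // Allow up to 6 attempts per map/reduce task instead of the default 4;
        // only after a task exhausts its attempts does the whole job fail.
        conf.setMaxMapAttempts(6);
        conf.setMaxReduceAttempts(6);
        // The same settings can be made via the properties
        // mapred.map.max.attempts / mapred.reduce.max.attempts
        // (mapreduce.map.maxattempts / mapreduce.reduce.maxattempts in newer releases).
    }
}
```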