What is fsck?
fsck stands for File System Check. Hadoop HDFS uses the fsck (filesystem check) command to check for various inconsistencies and to report problems with the files in HDFS, for example, missing blocks for a file or under-replicated blocks. Unlike the traditional fsck utility for native file systems, this command does not correct the errors it detects.
Normally the NameNode automatically corrects most of the recoverable failures. By default, fsck ignores open files, but it provides an option to include all files during reporting. The HDFS fsck command is not a Hadoop shell command; it is run as bin/hdfs fsck. It can run on the whole file system or on a subset of files.
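As a rough illustration of the scope choices described above, the sketch below wraps two invocations in shell functions: one that checks the whole file system and one that checks a subtree and also reports open files. The function names are hypothetical, and HDFS_CMD is an assumed override variable that exists only in this sketch so the commands can be dry-run without a cluster.

```shell
# Command used to reach HDFS. HDFS_CMD is an assumed override for this sketch:
# set HDFS_CMD="echo hdfs" to print the commands instead of running them.
HDFS_CMD="${HDFS_CMD:-hdfs}"

# Check the whole file system, starting at the root.
fsck_all() {
  $HDFS_CMD fsck /
}

# Check only the files under the given path and also report
# files that are currently open for write.
fsck_subtree_with_open_files() {
  $HDFS_CMD fsck "$1" -openforwrite
}
```

With HDFS_CMD="echo hdfs", calling fsck_subtree_with_open_files /user/example (a made-up path) prints the command line rather than contacting the NameNode, which is a cheap way to review it first.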
Usage:
hdfs fsck <path>
[-list-corruptfileblocks |
[-move | -delete | -openforwrite]
[-files [-blocks [-locations | -racks]]]]
[-includeSnapshots]
<path>: Start checking from this path.
-move: Move corrupted files to /lost+found.
-delete: Delete corrupted files.
-openforwrite: Print out files opened for write.
-files: Print out the files being checked.
-files -blocks: Print out the block report.
-files -blocks -locations: Print out locations for every block.
-files -blocks -racks: Print out the network topology for data-node locations.
-includeSnapshots: Include snapshot data if the given path indicates a snapshottable directory or contains snapshottable directories.
-list-corruptfileblocks: Print the list of missing blocks and the files they belong to.
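The reporting options nest: -blocks only makes sense together with -files, and -locations or -racks only together with -blocks. A minimal sketch of that nesting is shown below. The helper name fsck_command and its detail-level keywords are hypothetical, invented for this example; the helper only prints a command line and never runs it.

```shell
# Hypothetical helper: print the fsck command for a chosen level of detail.
# Usage: fsck_command <path> [summary|files|blocks|locations|racks|corrupt]
fsck_command() {
  path="$1"
  level="${2:-summary}"
  case "$level" in
    summary)   opts="" ;;                            # overall health report only
    files)     opts="-files" ;;                      # list the checked files
    blocks)    opts="-files -blocks" ;;              # add the block report
    locations) opts="-files -blocks -locations" ;;   # add locations for every block
    racks)     opts="-files -blocks -racks" ;;       # add rack topology
    corrupt)   opts="-list-corruptfileblocks" ;;     # missing blocks and their files
    *) echo "unknown detail level: $level" >&2; return 1 ;;
  esac
  # Append the options only when there are any.
  echo "hdfs fsck $path${opts:+ $opts}"
}
```

For example, fsck_command /user/example locations prints hdfs fsck /user/example -files -blocks -locations, where /user/example is a made-up path. None of these levels include -move or -delete, so every printed command is a read-only report.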