How will you compare two HDFS files?

1 Answer


UNIX has a diff command for comparing two local files, but Hadoop has no equivalent command for files stored in HDFS. However, in a shell that supports process substitution (such as bash), the output of hadoop fs -cat can be passed to diff as follows:

diff <(hadoop fs -cat /path/to/file) <(hadoop fs -cat /path/to/file2)

If the goal is only to find out whether the two files are identical, without needing the exact differences, a checksum-based approach also works: get the checksum of each file (for example with hadoop fs -checksum) and compare the two values.
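The checksum comparison could be sketched roughly as below. This is only an illustration: hdfs_checksum and hdfs_same are hypothetical helper names, not Hadoop commands, and the sketch assumes that hadoop fs -checksum prints the checksum as the last whitespace-separated field of its output. HDFS file checksums also depend on settings such as block size, so two files written with different settings may report different checksums even when their bytes match; in that case the diff approach is safer.

```shell
# Sketch only: hdfs_checksum and hdfs_same are hypothetical helpers.

# Print the checksum reported by "hadoop fs -checksum" for one path.
# Assumes the checksum is the last field of the command's output.
hdfs_checksum() {
    hadoop fs -checksum "$1" | awk '{print $NF}'
}

# Return 0 if both checksums match, 1 if they differ,
# and 2 if either checksum could not be read.
hdfs_same() {
    sum_a=$(hdfs_checksum "$1")
    sum_b=$(hdfs_checksum "$2")
    [ -n "$sum_a" ] && [ -n "$sum_b" ] || return 2
    [ "$sum_a" = "$sum_b" ]
}

# Usage (the paths are placeholders):
#   if hdfs_same /path/to/file /path/to/file2; then
#       echo "checksums match"
#   else
#       echo "checksums differ or could not be read"
#   fi
```

Only the checksums travel over the network here, so this is cheaper than streaming both files through diff when the files are large.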
