UNIX has a diff command to compare two HDFS files but there is no diff command with Hadoop. However, redirections can be used in the shell with the diff command as follows-
diff < (hadoop fs -cat /path/to/file) < (hadoop fs -cat /path/to/file2)
If the goal is just to find whether the two files are similar or not without having to know the exact differences, then a checksum-based approach can also be followed to compare two files. Get the checksums for both files and compare them.