Comparing two directories
Sometimes it's useful to check whether two directories contain the same files, and whether the files that match are exactly the same.
Assume you have two directories, dir1 and dir2.
First, list the contents of each directory and sort the listings so they can be compared line by line:
find dir1 | sort > contents_dir1.txt
find dir2 | sort > contents_dir2.txt
(find can exclude patterns at this stage, for example with ! -name '<pattern>' or -prune, but I haven't needed that so far.)
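As a rough sketch of excluding patterns at listing time: the demo tree and the two patterns (.git and *.o) are made up for illustration, so substitute whatever you want to skip.

```shell
# Hypothetical demo tree; in real use, run find on your own dir1.
demo=$(mktemp -d)
mkdir -p "$demo/dir1/.git" "$demo/dir1/src"
touch "$demo/dir1/.git/config" "$demo/dir1/src/main.c" "$demo/dir1/src/main.o"

# -name .git -prune stops find from descending into .git (and skips it),
# ! -name '*.o' drops object files from everything else.
listing=$(cd "$demo" && find dir1 -name .git -prune -o ! -name '*.o' -print | sort)
printf '%s\n' "$listing"
```

Filtering here keeps unwanted entries out of the listing from the start, so no cleanup is needed afterwards.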
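Next, compare the two listings using diff. Piping the output through colordiff adds colours, and less -R lets you page through a long diff while keeping those colours. Every path in a listing starts with its own top-level directory name, so the sed calls replace that prefix with . first; otherwise every single line would show up as different. The <( ... ) syntax is bash process substitution.
diff -u <(sed 's|^dir1|.|' contents_dir1.txt) <(sed 's|^dir2|.|' contents_dir2.txt) | colordiff | less -R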
If the listings contain files that you don't need to compare, remove them from both listings. sed is your friend if there are many of them; <pattern> is a regular expression, so escape any slashes in it.
sed -i '/<pattern>/d' contents_dir1.txt
sed -i '/<pattern>/d' contents_dir2.txt
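A small sketch of that cleanup on a made-up listing: the file names and the /cache pattern are hypothetical. It previews the matching lines before deleting them, and uses | as the regex delimiter so that slashes in the pattern need no escaping. Like the commands above, it assumes GNU sed for -i.

```shell
# Hypothetical listing used only for this demo.
listing=$(mktemp)
printf '%s\n' dir1 dir1/a.txt dir1/cache dir1/cache/x.tmp dir1/b.txt > "$listing"

# Preview: print the lines that match, without modifying the file.
would_remove=$(sed -n '\|/cache|p' "$listing")
printf 'would remove:\n%s\n' "$would_remove"

# Delete the matching lines in place.
sed -i '\|/cache|d' "$listing"
remaining=$(cat "$listing")
printf 'remaining:\n%s\n' "$remaining"
```

Previewing with -n and p first is cheap insurance against a pattern that matches more than intended.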
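Now compute md5sums of all files listed in contents_dir1.txt and contents_dir2.txt. The listings also contain directories, and redirecting stderr to /dev/null suppresses the "Is a directory" warnings that md5sum prints for them (along with any other error, such as an unreadable file). The -d '\n' option (GNU xargs) makes xargs treat each line as one file name, so names containing spaces or quotes are handled correctly.
xargs -d '\n' md5sum < contents_dir1.txt > md5sums_dir1.txt 2> /dev/null
xargs -d '\n' md5sum < contents_dir2.txt > md5sums_dir2.txt 2> /dev/null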
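If you don't need the intermediate listing, one possible variant is to let find pick regular files only, so md5sum never sees a directory and stderr can stay visible. This is a sketch that assumes GNU find, sort and xargs (for -print0, -z and -0); the demo tree is made up.

```shell
# Hypothetical demo tree with spaces in the names.
demo=$(mktemp -d)
mkdir -p "$demo/dir1/sub dir"
printf 'hello\n' > "$demo/dir1/sub dir/a file.txt"
printf 'world\n' > "$demo/dir1/b.txt"

# -type f lists regular files only, so there are no directory warnings.
# -print0, sort -z and xargs -0 pass names NUL-separated, which keeps
# names with spaces or quotes intact.
sums=$(cd "$demo/dir1" && find . -type f -print0 | sort -z | xargs -0 md5sum)
printf '%s\n' "$sums"
```

Running find from inside the directory also gives paths relative to it, so the output for two different directories lines up without further editing.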
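And once again, use diff to compare the md5sums. The sed calls replace the leading directory name in each path with . so that only real differences show up: a changed checksum means the contents differ, and a line present on one side only means the file exists in just one directory. From here, it's up to you to decide what to do with files that are different :-)
diff -u <(sed 's|  dir1|  .|' md5sums_dir1.txt) <(sed 's|  dir2|  .|' md5sums_dir2.txt) | colordiff | less -R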
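For repeated use, the whole procedure can be wrapped in a function. This is only a sketch: compare_dirs is a made-up name, not an existing tool, it skips the manual filtering step, and it assumes GNU find, sort and xargs.

```shell
# compare_dirs is a hypothetical helper, not an existing command.
compare_dirs() {
    a=$1
    b=$2
    sums_a=$(mktemp)
    sums_b=$(mktemp)
    # Hash regular files only, with paths relative to each directory.
    (cd "$a" && find . -type f -print0 | sort -z | xargs -0 -r md5sum) > "$sums_a"
    (cd "$b" && find . -type f -print0 | sort -z | xargs -0 -r md5sum) > "$sums_b"
    # diff exits 0 when the checksums match and 1 when they differ.
    diff -u "$sums_a" "$sums_b"
    status=$?
    rm -f "$sums_a" "$sums_b"
    return "$status"
}

# Demo on a made-up pair of directories.
demo=$(mktemp -d)
mkdir -p "$demo/dir1" "$demo/dir2"
printf 'same\n' > "$demo/dir1/same.txt"
printf 'same\n' > "$demo/dir2/same.txt"
printf 'old\n' > "$demo/dir1/changed.txt"
printf 'new\n' > "$demo/dir2/changed.txt"
report=$(compare_dirs "$demo/dir1" "$demo/dir2")
printf '%s\n' "$report"
```

The function's exit status mirrors diff's, so it can be used directly in an if statement or a script that should fail when the directories differ.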
