Showing posts from November, 2013

Perl script for comparing files: List missing lines, regardless of order

The other day, I was comparing two different sitemap files of the same site. One had more links than the other, and I was trying to get a list of what was missing from the shorter one. However, since they came from different sitemap generators, the order of the links was completely different in each file. Surprisingly, this turned out to be a much bigger challenge than I thought. I figured I could use some variation of a grep command line, or diff, but I wasn't able to find a simple combination of command line options for either that would do what I was looking for. Everything I found seemed geared toward comparing files that were in the same order. diff simply dumped a large list of all the lines in file2; since the order was different from file1, every line was considered a mismatch. Knowing this was a fairly trivial operation to do in Perl, I decided to write a quick script to do it. I'm sharing it here in case it can benefit anyone else:

#!/u