CS246-F20-01-UnixShell
Lecture 1.9
• More UNIX commands:
– diff/cmp
diff / cmp
• Compare 2 files looking for differences
– Usually, we assume text files, and usually they are pretty similar
• Often, one is a newer version of the other, and you want to know “what
has changed”
• Often, used for source code to track changes in a project across many
developers
• Also used to check the results of regression testing
– There are many options and tools for this, but the simplest are diff
and cmp
• There are also really clever tools to help merge the files into a coherent
whole
• cmp reports only the first difference between
the files
• newline is counted as a character, so there are 2 chars per
line in the files below
• I’ve never used cmp
cmp
     File x   File y
1    a\n      a\n
2    b\n      b\n
3    c\n      c\n
4    d\n      e\n
5    g\n      h\n
6    h\n      i\n
7             g\n
$ cmp x y
x y differ: char 7, line 4
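The cmp example above can be reproduced with a short shell sketch. It assumes a POSIX shell with mktemp available; the scratch directory and the variable names (workdir, cmp_status, x_bytes) are illustrative and not part of the lecture. Because the exact wording of cmp's message can vary between implementations, the sketch relies on the exit status instead of the printed text.

```shell
# Recreate the example files x and y in a scratch directory.
workdir=$(mktemp -d)
printf 'a\nb\nc\nd\ng\nh\n' > "$workdir/x"
printf 'a\nb\nc\ne\nh\ni\ng\n' > "$workdir/y"

# cmp -s prints nothing; the exit status alone says whether the files match:
# 0 = identical, 1 = different, greater than 1 = trouble (e.g. a missing file).
cmp_status=0
cmp -s "$workdir/x" "$workdir/y" || cmp_status=$?

# The byte count confirms the "2 chars per line" arithmetic: 6 lines, 12 bytes.
x_bytes=$(wc -c < "$workdir/x" | tr -d ' ')

echo "cmp exit status: $cmp_status, size of x: $x_bytes bytes"
```

Lines 1 to 3 take up 6 characters in each file, which is why the first mismatch (d versus e) is reported at char 7 on line 4.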
• diff generates output describing how to change the
first file into the second file (the output can be applied with patch)
diff
     File x   File y
1    a\n      a\n
2    b\n      b\n
3    c\n      c\n
4    d\n      e\n
5    g\n      h\n
6    h\n      i\n
7             g\n
$ diff x y
4,5c4 # replace lines 4+5 of 1st file
< d # with line 4 of 2nd file
< g
---
> e
6a6,7 # after line 6 of 1st file
> i # add lines 6+7 of 2nd file
> g
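As a rough sketch of the "change the first file into the second" idea, the script below saves diff's output to a file and, if patch happens to be installed, applies it to a copy of x. The scratch directory and the names x_to_y.diff and x_patched are assumptions made for illustration; the exact edit script may also differ slightly between diff implementations.

```shell
# Recreate the example files x and y in a scratch directory.
workdir=$(mktemp -d)
printf 'a\nb\nc\nd\ng\nh\n' > "$workdir/x"
printf 'a\nb\nc\ne\nh\ni\ng\n' > "$workdir/y"

# diff exits 0 when the files match and 1 when they differ,
# so the status is captured rather than treated as a failure.
diff_status=0
diff "$workdir/x" "$workdir/y" > "$workdir/x_to_y.diff" || diff_status=$?
cat "$workdir/x_to_y.diff"

# If patch is available, apply the edit script to a copy of x
# and check that the result is byte-for-byte identical to y.
if command -v patch > /dev/null 2>&1; then
    cp "$workdir/x" "$workdir/x_patched"
    patch "$workdir/x_patched" < "$workdir/x_to_y.diff" > /dev/null
    cmp -s "$workdir/x_patched" "$workdir/y" && echo "patched copy matches y"
fi
```

Working on a copy keeps the original x intact, so the comparison can be repeated.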
• I use diff all the time! Try this:
– Take a file with non-trivial content, maybe source code
– Make a copy & make a few insertions, deletions, changes to the text
– Then in a really wide terminal window, do this:
$ sdiff file1 file2 | less
$ vim -d file1 file2
• Looks really nice if you use colour syntax highlighting on
source code
– If you don’t have sdiff installed on your system, try: “diff -y”
or “diff --side-by-side”
diff-ing
Regression testing (in one slide)
• In professional software development, you typically have voluminous test suites
to run against your system to make sure it “does the right things”
– Testing is a HUGE part of the daily development effort
• As you make changes to the source code base, you re-run the old tests, to
make sure you didn’t break anything (“regress” to wrong behaviour)
– So you create text files of what you expect the output to be for the tests, and
then compare them to the actual output files from the new version of the
system
– Typically, you are interested only in whether there are differences, i.e., whether the
output behaviour has changed; so diff is pretty useful here
• This is called regression testing
– You test that the system doesn’t regress to old, incorrect behaviour after new
changes by creating an explicit test: “A bug is a test case you forgot to write.”
– You will hear more about this topic in later courses too
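The compare-expected-to-actual idea above might look like the following minimal sketch. Everything in it is hypothetical: program_under_test is a stand-in shell function (it just upper-cases its input), and the .in / .expect / .actual file naming is one possible convention, not something the course prescribes.

```shell
workdir=$(mktemp -d)

# Hypothetical program under test: upper-cases its input.
# In a real project this would be the executable being developed.
program_under_test() {
    tr 'a-z' 'A-Z'
}

# One test case = an input file plus a file holding the expected output.
printf 'hello\nworld\n' > "$workdir/t1.in"
printf 'HELLO\nWORLD\n' > "$workdir/t1.expect"
printf 'cs246\n' > "$workdir/t2.in"
printf 'CS246\n' > "$workdir/t2.expect"

failures=0
for input in "$workdir"/*.in; do
    name=${input%.in}
    program_under_test < "$input" > "$name.actual"
    # diff prints nothing and exits 0 when expected and actual output agree.
    if diff "$name.expect" "$name.actual" > "$name.diff"; then
        echo "PASS: $(basename "$name")"
    else
        echo "FAIL: $(basename "$name") (see $name.diff)"
        failures=$((failures + 1))
    fi
done
echo "$failures test(s) failed"
```

Saving the diff output per test means a failing case leaves behind exactly the lines that changed, which is usually the first thing you want to look at.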
End