Unit testing: Why bother?

What is unit testing?

Unit testing is the practice of testing the components of a program automatically, using a test program to provide inputs to each component and check the outputs. The tests are usually written by the same programmers as the software being tested, either before or at the same time as the rest of the software.

Most unit tests are written using some sort of test framework, a set of library code designed to make writing and running tests easier. Nearly all programming languages have at least one commonly used test framework.[1] But you don't have to use a test framework to do unit testing. All you need is something that can run a bit of your code, feed it some inputs, and check the results.

Of course, as well as writing tests, you have to remember to run them. Many projects combine running the tests with building the software. Larger development teams automate this using some kind of continuous integration system. For individual projects, it may be enough simply to remember to run the tests before rolling a release and after any significant code change.

An example

Here's a simple example. Let's say we have a function which finds the median of a sequence of integer values. In pseudo-code, a series of (four) unit tests for this function might look like this. (Assume that assert is a function which prints an error and exits if its argument is not true.)

    // The median of a sequence with a single value in it is that value
    assert(median([3]) == 3);

    // The median of an odd number of values is the middle value in order
    assert(median([3, 1, 2]) == 2);

    // What do we do for even sizes? In this case we want the function to
    // return the lower of the two middle values, since it's acting on integers
    // and must return something from the input domain (i.e. not a mean)
    assert(median([3, 1, 5, 9]) == 3);

    // And for an empty input sequence we want it to throw an exception
    try {
        median([]);    // the call under test: expected to throw
        assert(false); // should not get here
    } catch (...) {
        assert(true); // this is the expected result
    }

As simple as this example is, it has forced us to make clear decisions about the behaviour we expect: what we expect the function to do about even-sized and empty inputs, in this case.
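
As a rough illustration, the same four tests can be written as runnable Python using nothing but the built-in assert statement. The median implementation below is an assumption, included only so the tests have something to run against: the example above specifies the behaviour, not the code. Raising ValueError for empty input is likewise just one possible choice of exception.

```python
def median(values):
    """Return the median of a sequence of integers.

    For even-sized input, return the lower of the two middle values.
    Raise ValueError for an empty sequence (an assumed choice of exception).
    """
    if not values:
        raise ValueError("median of an empty sequence is undefined")
    ordered = sorted(values)
    # (n - 1) // 2 is the middle index for odd n, the lower middle for even n
    return ordered[(len(ordered) - 1) // 2]


def run_tests():
    # The median of a sequence with a single value in it is that value
    assert median([3]) == 3
    # The median of an odd number of values is the middle value in order
    assert median([3, 1, 2]) == 2
    # For even sizes, expect the lower of the two middle values
    assert median([3, 1, 5, 9]) == 3
    # For an empty input sequence, expect an exception
    try:
        median([])
    except ValueError:
        pass  # this is the expected result
    else:
        raise AssertionError("median([]) should have raised ValueError")
    return 4  # number of checks that ran


if __name__ == "__main__":
    print(run_tests(), "tests passed")
```

Returning the number of checks from run_tests is a small design choice: it gives the caller a way to confirm that the tests actually ran.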

What's unit testing good for?

Unit testing has various applications.

Some developers practise test-driven development, a process in which the unit tests are written before the rest of the code. The tests thus describe a “contract” that the code is expected to comply with. This ensures that the code will be correct (as far as can be enforced by the testing contract) as written, and it provides a useful framework for thinking about how the code should be designed, what interfaces it should provide, and how its algorithms might work. This can be a very satisfying mental aid in developing tricky algorithms.

Less rigorously, unit tests written during software development provide early sanity-checking for code. A program to read data from a file and calculate a result might fail not only because the basic algorithm is wrong, but because the input is read wrongly or the code fails to deal with unusual cases like short or empty datasets. These things are very easy to test automatically, perhaps even by running tests from a command shell, and a sensible application of unit tests can rule out many possible sources of “silly” failures.
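
A minimal sketch of this kind of sanity check, in Python. The function read_values and its one-integer-per-line format are hypothetical, invented for this illustration. The point is only that short and empty datasets are cheap to test, here using in-memory text streams so that no real files are needed.

```python
import io


def read_values(stream):
    """Read one integer per line from a text stream, skipping blank lines.

    Hypothetical helper: the input format is an assumption for this sketch.
    """
    values = []
    for line in stream:
        text = line.strip()
        if text:
            values.append(int(text))
    return values


def test_read_values():
    # A normal small dataset
    assert read_values(io.StringIO("3\n1\n2\n")) == [3, 1, 2]
    # A dataset with a single value and no trailing newline
    assert read_values(io.StringIO("42")) == [42]
    # Blank lines are ignored
    assert read_values(io.StringIO("\n5\n\n")) == [5]
    # An empty dataset yields an empty list rather than an error
    assert read_values(io.StringIO("")) == []


if __name__ == "__main__":
    test_read_values()
    print("read_values: all checks passed")
```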

Unit tests are also vital to regression testing, the business of ensuring that new changes in code don't break things that were working before. Regression testing is especially important when working in a team, but it is surprisingly easy to break your own code without noticing it, even if you are working on your own. And because regression testing is next to impossible to do satisfactorily by hand (it's simply too tedious), it's an obvious case for automation through unit tests.

Finally, unit tests provide documentation for other developers. Even if, like most of us, you are too lazy to document your code thoroughly, a small set of unit tests conveys a lot of useful information about how it is designed and how you expect it to work.

Is it worth doing?

Unit testing certainly appears to involve some extra work: unit tests don't write themselves. (I was quite disappointed when I first learned that a unit test framework didn't actually write the tests for you.)

We polled a few commercial and academic software developers, to ask whether they felt unit testing was worth the effort for them.

A developer working in a financial software house told us:

I don't think you could credibly say unit testing was a waste of time these days. It's one of the few “modern” ideas that really does help the coder. It's a kind of crutch because it reduces the amount of thinking you have to do—or at least codifies the thinking. I like it even more when you come to fiddle with something a few weeks or months later!

A developer for Google is sympathetic to the idea that unit testing takes work:

It definitely feels like an overhead—I probably spend something like 35% of my coding time writing "real" code, 50% writing code tests and 15% trying it "for real" in a test environment. But that testing effort probably speeds up my overall development, partly by finding errors earlier but also because the code is better designed and therefore more amenable to being extended.

The financial developer again:

This line that testing is extra work I would say is wrong—it's ultimately less work, perhaps at the expense of more lines of code. You won't have the time to refactor or redesign without unit tests, because you have to haul your thoughts back into working memory and think carefully. With unit tests you can spot small areas that can be improved (say share a common bit of code) and be sure that your minor change has not broken the grand scheme.

From a developer in academia:

Imagine someone else is basing something on your code in future. You don't want them accidentally breaking stuff because they're too stupid to see the consequences of apparently simple edits, right? You want to make sure the feedback loop between editing the code and seeing that it breaks something is as short, and as tight and reliable, as possible.

The theme that testable code is better designed is also a recurring one. From the Google developer:

I also think it leads to better code. It's far easier to test well-written modular code. Anything where you can separate functionality into smaller units and compose them is easier to test.

A related point raised during the breakout session on testing at this year's Collaborations Workshop is that code quality is directly relevant to scientific correctness. Researchers often overlook the distinction between verification and validation. Most scientific “evaluation” is validation: checking that your model is a good approximation to reality. We don't usually put as much effort into verification: checking that your program actually implements the model you think it does. Basic unit testing is a standard part of software verification and has a direct impact on scientific correctness.

Practical tips for unit tests

  • You can do unit testing without using a test framework, and this can be a good way to get started if the thought of learning a test framework seems too complicated. A framework saves time in the long run, and in a company context you'd always have one, but for your own use it doesn't have to be something you must learn before you can start. You can just run a bit of the code, check the results, spit out hideous abuse when the results are wrong. You can even do it from a shell script if your program is simple enough.
  • Tests should be small—don't think you have to use bulky real-world data. If your function gets the right median value for inputs with one value, a small odd number of values, and a small even number, it'll work for bigger inputs too. Think of tricky small cases, not easy large ones. Picking good test cases is something of an art and can be an interesting exercise in its own right.
  • Writing the tests first (i.e. practising test-driven development) can be a real help during development, especially if you're not yet clear on how the code should actually work. When you find yourself getting stuck trying to visualise how an algorithm should work or how other code should interact with it, consider whether you can approach it from the other end by describing what its output should look like. Sketch it out in the tests, then write the code until the tests pass.
  • Whenever you find a bug in “finished code”, add a test for it. Make sure the test fails in the buggy code and passes when it is fixed. Areas of code you've found bugs in are more likely to be fragile in general, and bugs that have already been found are relatively likely to reappear.
  • When writing a new test, include something to make sure it is being run. For example, make it fail deliberately when you first write it. It's quite common to discover that the reason your tests are all passing is that they're not being run at all. (Overlooked in the build file, private instead of public, mistyped the method name: every testing framework has its set of common mistakes.) So, always do something to make sure your test is really working.
  • Don't ship code with tests that fail, even if it doesn't matter that they fail. It's not uncommon, particularly in test-driven development, to change your mind during design about which tests are correct or relevant, or to make an initial implementation that only covers some of the test suite. But that means you end up with failed tests that you don't actually care about. Remove them, or at very least, document them: anyone running your tests should be able to assume that a failed test indicates broken code.
  • Consider using a code coverage tool to check how much of your code is actually being tested. Coverage doesn't tell you everything: it only measures static execution paths, but it can give you some idea of things you might have missed altogether.
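
Several of these tips can be sketched with Python's standard unittest module. The function clamp and the bug described in the comments are hypothetical, made up for illustration. The sketch shows a regression test for a previously found bug, a test that is skipped with a documented reason rather than left failing, and a check on how many tests actually ran.

```python
import unittest


def clamp(value, low, high):
    """Limit value to the range [low, high]. Hypothetical example function."""
    return max(low, min(value, high))


class ClampTests(unittest.TestCase):
    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_regression_value_below_range(self):
        # Regression test for an imagined bug in which values below the
        # range were returned unchanged. Kept so the bug cannot come back.
        self.assertEqual(clamp(-3, 0, 10), 0)

    @unittest.skip("behaviour for low > high is not decided yet")
    def test_inverted_range(self):
        # Documented and skipped instead of being left as a failing test
        self.assertEqual(clamp(5, 10, 0), 5)


def run_suite():
    """Run the tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClampTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    result = run_suite()
    # Guard against the "all passing because nothing ran" mistake
    print("ran:", result.testsRun, "skipped:", len(result.skipped))
```

Temporarily changing one of the expected values and confirming that the run reports a failure is another quick way to check that a new test is really wired in.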

Chris Cannam

Further reading

[1] Common unit testing frameworks include JUnit for Java, Nose for Python, MTEST for MATLAB, and the Boost test framework for C++. See this list.