The Art of Testing

April 30th, 2012

In a few previous posts I talked about the importance of unit testing and Test-Driven Development (TDD).

TDD helps us ensure that everything in our system is covered by tests. After all, if you never write more production code than is needed to satisfy a failing test, all of your production code is covered by tests.

But how do we write a good suite of tests? How do we ensure that our tests help us maintain the system? And remember, in agile development maintenance starts after the first successful compilation.

Purpose of tests

To know how to create a good suite of tests, we first must understand our expectations of these tests. What is the primary reason we need a good suite of tests?

The primary reason is not to prove that our code works. There are many ways to prove this, including an army of QA staff. Sure, it's unprofessional not to know whether all your code works, but that's not the primary reason for our suite of tests. Why do we need this suite of tests? What do we expect from it?

The primary reason for a suite of tests is to allow us to refactor! The suite of tests eliminates the fear of changing the code. With a good suite of tests we can refactor the code while getting continuous feedback on whether we broke something.

Good tests

A good unit test is Fast, Independent, Repeatable, Self-Validating, and Timely. For details on this F.I.R.S.T. acronym, read Clean Code by Robert C. Martin (Uncle Bob). Seriously, if you haven't read it yet, order a copy now and come back afterwards to finish reading this post.

Tests that follow this acronym support refactoring the production code quite well. But there's more to creating a sustainable suite of good tests. Not only is a good unit test independent of other unit tests, it also validates something that no other test validates. In other words, each single feature of the system under test is unit tested exactly once.

Furthermore, tests must be stable, meaning that things that commonly change should not be asserted. E.g. asserting text on a screen leads to fragile tests; whenever the text is changed, the test fails. It's better not to assert this text literally, but to stub it or only assert that some text is printed.
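To illustrate the difference, here is a small sketch (Greeter and its wording are hypothetical names, not part of the system discussed below): the fragile variant pins down the exact wording, while the stable variant only asserts that a message mentioning the subject was produced.

```java
// Sketch: asserting behavior without pinning down volatile text.
// The Greeter class and its wording are hypothetical examples.
class Greeter {
    String greet(String name) {
        return "Hello, " + name + "!"; // exact wording changes often
    }
}

public class StableAssertionSketch {
    public static void main(String[] args) {
        String message = new Greeter().greet("Alice");

        // Fragile: fails on every rewording of the greeting.
        // assert message.equals("Hello, Alice!");

        // Stable: only assert that some text mentioning the name was produced.
        if (message.isEmpty() || !message.contains("Alice")) {
            throw new AssertionError("expected a greeting mentioning Alice");
        }
        System.out.println("stable assertion passed");
    }
}
```

The stable variant keeps testing the behavior we care about (a greeting is produced for this person) while surviving copy changes.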

Design feedback

Tests not only give feedback about the behavior of the system, they also provide feedback about its design. Let me clarify this with an example.

@Test public void shouldGetWeekOfYear() {
  int weekOfYear = Calendar.getInstance().get(Calendar.WEEK_OF_YEAR);
  assertThat(getWeekOfYear(), equalTo(weekOfYear));
}

As you can see, the test determines the current week of the year, and then asserts that the getWeekOfYear() function returns the same value. Most likely that function determines the week of the year in pretty much the same way.

Quite often I see constructions like this: tests that repeat the production code to assert its behavior. Most of these are written after the fact, but either way this is a clear symptom of bad design.

Understand design feedback

To find a better way to test this function, we must understand the feedback from this test. We can hear it scream bad design at us, but what's wrong with the design of this function? Isn't the function too trivial to be burdened by bad design?

Remember the Single Responsibility Principle? It teaches us that each function in a system must do one thing, and one thing only. This getWeekOfYear() function in fact has two responsibilities: it determines the week of the year, and it determines the current system date. Let's see what the test looks like when we extract this second responsibility.

@Test public void shouldGetWeekOfYear() throws Exception {
  assertThat(getWeekOfYear(date(2012, APRIL, 30)), equalTo(18));
}

This is much better. We fixed the function's design by extracting the extra responsibility, and the test is now much more expressive. Note that I used a function date(..) to get a specific date, to improve expressiveness even further.
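A possible implementation of the extracted function might look like the sketch below. This is an assumption, not the original author's code; the date(..) helper is the hypothetical one used in the test, and Locale.UK is chosen to pin week numbering to the ISO convention, under which April 30th, 2012 falls in week 18.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;

// Sketch: week-of-year for a *given* date; no hidden dependency on the
// current system time. Locale.UK fixes firstDayOfWeek and minimal days
// to the ISO convention, so the result is repeatable everywhere.
public class WeekOfYearSketch {
    static int getWeekOfYear(Date date) {
        Calendar calendar = Calendar.getInstance(Locale.UK);
        calendar.setTime(date);
        return calendar.get(Calendar.WEEK_OF_YEAR);
    }

    // Helper like the date(..) used in the test above (hypothetical).
    static Date date(int year, int month, int dayOfMonth) {
        Calendar calendar = Calendar.getInstance();
        calendar.clear();
        calendar.set(year, month, dayOfMonth);
        return calendar.getTime();
    }

    public static void main(String[] args) {
        System.out.println(getWeekOfYear(date(2012, Calendar.APRIL, 30)));
    }
}
```

Because the date comes in as a parameter, the test no longer has to repeat the production logic; it can simply state a known input and its known answer.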

Don’t fix the symptom

But wait a minute. It's nice and all that the test is now more expressive, and that this specific function is left with only one responsibility. But now it has become the caller's responsibility to determine the current date. How is that better than having one low-level function do two things?

Indeed. Our test helped us find a much more serious design problem: the concept of date/time is not properly separated in this particular system. Now that we've made this design flaw clear, we can fix it.

By the way, I’m not suggesting that you create your own date and time abstractions, but access to the current date/time is a separate concern that must not be spread throughout the system. We can introduce a Clock that gets the current date/time, and use this clock to pass the current date into the getWeekOfYear(..) function.
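A minimal sketch of that Clock idea (all names here are illustrative, not an existing API): production code asks a Clock for the current date, and tests substitute a fixed, repeatable clock.

```java
import java.util.Calendar;
import java.util.Date;

// Sketch: one small seam for "now" instead of Calendar.getInstance()
// calls scattered through the system. All names are illustrative.
interface Clock {
    Date now();
}

// Production implementation: the real system time.
class SystemClock implements Clock {
    public Date now() { return new Date(); }
}

// Test implementation: a fixed, repeatable point in time.
class FixedClock implements Clock {
    private final Date fixed;
    FixedClock(Date fixed) { this.fixed = fixed; }
    public Date now() { return fixed; }
}

public class ClockSketch {
    public static void main(String[] args) {
        Calendar calendar = Calendar.getInstance();
        calendar.clear();
        calendar.set(2012, Calendar.APRIL, 30);

        // The caller passes clock.now() into getWeekOfYear(..);
        // a test controls the date, production uses new SystemClock().
        Clock clock = new FixedClock(calendar.getTime());
        System.out.println("fixed now: " + clock.now());
    }
}
```

With this seam in place, only the Clock implementations know about the system time, and every other function simply receives a date.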

Conclusion

When you have trouble creating the scenario for a function to be tested, listen to this feedback from your test, take the time to understand what exactly is wrong, and fix it. This way, tests help you separate concerns from early on by constantly providing feedback about the system's design.

Because tests give you loads of design feedback, writing tests after the fact is not really useful. If you get this feedback after you wrote the production code, you either have to rework the production code or, worse, ignore the design feedback. Writing tests after the fact therefore makes tests more likely to fail even on minor changes to the system, and thus they undermine our primary purpose of tests: the ability to refactor!