Friday 24 October 2014

Unit Testing Evolution Part II – Naming Conventions

Well over 4 years ago I fell into the very trap I once heard Kevlin Henney warn me about: don’t append “Part I” to the end of a title as it suggests there will be further parts, but there may not. At the tail end of “Unit Testing Evolution Part I - Structure” I described how I had tried to short-circuit some of the tediousness of writing unit tests by skipping the naming aspect, because it’s hard. One of the most important things I learned from that talk by Kevlin (and his Sticky Minds articles on GUTs) was how important this particular aspect is. No, really, I honestly cannot believe I was once so naive.

Test Descriptions

The reason we call it a test “name” is probably because the tooling we use forces us into partitioning our tests as functions (or methods) and uses the function name to identify each one [1]. Consequently we associate the function name with being the test name, but really what we want is a description of the test. In Kevlin’s talk he suggested that test descriptions should be propositions: they should attempt to say in as clear a manner as possible what the behaviour under test is.

Why is this important? Because this step, which is almost certainly the most difficult part of the task of writing a test, is the bit that focuses the mind on what exactly it is you are going to do in the test. In a TDD scenario it’s even harder as you’re not trying to describe some existing behaviour but trying to describe what the next thing you’re going to implement is. Going back to my original post, what became apparent was that instead of listening to my process (finding it hard to describe tests) I avoided the problem altogether and created a different one - tests with no description of what was and wasn’t covered. As such there became no way to know whether any behaviour was accidental or by design.

Make no mistake, at this point [2], the amount of thought you put into naming your test is crucial to you writing good tests, of any sort. If I cannot easily think of how to describe the next test then I know something’s wrong. Rather than just stick a placeholder in I’ll sweat it out because I now appreciate how important it is. In my mind I might mentally write test and production code to try and explore what it is I’m trying to unearth, but I try and resist the temptation to write any concrete code without any clear aim so as to avoid muddying the waters.

One nice side-effect of putting this effort in (which Kevlin pointed out in his talk) is that you will have a set of propositions that should read very much like a traditional specification, but scribed in code and of an executable nature to boot.

First Steps

When you first start out naming tests it’s hard, very hard. It’s probably even harder if you’re doing it in a language that is not even your native tongue. But there are some conventions that might help you get on the right track. Just remember, these are not ideal test names for other reasons that we shall come to, but when you’re struggling, anything that can help you make progress is a good thing. Even if it stops you writing “Test 1”, “Test 2”, etc. you’ve made some headway.

Note: One of the concessions we’re going to make here initially is that we will be writing test names that must also masquerade as method names, and so the format will naturally be restricted to the character set allowable in identifiers, i.e. no punctuation or spaces (in many popular programming languages).

Where we left off at the end of my original post was how the test structure had changed. From a naming perspective I had conceded that I needed to put more effort in but was perturbed by the apparent need to express the test name in a way that could be used to name a function. Consequently one of my unit tests in my Core library for trimming strings was expressed like this:

TEST_CASE(StringUtils, trimWhitespace)
{
  . . .
}
TEST_CASE_END

Even for a simple function like trim, which we all know roughly what it does, this name “trimWhitespace” says nothing about what its behaviour actually is. It raises many questions, such as “what exactly denotes whitespace?” and “where does it trim this space: the front, the middle, the back?”. Clearly I needed to find a better way to name the tests to answer these kinds of questions.

One of the earliest examples I came across was by Roy Osherove (who later went on to write The Art of Unit Testing) and he suggested a template along the lines of “<Method>_When<Condition>_Should<Action>”. Anyone who has ever done any BDD or used a tool like Cucumber or SpecFlow will recognise this as a sort of variation of the Given/When/Then style. Personally I found this ordering harder to use than “<Method>_Should<Action>_When<Condition>”; this variation just seemed a bit more natural. Using the Trim() example again I might have gone for something like these (I’ve given them in both Roy’s and my preferred style):

Trim_WhenStringContainsLeadingSpaces_ShouldRemoveThem
Trim_WhenThereIsTrailingWhitespace_ShouldRemoveIt

…or:

Trim_ShouldRemoveSpaces_WhenTheyAppearAtTheBeginning
Trim_ShouldRemoveTrailingWhitespace

One of the reasons I found putting the predicate at the end easier (the _WhenXxx) was that it can be elided when it’s not required, such as in the last case. I found that for really simple methods I was straining to add the predicate when it was already obvious. Of course what seems obvious today may not be obvious tomorrow so once again I had to think real hard about whether it should be elided or not.
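To make the template concrete, here is a minimal C++ sketch of how such names might sit on real test functions. The Trim() implementation (which only handles the space character) and the use of plain assert() in place of a test framework are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical function under test: strips leading and trailing spaces only.
std::string Trim(const std::string& text)
{
    const auto first = text.find_first_not_of(' ');
    if (first == std::string::npos)
        return "";   // Nothing but spaces (or empty).
    const auto last = text.find_last_not_of(' ');
    return text.substr(first, last - first + 1);
}

// <Method>_Should<Action>_When<Condition>
void Trim_ShouldRemoveSpaces_WhenTheyAppearAtTheBeginning()
{
    assert(Trim("  abc") == "abc");
}

// The predicate is elided because the action already says where the spaces are.
void Trim_ShouldRemoveTrailingSpaces()
{
    assert(Trim("abc  ") == "abc");
}
```

Even in this tiny example the names leave questions open, such as what happens to spaces in the middle of the string, which is a hint that another test (and name) is probably missing.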

Word Spacing

One of the problems with trying to write long function names in a camelCase or PascalCase style is that they can get pretty hard to read, which defeats the purpose of using descriptive names in the first place. Consequently judicious use of underscores and dropping the capitalisation can make test names easier to read, e.g.

trim_should_remove_spaces_from_the_beginning

Some test frameworks acknowledge this style and will replace the underscores with spaces when reporting test names in their output to complete the illusion.
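The replacement itself is trivial. A framework might do something along these lines when reporting, although the function name toDisplayName is hypothetical and not taken from any particular framework.

```cpp
#include <algorithm>
#include <string>

// Turns an identifier-style test name into something closer to a sentence
// by swapping every underscore for a space.
std::string toDisplayName(std::string testName)
{
    std::replace(testName.begin(), testName.end(), '_', ' ');
    return testName;
}
```

Passing in "trim_should_remove_spaces_from_the_beginning" gives back the same words separated by spaces, which is what would appear in the test report.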

Test Behaviour, Not Methods

Once I started to get the hang of writing more precise test descriptions I was ready to tackle that other big smell which was emanating from the tests - their affinity with specific functions or methods. When testing pure functions the “act” part of the “arrange, act, assert” dance will usually only involve invoking the function under test, but this is not the case with (non-static) classes [3].

With classes there is a call to the constructor in addition to any method you want to test. Or if you want to test the constructor you probably need to invoke some properties to sense how the object was constructed. Factor in the advice about only testing one thing at a time and this way leads to madness. What you really want to be testing is behaviour (nay, features) not specific methods. There will likely be a high correlation between the behaviour and the public methods, which is to be expected in OO, but one does not imply the other.

Kevlin’s article on GUTs says this far better than I ever could, but for the purposes of this post what we’re interested in is the effect it has on the way we name our tests. Most notably we shouldn’t be including the exact name of the method in the test name, unless it happens to also be the best way to express the behaviour under test. That might sound odd, given that we try and name our functions and methods to clearly express our intent, but what I mean is that we don’t need to contort our test name so that our method name is used verbatim; we might use a different tense, for example. Of course if we find ourselves using an entirely different verb in the test name then that is probably a good sign that we need to refactor. If we are doing TDD then we have probably just answered another piece of the puzzle; such is the power given to us by trying to name tests well.

Constructors are a good example of where this test/method affinity problem shows up. When writing a test for an object’s default state, if you use a style that includes the method name, what do you use for it? Do you use the name of the type, the word “constructor”, or the method used to observe the state? Consider a test for the default construction of a string value. This is what it might look like under the earlier method-orientated styles:

constructor_should_initialise_itself_to_an_empty_string_when_no_arguments_provided

string_should_initialise_itself_to_an_empty_value_when_no_arguments_provided

length_should_return_zero_when_constructed_with_no_arguments

The last example gives us a big hint that we’re looking at the problem the wrong way. The first two test names describe how the implementation represents the default value, which in this case is “an empty string”. But we know that there is more than one way to represent an empty string (see “Null String Reference vs Empty String Value”) and so we should not burden the test with that knowledge because it’s a proxy for a client, and they don’t care about that either.

That last test name gets us thinking in more abstract terms and leaves it up to the test itself to show us how it could be realised, e.g.

public void a_string_is_empty_by_default()
{
  string defaultValue = new string();

  // Could be either (or both) of these…
  Assert.That(defaultValue.Length, Is.EqualTo(0));
  Assert.That(defaultValue.ToString(), Is.EqualTo(""));
}
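For comparison, here is a rough C++ rendering of the same behaviour-focused test. Using std::string as the type under test and plain assert() in place of a test framework are both assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// The name states the behaviour; the body shows some ways it could be observed.
void a_string_is_empty_by_default()
{
    const std::string defaultValue;

    // Could be any (or all) of these...
    assert(defaultValue.empty());
    assert(defaultValue.length() == 0);
    assert(defaultValue == "");
}
```

Note that neither the constructor nor length() appears in the name, so the test description survives a change in how emptiness is observed.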

Writing code in a test-first manner really started to make this kind of mistake all the more apparent. I found that trying to express what behaviour I wanted to achieve before I had written it highlighted the inadequacy of the naming convention I had been using up to that point.

One other point about not being directly dependent on the function or method name is that refactoring can have less of an impact on the structure of the tests. In theory if you change the name of the method you have to manually change the name of the associated tests, as I’m not aware of any tooling which is that clever. If you move the implementation, say, from the constructor to a method you then have to rewrite the associated tests, and therefore the test names. The definition of refactoring is to change the design without changing the observable behaviour, and so if your tests and names are framed around the observable behaviour, rather than the implementation, you should not need to change anything.

Towards a Natural Language

The final step in my journey was to move towards a test framework that allowed natural language to be used instead of a limited character set for test names. This also paved the way for use of punctuation marks which helps give the test name a more human feel. The following example is what the unit test for my trim function looks like now:

TEST_CASE("trim strips spaces, tabs, carriage returns and newlines from the front and back of strings")
{
  . . . 
}
TEST_CASE_END

In retrospect this test case could probably be broken down a little further, perhaps by handling trimming of leading and trailing whitespace as separate cases. The reason it’s all one TEST_CASE() comes down to how I moved from the old style “TEST_CASE(StringUtils, trimWhitespace)” to the new style “TEST_CASE("trim strips...")”. As you can probably guess I just used SED to go to an intermediate step where the test name was simply “trimWhitespace” and then I slowly went back over my tests and tried to work out what a better description would be.

The trade-off here was that I gave up the ability to use the test name as a function or method name, but that’s of little use in C++ where there is no reflection capability [4]. The way it’s currently implemented means running a single test from the command line would require some thought, but I know that other more mature test frameworks, like Phil Nash’s Catch, have come up with something usable. At the moment I’m happy to treat the need to run a test in isolation as a smell that my tests take too long or there are too many tests in a single fixture.
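To illustrate what running a single test might involve once names are free-form strings, here is a minimal sketch of a registry that selects tests by a substring of their description. TestCase, TestRegistry and the convention that a test fails by throwing are all hypothetical; this is not the framework described above, nor how Catch does it.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical: a test is just a description plus a closure that throws on failure.
struct TestCase
{
    std::string description;
    std::function<void()> body;
};

class TestRegistry
{
public:
    void add(std::string description, std::function<void()> body)
    {
        m_tests.push_back({std::move(description), std::move(body)});
    }

    // Runs every test whose description contains 'filter' (an empty filter
    // matches everything) and returns the number of tests that passed.
    std::size_t run(const std::string& filter = "") const
    {
        std::size_t passed = 0;
        for (const auto& test : m_tests)
        {
            if (test.description.find(filter) == std::string::npos)
                continue;   // Not selected.
            try
            {
                test.body();
                ++passed;
            }
            catch (...)
            {
                // A real framework would report the failure here.
            }
        }
        return passed;
    }

private:
    std::vector<TestCase> m_tests;
};
```

A command line argument could then be passed straight through as the filter, e.g. run("trim strips") to select only the trim tests, at the cost of having to quote the description in the shell.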

No Pre-Processor?

In C & C++ we have the (much derided) pre-processor to help us implement such a convention, but there is no such facility in C#. However, there has been a recent pattern emerging in C# where the indexing operator is “abused” to allow you to register closures that look like long-named methods, e.g.

public class SomeTests : TestBase
{
  public void Tests()
  {
    Test["a nice free form name"] = () =>
    {
      // Arrange, act and assert.
    };
  }
}
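The same trick translates to C++ quite directly, because the operator[] of std::map already creates a slot on first use. The following is only a sketch: TestBase, Test and runAll are hypothetical names mirroring the C# example, not part of any real framework.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical base class: tests are closures keyed by a free-form description.
class TestBase
{
public:
    // Indexing with a new description creates an empty slot to assign a closure to.
    std::map<std::string, std::function<void()>> Test;

    // Runs every registered closure (in alphabetical order of description)
    // and returns how many were run.
    std::size_t runAll() const
    {
        std::size_t count = 0;
        for (const auto& entry : Test)
        {
            entry.second();
            ++count;
        }
        return count;
    }
};

class SomeTests : public TestBase
{
public:
    SomeTests()
    {
        Test["a nice free form name"] = []
        {
            // Arrange, act and assert.
        };
    }
};
```

One consequence of keying on the description is that two tests with the same name silently overwrite each other, so a real implementation would probably want to detect duplicates.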

I like this idea as tests usually have no return value and take no arguments so it doesn’t look that weird. One should also remember that test code and production code are allowed to be different and may use different idioms because they serve different needs. When I started out writing my C++ unit testing framework I had one eye on writing a library in a production code style instead of taking a step back and trying to write something that solved the problem for a specific domain. Part of this journey has been understanding what writing tests is all about and in particular finally understanding the importance of naming them well. Once again I find myself having to learn that there is no such thing as a free lunch.

 

[1] Despite what I’ve written I’m going to stick with the term “test name” purely for SEO reasons. I’d like this blog post to be easily discoverable and that term I suspect is the one most commonly used.

[2] These are weasel words because I cannot guarantee someone like Kevlin will not come up with something even better in the future.

[3] A function like strtok() in the C standard library which maintains some hidden state would require extra calls to it in the “arrange” part of some of its tests, but these really should be the exception, not the norm (See “Static - A Force for Good and Evil”).

[4] There are some handy macros though, like __FUNCTION__, that could be used along with some other pre-processor trickery to help obtain the function name automatically.

1 comment:

  1. Thanks for the shout-out to Catch :-)
    As for using free-form strings as test names in .Net I'd recommend writing all your tests in F#
    No seriously!
    F# allows you to use free-form strings as identifiers by placing them in back-ticks - and test frameworks take advantage of this.
