Tuesday 26 July 2016

Documentation, What is it Good For?

If, like me, you grew up in the 1980s with Frankie Goes to Hollywood, you’ll instantly want to reply to the title question with “absolutely nothing!”. You may also pick the same answer if you misread the Agile Manifesto as saying that we should “create working software, not write documentation”.

Of course everyone knows it really says that we should favour “working software over comprehensive documentation”, and yet it feels as though many are replacing the word “comprehensive” with “any”. Whereas before we tried to get out of writing documentation because it was boring (hint: that’s the fault of it being comprehensive), we can now use the Agile Manifesto to reinforce our opinion that we should go full steam ahead and only write code because documentation is either pointless or of very little value.

The YAGNI Argument

One of the reasons documentation is seen as having very little value is down to the fact that apparently “no one reads it”. There is more than a grain of truth in this argument and it’s something I’ve touched on before in “The Dying Art of RTFM”. Also, as is probably apparent from the “Pen & Paper” article in my Toolbox series for ACCU’s C Vu, I like to make notes in a log book about the work that I’m doing as I find them useful later.

Hence my premise around documentation – and this has always applied equally to writing tests and tools too – is that if I’ll find something useful later then I suspect someone else will too. Therefore, first and foremost, I write documentation for myself, and that gets us over the YAGNI hurdle. So, even if nobody else ever reads what I write, the fact that I’ll probably read it again is good enough for me to consider it worth the effort (and therefore it has at least some value).

This blog is a prime example of such an outlook. I have no idea whether anyone else will read anything I write, but what I have discovered (like so many others) is that the act of writing is in itself valuable. Naturally one must be careful not to compare apples and oranges: writing documentation on the company’s time has a definite monetary cost, whereas writing this blog costs me mostly my own time instead.

Certainly in my earlier days of documenting aspects of a project I didn’t know what kinds of things were worth writing up, and I almost certainly wrote way too much up front. In part what I wanted to capture was not just the topic itself but the rationale that led to the decision. I felt that without the rationale you wouldn’t know whether the documentation still held any value in the future, as whatever led you to the original decision may no longer be valid; consequently you’d know when it was time to delete or rewrite it.

Keep It Real

When I first read the term “comprehensive documentation” what came to mind was one of those massive enterprise-grade masterpieces that has at least half a dozen introduction pages covering the table of contents, version list, signatories, disclaimer, etc. If the actual content doesn’t even start until page 12 you are pretty much guaranteed that no one will ever read it. Making use of fancy technologies like OLE, so that any embedded content is subject to the whim of the Network Gods, is not going to endear people to it either.

I, like many other developers, like simple documentation – simple to read and, more importantly, simple to write. The barrier to writing documentation should be low, really low, so low that there is very little reason to not do it. By taking away all the superficial formatting, e.g. using Markdown, it’s almost impossible to get bogged down with playing with the tool instead of producing what matters – the content.

Hence my tool of choice is a simple wiki. Whilst storing text documents and any simple images in the source repo itself is better than nothing (or going way over the top), grep is just a little too crude for searching, and having page navigation just makes life that little bit more joined-up. I’ve found that GitHub makes this format very usable, but what would make it better is a half-decent, simple Markdown viewer for Windows [1].

The great thing about plain text formats is that you can still use your favourite text editor, and most support a spell checker plugin to at least avoid any obvious mistakes. Trying to read badly written prose can be jarring, but I’d still prefer to have the content than not at all as someone can always fix the little mistakes when it gets updated. With a browser like Chrome (which comes with a built-in spell checker) even the basic edit boxes used in online wikis are good enough for creating most content.

Naturally plain text is good for the written parts but sometimes you need an image, such as a diagram. ASCII art can work for a very simple picture but you really want something just a little bit better. My current favourite online tool is draw.io which makes it really easy to create simple, effective diagrams. You can then generate a .PNG image which you embed in the page and also export the diagram as an .XML file for attaching (or checking-in). This way anyone can edit the picture later as they have the source. Personally if the diagram would take a long time to recreate from scratch then I’d question if it was too detailed in the first place.

Common Themes

Most of the documentation I write is not about the features of the product – you have the stories, source code and tests to tell you about that. The things that I tend to write up are the aspects of the process the team uses and many of the little things that we perhaps don’t do or consider every day but act as examples of how some development or support tasks could be done.

Basic Architecture

As I mentioned earlier, I feel one of the most important things the team can jot down is the rationale around why things are the way they are. Architecture is a prime example of where knowing how you got there helps you understand the forces that drove the decisions – even if it’s that no conscious decision was ever made.

There is a slightly meta aspect here too as I like to start by linking to two sources on why I’m even doing it: “Record Your Rationale” from 97 Things Every Software Architect Should Know and Michael Nygard’s blog post on “Architecture Decision Records”.

Generally speaking I’ll only create a simple top-level architecture diagram to show the basic footprint and provide a quick description of what the major components are. This usually provides enough to orient someone and can act as a visual clue should you be chatting to someone outside the team about the system [2].

Team Details

I also like to create a simple page that at least lists the team along with some of their contact details. When all you’re given is a cryptic login like THX-1138 it becomes a pain trying to find out who owns a file or which developers are hogging the two RDP connections to a Windows server. It also provides a place to put a mugshot and some other details which can provide a more welcoming feel to other teams; I prefer to belong to a team that openly promotes collaboration rather than tries to hide behind a ticketing system.

Operational Tips & Tricks

While we might wish that our systems were perfect, had 100% uptime and never did anything weird, that just isn’t reality. The idea of having a knowledge base for the support and operational aspects of a system is nothing out of the ordinary, but we need to balance documenting quirks with just getting on and fixing the bugs that cause regular support issues.

For example, although the parts of an in-house system should be able to come up in any order, there might be a preferential one that just makes life easier. Also any post mortems that have been carried out are useful to document so that not just the team but the whole organisation can learn from accidents and failures.

Development Guides

Operational notes are probably a fairly obvious artefact, but I like to do the same for development tasks too. In an enterprise setting you often find junior developers being given what are seen as “simple” tasks but no guidance on how to do them. You can’t automate everything and sometimes they need some background knowledge to help them along [3].

I’ve also found that during development you bump into various quirks either around the toolchain or dependencies that are worth making a note of. For example Visual C++ developers used to run into pre-compiled header build problems with a drive mapped using NET USE that didn’t happen with SUBST.

One-Liners

On my desktop I always have a text file called “one-liners.txt” that I use to store those little command lines that come up time and time again at each client. They often revolve around build or support queries, such as parsing data out of log files or test data files. If the command remains useful it can be “scriptified”, but before then I like to share where I’m up to so that other developers can learn how to compose the standard tools (or PowerShell cmdlets) to solve the smaller tasks. Many Windows developers aren’t comfortable at the command line and so I think this is a great way of showing them what they’re missing out on. Once a task gets turned into a script it becomes opaque and just “the solution”, whereas I find the “raw pipeline” teases at another, different world of programming.

Links

Whilst StackOverflow probably caters for a lot of the Q&A these days that I might historically have documented, sometimes the error message and the solution don’t immediately tie up, so I like to fill that gap with a simple Q&A page that links to SO alongside the actual error someone is likely to encounter.

Junior developers often don’t know how to separate the wheat from the chaff, so links to articles of special engineering interest, the blogs of notable luminaries, essential books, etc. are all worth capturing. Part of the rationale that drives a solution comes from the ambient knowledge of its team members, and therefore knowing what shapes them can help you understand what has shaped the application.

Summary

Contrary to what you might have heard, not all documentation is waste. In a stable team where there is plenty of opportunity to learn from the code and by pairing you can get away with less, but there are still going to be things you want to remember or share with your future team mates.

Don’t set out to write War & Peace – grow it organically. If you’re not sure if it’s worth sharing it’s better to stick a crude one-liner with a simple note on a page, which someone else can then embellish and refactor, than to never do it at all. Keep your investment small until you get a feel for where the true value lies.

One of the tasks I like to include as part of the “Definition of Done” is around documenting anything that might be of interest. This means that writing documentation becomes just another part of the development of a story, like writing tests. Most of the time there is nothing to be done but at least documentation then becomes part of the conversation around every feature.

Finally, don’t worry if some people don’t seem to read what you write; it’s unlikely that it’s undiscoverable or abhorrent – it’s probably just that they don’t work that way. If whatever you’ve written is valuable to you then that’s a good start; hopefully your current and future team mates will also find value in it at some point when they go looking for answers [4].

 

[1] For some reason everyone wants to be a Markdown editor, but nobody wants to be a simple viewer and sadly the Chrome extensions aren’t friction-free either.

[2] If it’s a longer conversation then a whiteboard session is probably more suitable than staring at a static image.

[3] I once wrote a guide called “Efficiently Merging ClearCase Branches” as developers kept getting the order wrong around bumping the label in the config spec and merging the changes. This created huge amounts of noise in the repo through unnecessary versions.

[4] On more than one occasion it wasn’t until after I had left a client and then met up for drinks sometime later that I got thanked for leaving behind useful notes. Hearing that kind of feedback makes it all worthwhile.

Thursday 21 July 2016

Developers Can Be Their Own Worst Enemy

Generally speaking I reckon we programmers are an optimistic bunch. By-and-large we also want to do a good job and take great delight in seeing our work unleashed on our intended audience. I also believe we do our best to understand the constraints of the business and work within them when delivering our solution. Like any normal person we also want to be acknowledged and respected. But we also like to tinker.

A New Age of Transparency

In past times, when the mentality was more Waterfall-esque, it was much harder to get late ideas in, and so the consequence was that developers might shoehorn one feature in on the back of another. This saved on paperwork and also gave us a warm fuzzy feeling that we were in control of the product’s destiny.

But it’s no longer like that; at least, it’s been a long time since I’ve worked anywhere near such a rigid framework. The modern development cycle means that the next sprint planning meeting is only a week or two away. The product owner, who should now feel that they have a good idea what is going on day-to-day, is far more amenable to localised plan adjustments [1], so the notion of widening the scope slightly or even squeezing another story in earlier is only a conversation away. Even if it doesn’t get the go-ahead immediately there’s a backlog that anyone can add features to, and all the stakeholders (which includes the development team) can provide input to determine the ordering.

Maybe though that’s just not enough for some people. There are developers that still feel the need to act on highly speculative requirements. They are so very sure that we are going to need to do “X” that they will start coding the feature up straight away. After all, they have the engine open on the workbench so it’s got to be more efficient this way, right?

At Number 5 in his “11 Agile Myths and 2 Truths” Allan Kelly has:

Developers get to do what they like

The truth hurts, and much as I wish it wasn’t this way, Allan Kelly is right: we cannot always be trusted. The years have taken their toll and until we can show that we are able to be grown-ups we must accept that we still need to hold somebody’s hand all the way down to the sweet shop.

Continuous Learning

After 20 years writing software professionally you would probably have hoped that I would know what I’m doing. Perhaps if I took a different path than the Journeyman route I could easily delude myself that I now know the one true way and therefore I will know how every solution is going to end up. As such, even though I might not know the exact problem domain surely I must know all the other stuff like architecture, design, testing, debugging, logging, monitoring, documenting, etc.?

This line of reasoning is probably what gets us into hot water. In the back of our mind we believe that because we’ve needed X before we should have X again. Not only that, but we should also do it the same way we did X last time. And why shouldn’t we? After all, isn’t that what they’re paying us for – our experience?

While this might be true, they are also probably paying for us not to reinvent the wheel and to make efficiency gains by adopting modern tooling and practices. Our experience of knowing how to solve problems and where to look for help is probably more valuable than the specific solution knowledge we have. For example, the very brief time I spent writing Eiffel 20 years ago has never been of direct use, but the knowledge of Design by Contract has been generally useful. Less than a decade ago I was still writing manual “for” loops and now it’s hardly ever. If such a basic programming construct can change, why wouldn’t I assume everything else will too, eventually?

Do or Do Not

To be clear this is not about being given the solution on a plate and simply following it, definitely not. As technologists the solution domain is our space to shape and mould as we see fit. We are masters of that domain, but everything we do in it must ultimately be to serve the greater good of the business. We have plenty of room to make our mark without trying to play the hero or getting “one up” on those “technically illiterate business people”.

It’s also not about stifling innovation either. Good ideas need to come from all areas of the organisation, not just the business side. In fact having an appreciation of the technical constraints probably means we’ll filter out some of the more challenging ideas before presenting them on the grounds that we suspect they’ll never fly. This is equally undesirable because we don’t want to miss opportunities either.

No, what I’m talking about here is the half-baked features that you come across when trying to implement another change. Sometimes it’s just a stub or bit of logic left as a placeholder for an as yet unimplemented feature. Or maybe it’s some behaviour that the author thought might be useful but didn’t rip out again. And guess what: there are no tests, and the story it’s checked in against is for something else. Now you’re into software archaeology mode, trying to work out how this relates to what you thought the code should be doing. There aren’t many developers I know who have the confidence to just rip stuff like this out, so it often lives for far longer than you’d hope.

The reason is that all these little bits of speculative feature are just waste. For a start, while you were off-piste the feature that should have been delivered was delayed, which means it’s costing money (i.e. it’s inventory) rather than generating income. When it interferes with the delivery of any other feature by becoming a distraction it once again causes delay. And if someone keeps having to ask, or work out, why something lying around exists, the cost slowly mounts up.

Have Faith in the Process

You’re probably thinking that I’m being somewhat overly melodramatic, and in a way I am. However “death by a thousand cuts” starts with a single cut, and the cuts keep coming. By not keeping them under control (or, better yet, removing them) the problem goes unchecked. When you’ve worked on a clean codebase, where you don’t keep having to second guess what the code says and does, you can get into a rhythm that allows new changes to be delivered really quickly. XP talks about having courage, and a codebase with no superfluous crap goes towards giving you that courage.

Our best work happens when we get to practice software engineering. Whilst it’s our job to trade-off time and space so that we provide the best solution we can whilst keeping costs down, we should do so with the combined efforts of those we work with, both on the technical and business sides. It’s often hard to associate a monetary value with what we do, but doing that really helps you start thinking about your everyday actions in terms of the “cost of delay”.

So, the next time you’re tempted to just squeeze a little something else in, ask yourself whether the team would be better served by first finishing what’s already on your plate. If you really think it can’t wait then quickly canvas the opinion of a few stakeholders and/or the product owner to see whether it fits in with the current plan.

Better yet start pair programming. One of the best ways I’ve found to stay focused on the task at hand is by having someone else working with you. Whilst it’s entirely possible that the pair could wander off on a tangent I’ve found it extremely rare in practice. What normally happens is that one side begins to feel uncomfortable about the changing direction and a rapid exchange occurs to decide if this is really where they both want to go or whether it should wait for another day. It’s much easier to remain honest (if that’s what you need) when you have somebody else sat right next to you whose time you also have to account for.

 

[1] This is what your daily stand-up is for – reprioritisation.

Monday 18 July 2016

Simple Language, Simple Process

There was a change made to the Scrum Guide a few years ago around the language used to describe what the “most important” thing is. Previously it was okay to talk about “prioritising the backlog” or “what order we should tackle the work” as these terms were synonymous.

But not any longer it seems. Now the terms “priority” and “order” mean different things, i.e. the priority of the product backlog does not necessarily define what order we tackle the work. If this sounds odd to you then you are not alone. While in theory the distinction sounds useful, in practice I’ve only found it to create more friction.

Pointless Pedantry?

The reason for introducing the subtle distinction was outlined in “Ordered Not Prioritized” by James Coplien. Anyone who has worked in a team where there has been tension over how to order the backlog so that work is done efficiently, which might not be the same order that the business would like, will know where Jim is coming from.

Whilst there is nothing intrinsically hard to change about software, as it is after all essentially non-physical, there are still constraints, such as complexity, which dictate how easy it is to achieve. The non-technical folk also may not understand that there are still limitations in the tools and materials that we work with. For example one tool may make it easier to get something simple functionally complete quicker, but its inability to scale adequately may make it the wrong choice for anything other than a prototype.

Balancing the business and technical priorities is tricky. They want features quicker, but we want to keep the quality bar high to ensure the product is able to support continued change. This is not an all-or-nothing proposition. Aside from the initial stages of development, where we are mostly erecting the scaffolding that will support continuous delivery, the bulk of the work can be a mixture of the two.

I’ve been in the situation a few times where there have been attempts to resolve the business and technical priorities by splitting up the backlog. This just doesn’t work; no matter how you slice-and-dice it there can only be one top priority. Making the team decide whether they should pick something off the business or the technical backlog is not honouring the “negotiation” aspect that the Agile Manifesto promotes.

Hence, there almost certainly should be one, and only one, backlog to ensure that everyone on both sides of the fence (business and technical) understands what we’re building in the shorter term and what’s further down the line (subject to renegotiation at any point). Having a single stream of work makes it really easy to see at a glance what the current plan is.

It’s All Just Work

Whilst Scrum might be straining the language in terms of priority at least it keeps the units of work simple – items. Over the years there has been a proliferation of ways to describe ever larger chunks of stuff to do from tasks, to stories, to epics and eventually to themes.

I’ve overheard conversations where people get hung up on the relative sizes of each of these bundles in the hope that they will be able to estimate the entire project delivery by just summing up the various factors. This naturally leads to a proliferation of charts showing the burning up (or down) of each of these units to help maintain the illusion that we are in control of the project and its progress, and that we’ll meet whatever arbitrary deadline has been decided.

Breaking work down into small, manageable chunks is what matters. Each chunk of work should deliver some value, and by keeping them small we give ourselves room to manoeuvre and change direction (within limits) to meet the ever changing world around us. Just as we tend to talk at different levels of abstraction about the software architecture, so we can do the same with our features. This allows us to remain focused on, say, the bigger picture and not get unnecessarily bogged down in details which are superfluous at that level of granularity.

Many of the techniques around planning and estimation are largely exercises in getting you to break the problem down and identify risks to delivery. As long as the team is able to talk about the work and know at what “level” the current discussion is taking place, they should avoid getting distracted by arguments about whether any given feature is a task, story, epic, etc.

Back to Basics

When we cut away all the new terminology that has arisen over the last couple of decades we get back to a very simple development process. In essence we have a list of things that the business would like us to do, along with some other stuff that the implementing team also need to do, in order to ensure all this work happens efficiently.

That “to-do” list will contain work of varying size and complexity and both the business and technical sides will endeavour to break it down into small enough chunks that they can interleave them to ensure their respective needs are continually being met. The team will also periodically reflect on their working practice to help them discover even better ways of working.

And that’s pretty much it. Ultimately what we’re trying to achieve is very simple and yet the language often gets in the way and causes us to worry more about the process we’re following instead of what actually matters – creating working software.

Wednesday 13 July 2016

Alternate Way to Mock the Current Time in C#

My previous post “Mocking the Current Time in C#” raised a few objections due to its use of a Static Façade that mirrors the underlying DateTime’s interface. Despite also disliking Singleton-like behaviour in general, I very occasionally find it acceptable [1]. In this instance reading the clock has no side-effects and the previous mocking technique is delightfully simple, but it’s not the only way. I have also worked on teams which prefer not to hide any dependencies behind static interfaces, and in that situation I’ll adopt a more traditional mocking approach.

The Clock Object

The first thing to define is an interface for our Clock object:

public interface IClock
{
  DateTime Now { get; }
}

Now we can define an implementation for the real clock that we’ll use in production:

public class RealTimeClock : IClock
{
  public DateTime Now
  {
    get { return DateTime.Now; }
  }
}

So now, instead of hitting the hidden Clock façade, we’ll make our dependency more explicit by passing it via the constructor [2]:

public class DateCalculator
{
  public DateCalculator(IClock clock)
  {
    _clock = clock;
  }

  public int AgeToday(DateTime birthday)
  {
    . . .
  }

  private readonly IClock _clock;
}
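
The body of AgeToday is elided above, but a minimal sketch of what it could look like (my own stab, consistent with the test further down) is:

public int AgeToday(DateTime birthday)
{
  // Whole years between the birthday and the clock's idea of
  // "today", minus one if the birthday is still to come this year.
  var today = _clock.Now.Date;
  var age = today.Year - birthday.Year;

  if (birthday.Date > today.AddYears(-age))
    --age;

  return age;
}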

Now we can make use of our clock in the real-world like this:

var clock = new RealTimeClock();
var calculator = new DateCalculator(clock);
. . .
var age = calculator.AgeToday(my.Birthday);

We can either hold onto the clock and calculator objects and reuse them (as they’re stateless) or recreate them each time. It’s likely that we won’t create them locally but will pass them down from somewhere above (as per “The PFA Papers”).

The Mock Object

Mocking the clock to test our date calculator is now as simple as writing any other test. In this example I’m defining a manual mock, but you may prefer to use a mocking framework instead [3].

public class FixedTimeClock : IClock
{
  public FixedTimeClock(DateTime time)
  {
    _time = time;
  }

  public DateTime Now
  {
    get { return _time; }
  }

  private readonly DateTime _time;
}

[Test]
public void age_is_number_of_years_since_birthday()
{
  var today = new DateTime(2016, 6, 1);
  var clock = new FixedTimeClock(today);
  var calculator = new DateCalculator(clock);
  var birthday = new DateTime(2000, 1, 1);

  var age = calculator.AgeToday(birthday);

  Assert.That(age, Is.EqualTo(16));
}
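
If you’d rather lean on a mocking framework [3] than hand-roll one, the arrangement step with, say, Moq (my choice here, purely for illustration) might look like this:

// Requires the Moq NuGet package (using Moq;).
var clock = new Mock<IClock>();
clock.Setup(c => c.Now).Returns(new DateTime(2016, 6, 1));

var calculator = new DateCalculator(clock.Object);

The rest of the test remains exactly the same.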

Defaulting the Clock

The purpose of doing this was to make the clock an externally controllable dependency, which is why we made it an explicit argument to the DateCalculator constructor. However, despite the protestations about not using a static façade, I’ve then seen developers happily overload the constructor so that the real clock doesn’t have to be passed around everywhere in production code:

public class DateCalculator
{
  // Only used in production
  public DateCalculator()
    : this(new RealTimeClock())
  { }

  // Only used by unit tests
  public DateCalculator(IClock clock)
  { . . . }

  public int AgeToday(DateTime birthday)
  { . . . }

  private readonly IClock _clock;
}

Another variation on this is to make the constructor that takes an explicit clock “internal” to show that it’s essentially just used for testing. At this point I have to ask again what the original objection to the Static Façade approach was, because we’ve just hidden the dependency again – it’s now only visible in unit tests – which is effectively no different to the previous approach. If you prefer to be explicit then it should be to make the production code, not test code, more transparent.

 

[1] So maybe I was off-by-one in “The Application-Wide Diagnostic Log - the One Acceptable Global Object” and there are really two acceptable global objects :o).

[2] Sorry but I can’t bring myself to use that sensationalist term “DI”, see “Terminology Overdose”.

[3] Just be aware of how they behave when you don’t configure them properly, see “Mocking Framework Expectations”.

Thursday 7 July 2016

Mocking the Current Time in C#

In my recent post “Testing 101 - A + B = C” I mentioned mocking out calls to do with the current date & time to bring some determinism to those sorts of tests. I find this problem comes up any time you want to write tests around something that expires, or where you might be doing date arithmetic such as calculating someone’s current age.

I touched on my preferred solution somewhat tangentially back in 2011 when talking about a similar unit testing problem to do with the file-system (see “Unit Testing File-System Dependent Code”). However I decided it was worth writing it up explicitly as I’ve had a few conversations about it in the past and never been able to quickly point to something concrete.

A Level of Indirection

They say every problem in computer science can be solved with an extra level of indirection and that’s exactly what we need to do here. So, instead of using the DateTime.Now property everywhere:

var now = DateTime.Now;

…we instead use our own version:

var now = Clock.Now;

This might seem quite invasive, and in a sense it is, but mocking static methods has historically involved specialist tools. If you just accept that you’re going to need to do this (and I’m sure you will if dates feature anywhere) you can do it from the beginning in a new project and it costs very little. Even retrofitting it later is easy as you can do it piecemeal if needs be – at least to code that you own, that is.

So, the default implementation can just chain onto the original property:

public static class Clock
{
  public static DateTime Now
  {
    get { return DateTime.Now; }
  }
}

With the abstraction in place we are now free to re-implement it so that we can replace the underlying behaviour to do whatever we want, such as return a fixed value in a test.

First we make the underlying function configurable by introducing another property to hold a delegate that we’ll invoke when Now is called [1]:

public static DateTime Now
{
  get { return NowImpl(); }
}

internal static Func<DateTime> NowImpl = () =>
{
  return DateTime.Now;
};

Naturally the default implementation does the right thing so that the production code will work out-of-the-box. However in a test we can now override it:

[Test]
public void the_current_time_is_always_fixed()
{
  var millenium = new DateTime(2000, 1, 1);

  Clock.NowImpl = () =>
  {
    return millenium;
  };

  Assert.That(Clock.Now, Is.EqualTo(millenium));
}

And that’s pretty much it. Normally I’ll add a Reset() method that can be called in the [TearDown] handler to put back the real clock just to keep things tidy and avoid a fixed date leaking out into another test.
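
That Reset() needs to do nothing more than put the original delegate back, along these lines:

internal static void Reset()
{
  // Restore the real clock so a fixed date cannot leak
  // from one test into the next.
  NowImpl = () => DateTime.Now;
}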

Another Level of Indirection

You probably noticed that I marked the property that injects the mock function “internal” so that it has limited scope. I normally use the [InternalsVisibleTo] attribute to grant access to the test assembly. In the scenario where I can’t do that, such as when the Clock class resides in a shared assembly that is non-project specific (i.e. I can’t use [InternalsVisibleTo] to name an unknown assembly) I resort to a different trick.

Whilst I could give the NowImpl property a really long-winded name to discourage a developer from using it by accident, I prefer to keep the implementation private and then add a separate public Test API class to grant the necessary access. In C# a nested class has access to its parent’s internals, so I can do this instead:

private static Func<DateTime> NowImpl = () =>
{
  return DateTime.Now;
};

public static class TestApi
{
  public static Func<DateTime> Now
  {
    set { NowImpl = value; }
  }

  public static void Reset()
  {
    NowImpl = () => { return DateTime.Now; };
  }
}

This doesn’t really grant any more protection to the Clock internals but it does mean that IntelliSense won’t show up the other test methods unless you navigate into the TestApi class which feels a little cleaner.

Consequently the test would now look like this:

[TearDown]
public void ResetClock()
{
  Clock.TestApi.Reset();
}

[Test]
public void the_current_time_is_always_fixed()
{
  var millenium = new DateTime(2000, 1, 1);

  Clock.TestApi.Now = () =>
  {
    return millenium;
  };

  Assert.That(Clock.Now, Is.EqualTo(millenium));
}

 

[1] I could have collapsed the lambda down to a simple method group, but I prefer to make lambdas look like actual method bodies when explaining things. ReSharper will happily provide the relevant hint when you do it in real life.

Don’t Use Side-Effects to Verify Behaviour

I’m sure no one goes out of their way to do it, but it’s not uncommon to find tests that verify behaviour by relying on a side-effect of the operation rather than sensing the actual operation itself. It’s easy to see how the rationale probably goes:

  • I need to test that an operation has happened.
  • The code under test logs when the operation occurs.
  • So, I can use the presence of the log message as proof.

Here is the kind of test that results from applying that logic:

[Test]
public void manually_refreshing_requests_an_update()
{
  var cache = new Cache();
  Assume.That(Log.Messages.Count, Is.EqualTo(0));

  cache.Refresh();

  Assert.That(Log.Messages.Count, Is.EqualTo(1));
  Assert.That(Log.Messages[0].Text,
              Contains("cache updated"));
}

This example has come up numerous times in real life and is a prime example of the confusion between correlation and causation. In essence the test is saying that the HTTP request was sent because the log message was written, which is clearly nonsense.

The log message or event is probably highly coupled to the action that is under test and therefore is highly correlated with it, but that does not make it any more right to use it to sense behaviour. When developers complain about brittle tests this is exactly the kind of thing they dislike because a change in non-functional behaviour can cause a test for functional behaviour to break.

The test should make it abundantly clear what interaction is expected to be taking place. If the side-effect is behaviour that needs verifying too, then it should be done separately. This is another problem I often see with tests – one test verifying two independent behaviours. I wouldn’t want to suggest it’s down to laziness per se, but “killing two birds with one stone” might seem the easier path to take in the short term.

Consequently I’d end up with two different tests – one to sense the interaction with the HTTP client and another to describe the side-effect.

[Test]
public void manually_refreshing_requests_an_update()
{
  var httpClient = new MockHttpClient();
  var cache = new Cache(httpClient);

  cache.Refresh();

  Assert.That(httpClient.Requests.Count,
              Is.EqualTo(1));
}

[Test]
public void an_event_is_logged_when_the_cache_updates()
{
  var cache = new Cache();
  Assume.That(Log.Messages.Count, Is.EqualTo(0));

  cache.Refresh();

  Assert.That(Log.Messages.Count, Is.EqualTo(1));
  Assert.That(Log.Messages[0].Text,
              Contains("cache updated"));
}

You may notice that the original test made no reference to the HTTP Client, which may be because it’s a singleton mocked out elsewhere. As we didn’t interact with it directly in the test it wasn’t mentioned, but in my rewritten test I’ve brought the dependency front-and-centre to make it obvious what the interaction is with.
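
MockHttpClient here is just a minimal hand-rolled sketch (with IHttpClient, Request and Response standing in for whatever abstraction the Cache really uses); all it needs to do is record the requests it’s given:

public class MockHttpClient : IHttpClient
{
  public List<Request> Requests = new List<Request>();

  public Response Send(Request request)
  {
    // Capture the request so the test can sense the interaction.
    Requests.Add(request);
    return new Response();
  }
}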

The second one is just the original test but with a better name to more accurately describe the behaviour we are interested in. Now we can change the behaviours a little more easily. Yes, the second is still coupled to the behaviour in the first, but the first is no longer tied to the behaviour in the second.

Testing 101 - A + B = C

When writing production code the functions with the lowest coupling, and which are easiest to reason about, are (generally speaking) pure functions. They only depend on their inputs and have no side-effects, which means that given the same inputs you should always get the same output. But this doesn’t just have to apply to production code; in fact it needs to apply to test code and the environment too.

I’m sure a proper mathematician will snarl at my simple analogy but the equation “A + B = C” is how I expect tests to behave. That is to say: given the same inputs and behaviours I expect to get the same output. Always. In essence I expect the tests to be as dependable as simple arithmetic.

This simple premise forms the bedrock on which all tests sit because you cannot (easily) make progress on your software project if there is a lot of noise going on around you. Every time a test fails for seemingly innocuous reasons you will be distracted from your goal. Even if you choose to completely ignore the failing tests your build pipeline will stall every time it happens and so you will still be distracted even just to keep re-starting the build! In my experience it pays to cure the disease rather than just keep treating the symptoms.

Removing sources of non-determinism in tests is definitely one problem I find myself tackling fairly regularly. Trying to ensure that A and B are the only inputs into the test, along with ensuring that A and B are always the same, is a common source of noise. In particular code and tests where date/times or concurrency is involved is likely to suffer from random distracting failures.

The other aspect of testing that this little mathematical equation is intended to convey is simplicity. The number of variables is consciously small and there is only a single operator which provides the point of focus. While you might think that this implies a discussion about low-level unit tests, it can equally apply to other forms of automated tests too. Where A and B might be simple values in a unit test they might be a single, larger compound value in an acceptance test. Similarly the operation may be just a single method in the former case and a RESTful API call in the latter. The increase in size and complexity can be offset (to a certain degree) by application of the Separation of Concerns for the scenario.

Environment

One of the first invariants to consider is the environment. It helps if you can keep it as stable as possible across test runs – if you keep switching the version of the toolchain and dependencies on each test run, how will you know whether a failure is down to your change or something shifting in the background?

Obviously it’s not normally that blatant but there are common differences that I see developers keep getting tripped up on. One is the build server using a fresh checkout versus the developer’s workstation which often has piles of detritus building up because they don’t realise what their IDE leaves behind (see “Cleaning the Workspace”). Another is using a different, “compatible” test runner from within the IDE whilst the build script uses the real packaged command line tool. If you don’t run the same build script before publishing your changes then you’re faced with all the differences between the build server and workstation to consider when it breaks.

Many development practices have changed over time and the notion of a stable working environment is one that is now being heavily challenged. The use of cloud based build services like Travis and AppVeyor means you may have much less control over all the specific details of the build environment. Also the pace at which 3rd party packages change, such as those from NuGet or NPM, along with the number we now depend on, means that any two builds may have used different dependencies. Sitting on the bleeding edge comes at a cost so be sure you know what you’re getting yourself into.

The analogy I favour most when it comes to changing any invariants such as the build environment is to follow the advice of the rock climber – only move one thing at a time. So I’m told, a rock climber always keeps at least 3 limbs attached to the rock face and therefore only ever moves one at a time. When the moved limb is secure, only then do they move another one. I’m sure that experienced rock climbers do break the rules, sometimes, but they do it knowing what the full consequences are.

Avoiding Non-Determinism In Tests

With our environment behaving in a predictable manner we have reduced the potential for noise so that the first port of call for our test failures is the production code or test itself. Unless you are writing something inherently non-deterministic, like a true random number generator, then it should be possible to control the production code in a way that allows you to test it in a well-defined way.

This isn’t always easy, but it starts by not trying to test behaviours which are inherently unreliable, such as waiting for a fixed amount of time to pass, or an operation to run on a background thread. Whilst these operations may complete fairly quickly on your super-fast developer workstation, your build server will likely be a VM sharing a hugely overloaded box and so CPU cycles are somewhat scarcer.

My preferred method of dealing with such behaviours is either to mock system calls, such as getting the current time, or to use synchronisation objects like manual events, semaphores and countdown events to sense the transitions through mocks. If I’m scheduling work onto other threads I might mock the scheduler (or dispatch function) with one that just runs the task directly on the same thread so that I know for sure it will complete before the method returns.
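
To illustrate that last point, here is a sketch (names invented for the example) of a scheduler abstraction whose test double dispatches the task synchronously:

public interface ITaskScheduler
{
  void Schedule(Action task);
}

// The production implementation queues work onto the thread pool.
public class ThreadPoolScheduler : ITaskScheduler
{
  public void Schedule(Action task)
  {
    ThreadPool.QueueUserWorkItem(_ => task());
  }
}

// The test double runs the task on the caller's thread, so the
// test knows it has completed before Schedule() returns.
public class SynchronousScheduler : ITaskScheduler
{
  public void Schedule(Action task)
  {
    task();
  }
}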

The mistake I think many developers make is that they believe it’s possible to test many concurrent behaviours reliably. Whilst you might be able to prove the presence of a deadlock by testing continuously for a period of time, you cannot prove the absence of one that way (Dijkstra taught us that) [1]. The cost is longer running tests, and therefore a greater chance that they will be run less often. Consequently I prefer to mostly write tests that characterise the concurrent behaviour, rather than attempt to prove the concurrency aspect is correct. That I leave for testing in an environment which is much better suited to the task.

Just recently I came across a test that failed one out of every ten attempts, but when I added another test the failure rate went right up. The reason the original test failed at all was down to the spinning up of a background thread to do a cache refresh, which occasionally took longer than the test was prepared to wait. Waiting an arbitrary amount of time is generally a big no-no; you should try to synchronise with the thread somehow. In this instance I could do that through the outbound API mock which the operation was invoking.
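
As a sketch of that idea (again with invented names – ICacheSource, Data and Cache stand in for the real types), the mock can signal an event which the test then waits on, instead of sleeping:

public class SignallingSourceMock : ICacheSource
{
  // Set when the background refresh finally calls into us.
  public readonly ManualResetEvent Fetched =
                           new ManualResetEvent(false);

  public Data FetchLatest()
  {
    Fetched.Set();
    return new Data();
  }
}

[Test]
public void refreshing_fetches_in_the_background()
{
  var source = new SignallingSourceMock();
  var cache = new Cache(source);

  cache.Refresh();

  // Block on the event with a timeout rather than sleeping for
  // an arbitrary period and hoping the thread has run.
  Assert.That(source.Fetched.WaitOne(TimeSpan.FromSeconds(5)),
              Is.True);
}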

The reason the failure rate went up was due to the way that NUnit runs tests alphabetically. My new test came “after” the previous one, which never cleaned up the background thread, and so it could fire a refresh again when my test was running. This caused multiple refreshes to be registered instead of zero. The answer was to retain ownership of the background thread and ensure it terminated by the time the test completed.

Does this add complexity to the production code? Yes. But, as is often the case when writing tests properly, it uncovers a number of questions about the expected behaviour that might not be obvious from quickly knocking something up.

Property Based Testing

One of the problems with example based testing is that, by definition, you only run through the scenarios laid out with specific examples. If you miss one, e.g. testing for a leap year, you won’t find the bug. Property based testing takes a slightly different approach that attempts to run through a variety of scenarios made up on-the-fly by a generator.

This might seem to go against the earlier advice, but it’s not, because as long as you know the seed for the test generator you can reproduce the scenario exactly. If you can explore the entire problem space on each test run then do, but unless it’s a trivial feature that’s unlikely. Hence you use a generator to create inputs that explore a (different) part of it each time. Consequently you still might not unearth that leap year bug in time, but you stand a better chance of finding it, and other unanticipated problems, too.
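
Tools like FsCheck will do the generation for you, but the principle can be shown with nothing more than a seeded Random. This sketch (mine, reusing the FixedTimeClock and DateCalculator from the mocking posts above) checks a simple property – an age is never negative:

[Test]
public void age_is_never_negative_for_a_birthday_in_the_past()
{
  // Fix the seed so any failure is reproducible, and report it
  // on failure so the offending scenario can be replayed exactly.
  const int seed = 1;
  var random = new Random(seed);
  var today = new DateTime(2016, 6, 1);
  var calculator = new DateCalculator(new FixedTimeClock(today));

  for (var i = 0; i != 100; ++i)
  {
    var birthday = today.AddDays(-random.Next(0, 36500));

    var age = calculator.AgeToday(birthday);

    Assert.That(age, Is.GreaterThanOrEqualTo(0),
                "seed=" + seed + " birthday=" + birthday);
  }
}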

Wrap-Up

Not every test we write will be as simple and predictable as my little maths equation suggests, but that doesn’t mean we shouldn’t try to get as close to it as possible. Some of the unit tests I once wrote for a message bus connection pool definitely weren’t easy to understand, despite my best efforts to keep them clear, but hopefully they are very much in the minority.

We all make mistakes too, so we might not get it right first time. The unit tests I wrote for logging garbage collections ran fine for over a year before I discovered one of them was leaking slightly and causing another (very intermittent) random failure. But I managed to fix it because I believe a zero tolerance approach to test failures pays dividends in the long run.

Good tests are hard to write, but they should be treated as first class citizens, just like your production code. If they aren’t as simple as “A + B = C” then consider it a test smell and see if there is some unnecessary complexity that can be factored out.

[1] It might be possible to verify its behaviour through other means, such as induction, but that’s something I personally know far too little about to use effectively.