Friday 25 May 2018

Test the Code, Not the Mock

About 18 months or so ago I wrote a post about how I’d seen tests written that were self-reinforcing (“Tautologies in Tests”). The premise was about the use of the same production code to verify the test outcome as that which was supposedly under test. As such any break in the production code would likely not get picked up because the test behaviour would naturally change too.

It’s also possible to see the opposite kind of effect where the test code really becomes the behaviour under test rather than the production code. The use of mocking within tests is a magnet for this kind of situation as a developer mistakenly believes they can save time [1] by writing a more fully featured mock [2] that can be reused across tests. This is a false economy.

Example - Database Querying

I recently saw an example of this in some database access code. The client code (under test) first configured a filter where it calculated an upper and lower bound based on timestamps, e.g.

// non-trivial time based calculations
var minTime = ...
var maxTime = ...

query.Filter[“MinTime”] = minTime;  
query.Filter[“MaxTime”] = maxTime;

The client code then executed the query and performed some additional processing on the results which were finally returned.

The test fixture created some test data in the form of a simple list with a couple of items, presumably with one that lies inside the filter and another that lies outside, e.g.

var orders = new[]
{
  new Order { ..., Timestamp = “2016-05-12 18:00:00” },
  new Order { ..., Timestamp = “2018-05-17 02:15:00” },
};

The mocked out database read method then implemented a proper filter to apply the various criteria to the list of test data, e.g.

{
  var result = orders;

  if (filter[“MinTime”])
    ...
  if (filter[“MaxTime”])
    ...
  if (filter[...])
    ...

  return result;
}

As you can imagine this starts out quite simple for the first test case but as the production code behaviour gets more complex, so does the mock and the test data. Adding new test data to cater for the new scenarios will likely break the existing tests as they all share a single set and therefore you will need to go back and understand them to ensure the test still exercises the behaviour it used to. Ultimately you’re starting to test whether can actually implement a mock that satisfies all the tests rather than write individual tests which independently validate the expected behaviours.

Shared test data (not just placeholder constants like AnyCustomerId) is rarely a good idea as it’s often not obvious which piece of data is relevant to which test. The moment you start adding comments to annotate the test data you have truly lost sight of the goal. Tests are not just about verifying behaviour either they are a form of documentation too.

Roll Back

If we reconsider the feature under test we can see that there are a few different behaviours that we want to explore:

  • Is the filter correctly formed?
  • Are the query results correctly post-processed?

Luckily the external dependency (i.e. the mock) provides us with a seam which allows us to directly verify the filter configuration and also to control the results which are returned for post-processing. Consequently rather than having one test that tries to do everything, or a few tests that try and cover both aspect together we can separate them out, perhaps even into separate test fixtures based around the different themes, e.g.

public static class reading_orders 
{
  [TestFixture]
  public class filter_configuration    
  ...    
  [TestFixture]
  public class post_processing    
  ...
}

The first test fixture now focuses on the logic used to build the underlying query filter by asserting the filter state when presented to the database. It then returns, say, an empty result set as we wish to ignore what happens later (by invoking as little code as possible to avoid false positives).

The following example attempts to define what “yesterday” means in terms of filtering:

[Test]
public void filter_for_yesterday_is_midnight_to_midnight()
{
  DateTime? minTime = null;
  DateTime? maxTime = null;

  var mockDatabase = CreateMockDatabase((filter) =>
  {
    minTime = filter[“MinTime”];
    maxTime = filter[“MaxTime”];
  });
  var reader = new OrderReader(mockDatabase);
  var now = new DateTime(2001, 2, 3, 9, 32, 47);

  reader.FindYesterdaysOrders(now);

  Assert.That(minTime, Is.EqualTo(
                new DateTime(2001, 2, 2, 0, 0, 0)));
  Assert.That(maxTime, Is.EqualTo(
                new DateTime(2001, 2, 3, 0, 0, 0)));
}

As you can hopefully see the mock in this test is only configured to extract the filter state which we then verify later. The mock configuration is done inside the test to make it clear that the only point of interest is the the filter’s eventual state. We don’t even bother capturing the final output as it’s superfluous to this test.

If we had a number of tests to write which all did the same mock configuration we could extract it into a common [SetUp] method, but only if we’ve already grouped the tests into separate fixtures which all focus on exactly the same underlying behaviour. The Single Responsibility Principle applies to the design of tests as much as it does the production code.

One different approach here might be to use the filter object itself as a seam and sense the calls into that instead. Personally I’m very wary of getting too specific about how an outcome is achieved. Way back in 2011 I wrote “Mock To Test the Outcome, Not the Implementation” which showed where this rabbit hole can lead, i.e. to brittle tests that focus too much on the “how” and not enough on the “what”.

Mock Results

With the filtering side taken care of we’re now in a position to look at the post-processing of the results. Once again we only want code and data that is salient to our test and as long as the post-processing is largely independent of the filtering logic we can pass in any inputs we like and focus on the final output instead:

[Test]
public void upgrade_objects_to_latest_schema_version()
{
  var anyTime = DateTime.Now;
  var mockDatabase = CreateMockDatabase(() =>
  {
    return new[]
    {
      new Order { ..., Version = 1, ... },
      new Order { ..., Version = 2, ... },
    }
  });
  var reader = new OrderReader(mockDatabase);

  var orders = reader.FindYesterdaysOrders(anyTime);

  Assert.That(orders.Count, Is.EqualTo(2));
  Assert.That(orders.Count(o => o.Version == 3),
              Is.EqualTo(2));
}

Our (simplistic) post-processing example here ensures that all re-hydrated objects have been upgraded to the latest schema version. Our test data is specific to verifying that one outcome. If we expect other processing to occur we use different data more suitable to that scenario and only use it in that test. Of course in reality we’ll probably have a set of “builders” that we’ll use across tests to reduce the burden of creating and maintaining test data objects as the data models grow over time.

Refactoring

While reading this post you may have noticed that certain things have been suggested, such as splitting out the tests into separate fixtures. You may have also noticed that I discovered “independence” between the pre and post phases of the method around the dependency being mocked which allows us to simplify our test setup in some cases.

Your reaction to all this may well be to suggest refactoring the method by splitting it into two separate pieces which can then be tested independently. The current method then just becomes a simple composition of the two new pieces. Additionally you might have realised that the simplified test setup probably implies unnecessary coupling between the two pieces of code.

For me those kind of thoughts are the reason why I spend so much effort on trying to write good tests; it’s the essence of Test Driven Design.

 

[1] My ACCU 2017 talk “A Test of Strength” (shorter version) shows my own misguided attempts to optimise the writing of tests.

[2] There is a place for “heavier” mocks (which I still need to write up) but it’s not in unit tests.

Thursday 17 May 2018

It Compiles, Ship It!

The method was pretty simple and a fairly bog standard affair, it just attempted to look something up in a map and return the associated result, e.g.

public string LookupName(string key)
{
  string name;

  if (!customers.TryGetValue(key, out name)
    throw new Exception(“Customer not found”);

  return name;
}

The use of an exception here to signal failure implied to me that this really shouldn’t happen in practice unless the data structure is screwed up or some input validation was missed further upstream. Either way you know (from looking at the implementation) that the outcome of calling the method is either the value you’re after or an exception will be thrown.

So I was more than a little surprised when I saw the implementation of the method suddenly change to this:

public string LookupName(string key)
{
  string name;

  if (!customers.TryGetValue(key, out name)
    return null;

  return name;
}

The method no longer threw an exception on failure it now returned a null string reference.

This wouldn’t be quite so surprising if all the call sites that used this method had also been fixed-up to account for this change in behaviour. In fact what initially piqued my interest wasn’t that this method had changed (although we’ll see in a moment that it could have been expressed better) but how the calling logic would have changed.

Wishful Thinking

I always approach a change from a position of uncertainty. I’m invariably wrong or have something to learn, either from a patterns perspective or a business logic one. Hence my initial assumption was that I now needed to think differently about what happens when I need to “lookup a name” and that lookup fails. Where before it was truly exceptional and should never occur in practice (perhaps indicating a bug somewhere else) it’s now more likely and something to be formally considered, and resolving the failure needs to be handled on a case-by-case basis.

Of course that wasn’t the case at all. The method had been changed to return a null reference because it was now an implementation detail of another new method which didn’t want to use catching an exception for flow control. Instead they now simply check for null and act accordingly.

As none of the original call sites had been changed to handle the new semantics a rich exception thrown early had now been traded for (at best) a NullReferenceException later or (worse case) no error at all and an incorrect result calculated based on bad input data [1].

The TryXxx Pattern

Coming back to reality it’s easy to see that what the author really wanted here was another method that allowed them to attempt a lookup on a name, knowing that in their scenario it could possibly fail but that’s okay because they have a back-up plan. In C# this is a very common pattern that looks like this:

public bool TryLookupName(string key, out string name)

Success or failure is indicated by the return value and the result of the lookup returned via the final argument. (Personally I’ve tended to favour using ref over out for the return value [2].)

The Optional Approach

While statically types languages are great at catching all sorts of type related errors at compile time they cannot catch problems when you smuggle optional reference-type values in languages like C# and Java by using a null reference. Any reference-type value in C# can inherently be null and therefore the compiler is at a loss to help you.

JetBrains’ ReSharper has some useful annotations which you can use to help their static analyser point out mistakes or elide unnecessary checks, but you have to add noisy attributes everywhere. However expressing your intent in code is the goal and it’s one valid and very useful approach.

Winding the clock into the future we have the new “optional reference” feature to look forward to in C# (currently in preview). Rather than bury their heads in the sand the C# designers have worked hard to try and right an old wrong and reduce the impact of Sir Tony Hoare’s billion dollar mistake by making null references type unsafe.

In the meantime, and for those of us working with older C# compilers, we still have the ability to invent our own generic Optional<> type that we can use instead. This is something I’ve been dragging into C# codebases for many years (whilst standing on my soapbox [3]) in an effort to tame at least one aspect of complexity. Using one of these would have changed the signature of the method in question to:

public Optional<string> LookupName(string key)

Now all the call sites would have failed to compile and the author would have been forced to address the effects of their change. (If there had been any tests you would have hoped they would have triggered the alarm too.)

Fix the Design, Not the Compiler

Either of these two approaches allows you to “lean on the compiler” and leverage the power of a statically typed language. This is a useful feature to have but only if it’s put to good use and you know where the limitations are in the language.

While I would like to think that people listen to the compiler I often don’t think they hear it [4]. Too often the compiler is treated as something to be placated, or negotiated with. For example if the Optional<string> approach had been taken the call sites would all have failed to compile. However this calling code:

var name = LookupName(key);

...could easily be “fixed” by simply doing this to silence the compiler:

var name = LookupName(key).Value;

For my own Optional<> type we’d just have switched from a possible NullReferenceException on lookup failure to an InvalidOperationException. Granted this is better as we have at least avoided the chance of the null reference silently making its way further down the path but it doesn’t feel like we’ve addressed the underlying problem (if indeed there has even been a change in the way we should treat lookup failures).

Embracing Change

While the Optional<> approach is perhaps more composable the TryXxx pattern is more invasive and that probably has value in itself. Changing the signature and breaking compilation is supposed to put a speed bump in your way so that you consider the effects of your potential actions. In this sense the more invasive the workaround the more you are challenged to solve the underlying tension with the design.

At least that’s the way I like to think about it but I’m afraid I’m probably just being naïve. The reality, I suspect, is that anyone who could make such a change as switching an exception for a null reference is more concerned with getting their change completed rather than stopping to ponder the wider effects of what any compiler might be trying to tell them.

 

[1] See Postel’s Law and  consider how well that worked out for HTML.

[2] See “Out vs Ref For TryXxx Style Methods”.

[3] C# already has a “Nullable” type for optional values so I find it odd that C# developers find the equivalent type for reference-type values so peculiar. Yes it’s not integrated into the language but I find it’s usually a disconnect at the conceptual level, not a syntactic one.

[4] A passing nod to the conversation between Woody Harrelson and Wesley Snipes discussing Jimi Hendrix in White Men Can’t Jump.

Friday 11 May 2018

The Perils of DateTime.Parse()

The error message was somewhat flummoxing, largely because it was so generic, but also because the data all came from a database extract rather than manual input:

Input string was not in a correct format.

Naturally I looked carefully at all the various decimal and date values as I knew this was the kind of message you get when parsing those kind of values when they’re incorrectly formed, but none of them appeared to be at fault. The DateTime error message is actually slightly different [1] but I’d forgotten that at the time and so I eyeballed the dates as well as decimal values just in case.

Then I remembered that empty string values also caused this error, but lo-and-behold I was not missing any optional decimals or dates in my table either. Time to hit the debugger and see what was going on here [2].

The Plot Thickens

I changed the settings for the FormatException error type to break on throw, sent in my data to the service, and waited for it to trip. It didn’t take long before the debugger fired into life and I could see that the code was trying to parse a decimal value as a double but the string value was “0100/04/01”, i.e. the 1st April in the year 100. WTF!

I immediately went back to my table and checked my data again, aware that a date like this would have stood out a mile first time around, but I was happy to assume that I could have missed it. This time I used some regular expressions just to be sure my eyes were not deceiving me.

The thing was I knew what column the parser thought the value was in but I didn’t entirely trust that I hadn’t mucked up the file structure and added or removed an errant comma in the CSV input file. I didn’t appear to have done that and so the value that appeared to be causing this problem was the decimal number “100.04”, but how?

None of this made any sense and so I decided to debug the client code, right from reading in the CSV data file through to sending it across the wire to the service, to see what was happening. The service was invoked via a fairly simple WCF client assembly and as I stepped into that code I came across a method called NormaliseDate()...

The Mist Clears

What this method did was to attempt to parse the input string value as a date and if it was successful it would rewrite it in an unusual (to me) “universal” format – YYYY/MM/DD [3].

The first two parsing attempts it did were very specific, i.e. it used DateTime.ParseExact() to match the intended output format and the “sane” local time format of DD/MM/YYYY. So far, so good.

However the third and last attempt, for whatever reason, just used DateTime.Parse() in its no-frills form and that was happy to take a decimal number like “100.04” and treat it as a date in the format YYY.MM! At first I wondered if it was treating it as a serial or OLE date of some kind but I think it’s just more liberal in its choice of separators than the author of our method intended [4].

Naturally there are no unit tests for this code or any type of regression test suite that shows what kind of scenarios this method was intended to support. Due to lack of knowledge around deployment and use in the wild of the client library I was forced to pad the values in the input file with trailing zeroes in the short term to workaround the issue, yuck! [5]

JSON Parsers

This isn’t the first time I’ve had a run-in with a date parser. When I was working on REST APIs I always got frustrated by how permissive the JSON parser would be in attempting to coerce a string value into a date (and time). All we ever wanted was to keep it simple and only allow ISO-8601 format timestamps in UTC unless there was a genuine need to support other formats.

Every time I started writing the acceptance tests though for timestamp validation I’d find that I could never quite configure the JSON parser to reject everything but the desired format. In the earlier days of my time with ASP.Net even getting it to stop accepting local times was a struggle and even caused us a problem as we discovered a US/UK date format confusion error which the parser was hiding from us.

In the end we resorted to creating our own Iso8601DateTime type which used the .Net DateTimeOffest type under the covers but effectively allowed us to use our own custom JSON serializer methods to only support the exact format we wanted.

More recently JSON.Net has gotten better at letting you control the format and parsing of dates but it’s still not perfect and there are unit tests in past codebases that show variants that would unexpectedly pass, despite using the strictest settings. I wouldn’t be surprised if our Iso8601DateTime type was still in use as I can only assume everyone else is far less pedantic about the validation of datetimes and those that are have taken a similar route to ensure they control parsing.

A Dangerous Game

One should not lose sight though of the real issue here which the attempt to classify string values by attempting to parse them. Even if you limit yourself to a single locale you might get away with it but when you try and do that across arbitrary locales you’re just asking for trouble.

 

[1] “String was not recognized as a valid DateTime.

[2] This whole fiasco falls squarely in the territory I’ve covered before in my Overload article “Terse Exception Messages”. Fixing this went to the top of my backlog, especially after I discovered it was a problem for our users too.

[3] Why they didn’t just pick THE universal format of ISO-8601 is anyone’s guess.

[4] I still need to go back and read the documentation for this method because it clearly caters for scenarios I just don’t normally see in my normal locale or user base.

[5] That’s what happens with tactical solutions, no one ever quite gets around to documenting anything because they never think it’ll survive for very long...