Way back in May 2012 I wrote a blog post titled “Beware the Complacency Unit Testing Brings”. This was a reaction to a malaise that I began to see developing as the team appeared to rely more heavily on the feedback it was getting from unit tests. This in turn appeared to cause some “trivial” bugs that should also have been picked up early, to be detected somewhat later.
This post looks at a couple of other examples I’ve seen in the past of problems that couldn’t have be solved by unit testing alone.
Unit Tests Are Self-Reinforcing
Myself and a colleague once had a slightly tortuous conversation with a project manager about our team’s approach to testing. There was a “suggestion” that as the organisation began to make more decisions based on the results of the system we had built, the more costly “a mistake” could become. We didn’t know where this was coming from but the undertone had a suggestion about it of “work harder”.
Our response was that if the business was worried about the potential for losses in millions due to a software bug, then they should have no problem funding a few tens of thousands of pounds of hardware to give us the tools we need to automate more testing. To us, if the risks were high, then the investment should be too, as this helps us to ensure we keep the risks down to a minimum. In essence we advocated working smarter, not harder.
His response was that unit tests should be fast and easy to run, and therefore he questioned why we needed any more hardware. What he failed to understand about unit testing was its self-reinforcing nature [1]. Unit tests are about a programmer verifying that the code they wrote works as they intended it to. What it fails to address is that it meets the demands of the customer. In the case of an API “that customer” is possibly just another developer on the same team providing another piece of the same jigsaw puzzle.
As if to prove a point this scenario was beautifully borne out not long after. Two developers working on either side of the same feature (front-end and back-end) both wrote their parts with a full suite of unit tests and pushed to the DEV environment only to discover it didn’t work. It took a non-trivial amount of time of the three of us (the two devs in question and myself) before I happened to notice that the name of the configuration setting which the front-end and back-end were using was slightly different. Each developer had created their own constant for the setting name, but the constant’s value was different and hence the back-end didn’t believe it was ever being provided.
This kind of integration problem is common. And we’re not talking about junior programmers here either, both were smart and very experienced developers. They were also both TDD-ers and it’s easy to see how this kind of problem occurs when your mind-set is focused around the simplest thing that could possibly work. We always look for the mistake in our most recent changes and both of them created the mismatched constant right back at the very beginning, hence it becomes “out of mind” by the time the problem is investigated [2].
Performance Tests
Unit tests are about verifying functional behaviour, so ensuring performance is not in scope at that point. I got a nice reminder of this not long afterwards when I refactored a stored procedure to remove some duplication, only to send performance through the roof. The SQL technique I used was “slightly” less performant (I later discovered) and it added something like another 100 ms to every call to the procedure.
Whilst all the SQL unit tests passed with flying colours in it’s usual timescale [3], when it was deployed into the test environment, the process it was part of nosedived. The extra 100 ms in the 100,000 calls [4] that the process made to the procedure started to add up and a 30 minute task now took over 8 hours!
Once again I was grateful to have “continuous” deployments to a DEV environment where this showed up right away so that I could easily diagnose and fix it. This just echoes what I wrote about recently in “Poor Performance of log4net Context Properties”.
A Balance
The current backlash against end-to-end testing is well justified as there are more efficient approaches you can take. But we must remember that unit testing is no panacea either. Last year we had these two competing views going head-to-head with each other: Why Most Unit Testing is Waste and Just Say No to More End-to-End Tests. It’s hard to know what to do.
As always the truth probably lies somewhere in between, and shifts either way depending on the kind of product, people and architecture you’re dealing with. The testing pyramid gets trotted out as the modern ideal but personally I’m still not convinced about how steep the sides of it should be for a monolith versus a micro-service, or a thick client versus a web API.
What I do know is that I find value in all different sorts of tests. One size never fits all.
[1] This is one of the things that pair and mob programming tackles because many eyes help make many kinds of mistakes less common.
[2] Yes, I could also go on about better collaboration and working outside in from a failing system test, but this didn’t deserve any massive post mortem.
[3] Database unit tests aren’t exactly speedy anyway so they increased the entire test suite time by an amount of time that could easily have been passed off as noise.
[4] Why was this a sequential operation? Let’s not go there...