Wednesday 11 July 2018

Support-Friendly Tooling

One of the techniques I briefly mentioned in my last post “Treat All Test Environments Like Production” was how constraining the test environments by adhering to the Principle of Least Privilege drove us to add diagnostic specific features to our services and tools.

In some cases that might be as simple as exposing some existing functionality through an extra command line verb or service endpoint. For example a common technique these days is to add a “version” verb or “–-version” switch to allow you to check which build of a particular tool or service you have deployed [1].

As Bertrand Meyer suggests in his notion of Command/Query Separation (CQS) any behaviour which is a query in nature should have no side-effects and therefore could also be freely available to use for diagnostic purposes – security and performance issues notwithstanding. Naturally these queries would be over-and-above any queries you might run directly against your various data stores, i.e. databases, file-system, etc. using the vendors own lower-level tools.

Where it gets a little more tricky is on the “command” side as we might need to investigate the operation but without disturbing the current state of the system. In an ideal world it should be possible to execute them against a part of the system reserved for such eventualities, e.g. a special customer or schema that looks and acts like a real one but is owned by the team and therefore its side-effects are invisible to any real users. (This is one of the techniques that falls under the in-vogue term of “testing in production”.)

If the issue can be isolated to a particular component then it’s probably more effective to focus on that part of the system by replaying the operation whilst simultaneously redirecting the side-effects somewhere else (or avoiding them altogether) so that the investigation can be safely repeated. One technique here is to host the component in another type of process, such as a GUI or command line tool and provide a variety of widgets or switches to control the input and output locations. Alternatively you could use the Null Object pattern to send the side-effects into oblivion.

In its most simplest form it might be a case of adding a “--ReadOnly” switch that disables all attempts to write to back-end stores (but leaves logging intact if that won’t interfere). This would give you the chance to safely debug the process locally using production inputs. As an aside this idea has been formalised in the PowerShell world via the “-WhatIf” switch which allows you to run a script whilst disabling (where supported) the write actions of any cmdlets.

If the operation requires some form of bulk processing where there is likely to be far too much output for stdout or because you need a little more structure to the data then you can add multiple switches instead, such as the folder to write to and perhaps even a different format to use which is easier to analyse with the usual UNIX command line tools. If implementing a whole different persistence mechanism for support is considered excessive [2] you could just allow, say, an alternative database connection string to be provided for the writing side and point to a local instance.

Earlier I mentioned that the Principle of Least Privilege helped drive out the need for these customisations and that’s because restricting your access affects you in two ways. The first is that by not allowing you to make unintentional changes you cannot make the situation worse simply through your analysis. For example if you happened to be mistaken that a particular operation had no side-effects but it actually does now, then they would be blocked as a matter of security and an error reported. If done in the comfort of a test environment you now know what else you need to “mock out” to be able to execute the operation safely in future. And if the mocking feature somehow gets broken, your lack of privilege has always got your back. This is essentially just the principle of Defence in Depth applied for slightly different reasons.

The second benefit you get is a variation of yet another principle – Design for Testability. To support such features we need to be able to substitute alternative implementations for the real ones, which effectively means we need to “program to an interface, not an implementation”. Of course this will likely already be a by-product of any unit tests we write, but it’s good to know that it serves another purpose outside that use case.

What I’ve described might seem like a lot of work but you don’t have to go the whole hog and provide a separate host for the components and a variety of command-line switches to enable these behaviours, you could probably get away with just tweaking various configuration settings, which is the approach that initially drove my 2011 post “Testing Drives the Need for Flexible Configuration”. What has usually caused me to go the extra step though is the need to use these features more than just once in a blue moon, often to automate their use for longer term use. This is something I covered in much more detail very recently in “Libraries, Console Apps & GUIs”.

 

[1] Version information has been embedded in Windows binaries since the 3.x days back in the ‘90s but accessing it easily usually involved using the GUI shell (i.e. Explorer) unless the machine is remote and has limited access, e.g. the cloud. Luckily PowerShell provides an alternative route here and I’m sure there are plenty of third party command line tools as well.

[2] Do not underestimate how easy it is these days to serialise whole object graphs into JSON files and then process them with tools like JQ.

No comments:

Post a Comment