Monday 30 January 2017

Dumbing Down Code

A conversation with a colleague, which was originally sparked off by “C# BAD PRACTICES: Learn how to make a good code by bad example” (an article about writing clear code), reminded me about a recent change I had just made. When I reflected back I began to consider whether I had just “dumbed down” the code instead of keeping my original possibly simpler version. I also couldn’t decide whether the reason I had changed it was because I was using an old idiom which might now be unfamiliar or because I just didn’t think it was worth putting the burden on the reader.

Going Loopy

The change I made was to add a simple retry loop to some integration test clean-up code that would occasionally fail on a laggy VM. The loop just needed to retry a small piece of code a few times with a brief pause in between each iteration.

My gut instinct was to write the loop like this:

int attempt = 5;

while (attempt-- > 0)
{
  . . .
}

Usually when I need to write a “for” loop these days, I don’t. By far the most common use case is to iterate over a sequence and so I’d use LINQ in C# or, say, a pipeline (or foreach) in PowerShell. If I did need to manually index an array (which is rare) I’d probably go straight for a classic for loop.

After I wrote and tested the code I took a step back just to review what I’d written and realised I felt uncomfortable with the while loop. Yes this was only test code but I don’t treat it as a second class citizen, it still gets the same TLC as production code. So what did I not like?

  • While loops are much rarer than for loops these days.
  • The use of the pre and post-decrement operators in a conditional expression is also uncommon.
  • The loop counts down instead of up.

Of these the middle one – the use of the post-decrement operator in a conditional expression – was the most concerning. Used by themselves (in pre or post form) to just mutate a variable, such as in the final clause of a for loop [1], seems trivial to comprehend, but once you include it in a conditional statement the complexity goes up.

Hence there is no difference between --attempt and attempt-- when used in a simple arithmetic adjustment, but when used in a conditional expression it matters whether the value is decremented before or after the comparison is made. If you get it wrong the loop executes one less time than you expect (or vice-versa) [2].

Which leads me to my real concern – how easy is it to reason about how many times the loop executes? Depending on the placement of the decrement operator it might be one less than the number “attempt” is initialised with at the beginning.

Of course you can also place the “while” comparison at the end of the block which means it would execute at least once, so subconsciously this might also cause you to question the number of iterations, at least momentarily.

Ultimately I know I’ve written the loop so that the magic number “5” which is used to initialise the attempt counter represents the maximum number of iterations, but will a reader trust me to have done the same? I think they’ll err on the side of caution and work it out for themselves.

Alternatives

The solution I went with in the end was this:

const int maxAttempts = 5;

for (int attempt = 0; attempt != maxAttempts;
                                            ++attempt)
{
  . . .
}

Now clearly this is more verbose, but not massively more verbose than my original “while” loop. However the question is whether (to quote Sir Tony Hoare [3]) it obviously contains less bugs that the original. Being au fait with both forms it’s hard for me to decide so I’m trying to guess that the reader would prefer the latter.

Given that we don’t care what the absolute value of “attempt” is, only that we execute the loop N times, I did consider some other approaches, at least momentarily. Both of these examples use methods from the Enumerable class.

The first generates a sequence of 5 numbers starting from the “arbitrary” number 1:

const int maxAttempts = 5;

foreach (var _ in Enumerable.Range(1, maxAttempts))
{
  . . .
}

The second also generates a sequence of 5 numbers but this time by repeating the “arbitrary” number 0:

const int maxAttempts = 5;

foreach (var _ in Enumerable.Repeat(0, maxAttempts))
{
  . . .
}

In both cases I have dealt with the superfluous naming of the loop variable by simply calling it “_”.

I discounted both these ideas purely on the grounds that they’re overkill. It just feels wrong to be bringing so much machinery into play just to execute a loop a fixed number of times. Maybe my brain is addled from too much assembly language programming in my early years but seemingly unnecessary waste is still a hard habit to shake.

As an aside there are plenty of C# extension methods out there which people have written to try and reduce this further so you only need write, say, “5.Times()” or “0.To(5)” but they still feel like syntactic sugar just for the sake of it.

Past Attempts

This is not the first time I’ve questioned whether it’s possible to write code that’s perhaps considered too clever. Way back in 2012 I wrote “Can Code Be Too Simple?” which looked at some C++ code I had encountered 10 years ago and which first got me thinking seriously about the subject.

What separates the audience back then and the one now is the experience level of the programmers who will likely tackle this codebase. A couple of years later in “Will Your Successor Be a Superstar Programmer?” I questioned whether you write code for people of your own calibre or the (inevitable) application support team who have the unenviable task of having to be experts at two disciplines – support and software development. As organisations move towards development teams owning their services this issue is diminishing.

My previous musings were also driven by my perception of the code other people wrote, whereas this time I’m reflecting solely on my own actions. In particular I’m now beginning to wonder if my approach is in fact patronising rather than aiding? Have I gone too far this time and should I give my successors far more credit? Or doesn’t it matter as long as we just don’t write “really weird” code?

 

[1] In C++ iterators can be implemented as simple pointers or complex objects (e.g. STL container iterators in debug builds) and so you tend to be aware of the difference because of the performance impacts it can have.

[2] I was originally going to write while (attempt-- != 0), again because in C++ you normally iterate from “begin” to “!= end”, but many devs seem to be overly defensive and favour using the > and < comparison operators instead.

[3] “There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.” – Sir Tony Hoare.

No comments:

Post a Comment