Thursday 13 September 2012

Can Code Be Too Simple?

Many years ago I was reflecting on some changes that a newer (albeit experienced) colleague[#] had made and I came across a refactoring that lead to a method becoming reduced to just this one line of C++ code:-

void MemoryTracker::incUsage(const string& typename)
{
    m_objects[typename]++;
}

But this troubled me. My spider-sense was tingling that something was missing - surely it couldn’t be that simple? The main reason for my immediate worries is that this was C++ and I was working on a very old system where I’d spend a fair bit of time fixing a variety of problems caused by the language’s ability to allow you to leave variables uninitialised.

In case you’re still pondering m_objects is declared as a std::map<std::string, size_t>. To add a little more substance to my argument I had been doing a fair bit of VBScript lately and a tiny sideline in C#. In those languages you can’t achieve the same degree of simplicity because they throw an exception when you try and index an item that doesn’t exist. Consequently you’d end up writing this kind of code instead[*]:-

public void IncUsage(string typename)
{
    if (!m_objects.Contains(typename))
        m_objects.Add(typename, 1);
    else
        m_objects[typename]++;
}

So the C++ version is actually doing quite a bit behind your back. Not only that, but it is also doing stuff which you’d normally have to watch out for in your own code.

  1. If the item already exists in the map, return a reference to its value
  2. If the item doesn’t exist, create an entry for it, default construct the value, and then return a reference to the value

The bit that makes me tingle is the “default construct the value” because for primitive types like size_t this is ingrained in my brain as a NOP, as that’s usually what happens unless you go out of your way to ensure it isn’t. Such is the mark of a brain addled by too many years doing C and “Visual” C++. Of course the beauty of containers like std::map is exactly that they “do the right thing” and so I need to let go and just accept that.

Hopefully now you can see why the original code is so simple. When passing a “typename” that doesn’t exist in the map, one gets added. Not only that but the default value for a size_t is 0 and so the post-increment immediately bumps that to 1. Clever, eh?

As I remarked back at the very beginning this was many years ago and with hindsight my response should have been “show me the tests”. A quick read of the tests should have been enough to convince that it worked correctly in the case I was concerned about. But hey, guess what, there were no unit tests at all for the set of shared libraries this code featured in. In fact it was working on this legacy codebase that opened my eyes to the necessity of unit testing.

There are a couple of quotes that usually get trotted out when talking about the beauty and simplicity that code should possess:-

“Make things as simple as possible, but not simpler” -- Albert Einstein

“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away” -- Antoine de Saint-Exupery

I totally agree with the sentiment on all levels - from architecture down to individual statements. And I’ve admitted that backing production code with tests should quash any doubts about code not meeting its requirements. So why do I still feel that it is questionable? There is a fine line between “clever” code (code that does stuff in a concise and yet unobvious way) and “simple” code that does its job using the right tools. I guess that when working in a mixed language/mixed ability environment the “right tools” welded correctly can sometimes seem to be indistinguishable from “clever”.

 

[#] This “chat” was made all the more interesting by their reluctance to accept that there could be anything even worth discussing. Their position was that the code is correct and does exactly what the specification demands. End of.

[*] In C# you could of course wrap this up in an extension method and there may also be a more optimal way. My point is that different containers sometimes have different semantics even within the same language when it comes to “adding” or “setting” a value.

No comments:

Post a Comment