Wednesday, 26 November 2014

My 200th Blog Post

Five and a half years ago during a 6 month sabbatical I decided to start a blog. I did a brief retrospective after a year but I’ve not done another one since. Last week I was curious enough to tot up how many posts I’d written so far, and given that it was close to a nice round number, I thought it was the perfect opportunity to have another retrospective and see how things have panned out since then.

Looking back at that first anniversary blog post I can see that not that much has changed in terms of my mission on what I wanted to write about, except perhaps that I’m continuing to live in the C# world and only dabbling in C++ in my personal codebase. I have added some T-SQL projects and PowerShell scripts in the meantime which have also given me a wider remit. In fact my PowerShell blog posts are the ones most read according to Google Analytics.

I’m still mostly using my blog as a knowledge base where I can document stuff I’ve not obviously found an answer for elsewhere, but more recently I’ve branched out and started to write posts that cover my journey as a programmer too. Many of them are my various fails, such as “Friends Don’t Let Friends Use Varargs” and “A Not-So-Clever-Now String Implementation”, which is my way of showing that we all make mistakes and they should be nothing to be ashamed of. Those particular examples covered more technical aspects of programming whereas I’ve also started to look at the development process itself, such as in “What’s the Price of Confidence” and “The Courage to Question”. Refactoring has been a heavily recurring topic of late, too. On a few occasions I’ve even dared to air a few more programming-related personal issues to try and give a little more background to what goes on inside my head, see “Haunted By Past Mistakes” for example. I don’t suppose anyone is really interested, but it acts as a nice form of therapy for me.

What was amusing to read back in that first anniversary post was the section about how I was planning to write something about unit test naming conventions, but I ended up ditching it because I discovered I was way off the mark. That post eventually did get written, but only last month! As you’ll see in “Unit Testing Evolution Part II – Naming Conventions” and the follow-up “Other Test Naming Conventions” this part of my programming journey has been quite bumpy. I’m not entirely sure I’ve reached a plateau yet there either.

My main aspiration was that writing this blog would help me sharpen my writing skills and give me the confidence to go and write something more detailed that might then be formally published. Given my membership of the ACCU I naturally had my eye on both of its journals (C Vu and Overload) as an outlet. It took me a few years to get to that position (I had at least written upwards of 20 reviews covering books, the conference and branch meetings) but eventually I resurrected an article I had started just over 3 years earlier and finally finished it off.

That article was accepted for the February 2013 issue of Overload. Since then I’ve started an occasional column in C Vu called “In the Toolbox” and have written many other articles for both C Vu and Overload on both technical and non-technical matters. A few of those even started out life as a blog post which I then cleaned up and embellished further. A few months back I felt I had reached the point where I had enough content to require a couple of blog posts to summarise what I’d published so far: “In The Toolbox - Season One”, “Overload Articles - Season One” and “C Vu Articles - Season One”.

Confidence in writing has undoubtedly led to a confidence in presenting too, by which I mean that at least I feel more confident about the content I’m presenting, even if my presentation style is still rather rough around the edges.

The last thing I noted in my first anniversary post was a quip about not looking at the stats from Google Analytics. Well, if the stats can be believed, and I’m not altogether convinced how accurate they are [1], then I’m definitely well into double figures now. In fact quite recently the number of page views rocketed to the tens-of-thousands per month. This appears to coincide with a blog post I published about my dislike of the term Dependency Injection (See “Terminology Overdose”) which got a much wider audience than normal via some re-tweets from a couple of prominent programmers. The graph on the right shows the number of page views per month over the time my blog’s been active. Obviously a page view does not constitute a “read” but if nothing else it shows that my content is not stuck at the bottom of the search engine results pages. Hopefully some of those “hits” will have resulted in someone finding something useful, which is all I really aspire to.

I decided to use WC on my collection of blog posts, which get emailed to me whenever I publish them, to see how many words I’ve written in the last five and a half years. The original result seemed way too high as I know that even my longest posts still come in under 1,500 words. It turned out Blogger started sending me a multi-part email with both a plain text and an HTML version in it some months after I’d started. A quick application of SED with an address range to pick out just the plain text versions gave me the more believable sum of ~148,000 words. Based on that, each post is ~740 words long which sounds about right. I had hoped to write more, smaller posts, but I seem incapable of doing that - as now demonstrated…

So, what does the future hold? Well, I still have 77 blog post titles sat in the backlog, so I reckon there is still plenty of content I want to explore, and I’m sure the backlog is growing quicker than I’m writing, which feels healthy for a budding author. Moving teams and projects every so often always seems to generate new thoughts and questions and therefore I expect more to come from that. The one concern I did have was that starting to write more formal articles would mean I spent less time blogging, but I don’t think that’s been the case. My blog will definitely continue to be the place for ad-hoc content and for musings that I have yet to form into a more coherent piece.

See you again in a few years.

 

[1] The Google stats indicate a steady stream of page views from May 2006, but my blog wasn’t even started until April 2009!

Tuesday, 25 November 2014

Derived Class Static Constructor Not Invoked

The other day I got a lesson in C# static constructors (aka type constructors) and the order in which they’re invoked in a class hierarchy that also has generic types thrown in for good measure. Naturally I was expecting one thing but something different happened instead. After a bit of thought and a quick read I can see why it’s so much simpler and behaves the way it does, but I still felt it was a little surprising given the use cases of static constructors. Part of this, I suspect, is down to my C++ roots where everything is passed by value by default and templates are compile-time, which is the opposite of C# where most types are passed by reference and generics are generated at run-time.

The Base Class

We have a base class that is used as the basis for any custom value types we create. It’s a generic base class that is specialised with the underlying “primitive” value type and the concrete derived “value” type:

public abstract class ValueObject<TDerived, TValue>
{
  // The underlying primitive value.
  public TValue Value { get; protected set; }

  // Create a value from a known “safe” value.
  public static TDerived FromSafeValue(TValue value)
  {
    return Creator(value);
  }

  // The factory method used to create the derived
  // value object.
  protected static Func<TValue, TDerived> Creator
      { private get; set; }
  . . .
}

As you can see there is a static method called FromSafeValue() that allows the value object to be created from a trusted source of data, e.g. from a database. The existing implementation relied on reflection as a means of getting the underlying value into the base property (Value) and I wanted to see if I could do it without using reflection by getting the derived class to register a factory method with the base class during its static constructor (i.e. during “type” construction).
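The exact original isn’t shown here, but a minimal sketch of that style of reflection-based creation might look something like this (the use of Activator and the non-public setter lookup are my assumptions, not the actual code):

public static TDerived FromSafeValue(TValue value)
{
  // Create the derived type via its non-public default constructor.
  var instance = (TDerived)Activator.CreateInstance(typeof(TDerived), true);

  // Poke the primitive value into the protected Value setter, which
  // is only reachable from outside the hierarchy via reflection.
  typeof(ValueObject<TDerived, TValue>)
    .GetProperty("Value")
    .GetSetMethod(true)
    .Invoke(instance, new object[] { value });

  return instance;
}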

The Derived Class

The derived class is incredibly simple as all the equality comparisons, XML/JSON serialization, etc. is handled by the Value Object base class. All it needs to provide is a method to validate the underlying primitive value (e.g. a string, integer, date/time, etc.) and a factory method to create an instance after validation has occurred. The derived class also has a private default constructor to ensure you go through the public static methods in the base class, i.e. Parse() and FromSafeValue().

public class PhoneNumber
               : ValueObject<PhoneNumber, string>
{
  // ...Validation elided.
 
  private PhoneNumber()
  { }

  static PhoneNumber()
  {
    Creator = v => new PhoneNumber { Value = v };
  }
}

The Unit Test

So far, so good. The first unit test I ran after my little refactoring was the one for ensuring the value could be created from a safe primitive value:

[Test]
public void value_can_be_created_from_a_safe_value()
{
  const string safeValue = "+44 1234 123456";

  Assert.That(
    PhoneNumber.FromSafeValue(safeValue).Value,
    Is.EqualTo(safeValue));
}

I ran the test... and I got a NullReferenceException [1]. So I set two breakpoints, one in the derived class static constructor and another in the FromSafeValue() method and watched in amazement as the method was called, but the static constructor wasn’t. It wasn’t even called at any point later.

I knew not to expect the static constructor to be called until the type was first used. However I assumed it would be when the unit test method was JIT compiled because the type name is referenced in the test and that was the first “visible” use. Consequently I was a little surprised when it wasn’t called at that point.

As I slowly stepped into the code I realised that the unit test doesn’t really reference the derived type - instead of invoking PhoneNumber.FromSafeValue() it’s actually invoking ValueObject.FromSafeValue(). But this method returns a TDerived (i.e. a PhoneNumber in the unit test example) so the type must be needed by then, right? And if the type’s referenced then the type’s static constructor must have been called, right?

The Static Constructor Rules

I’ve only got the 3rd edition of The C# Programming Language, but it covers generics which I thought would be important (it’s not). The rules governing the invocation of a static constructor are actually very simple - a static ctor is called once, and only once, before either of these scenarios occurs:

  • An instance of the type is created
  • A static method on the type is called

The first case definitely does not apply as it’s the factory function we are trying to register in the first place that creates the instances. The second case doesn’t apply either because the static method we are invoking is on the base class, not the derived class. Consequently there is no reason to expect the derived type’s static constructor to be called at all in my situation. It also appears that the addition of Generics hasn’t changed the rules either, except (I presume) for a tightening of the wording to use the term “closed type”.
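A minimal standalone sketch (with hypothetical names) demonstrates the behaviour:

public abstract class Base<TDerived>
{
  public static string DerivedTypeName()
  {
    // No instance is created and no static member of TDerived is
    // referenced, so its static constructor is not triggered here.
    return typeof(TDerived).Name;
  }
}

public class Derived : Base<Derived>
{
  static Derived()
  {
    Console.WriteLine("Derived static ctor");
  }
}

public static class Program
{
  public static void Main()
  {
    // Although written as Derived.DerivedTypeName(), the compiler
    // binds the call to Base<Derived>.DerivedTypeName(), so the
    // derived static constructor never runs and only "Derived"
    // is printed.
    Console.WriteLine(Derived.DerivedTypeName());
  }
}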

Putting my C++ hat back on for a moment I shouldn’t really have been surprised by this. References in C# are essentially similar beasts to references and pointers in C++, and when you pass values via them in C++ you don’t need a complete class definition, you can get away with just a forward declaration [2]. This makes sense as the size of an object reference (or raw pointer) is fixed, it’s not dependent on the type of object being pointed to.

Forcing a Static Constructor to Run

This puts me in a position involving chickens and eggs. I need the derived type’s static constructor to run so that it can register a factory method with the base class, but that constructor won’t be invoked until an instance of the class is created - and it’s the registered factory method that creates the instances. The whole point of the base class is to reduce the burden on the derived class; forcing the client to use the type directly, just so it can register the factory method, defeats the purpose of making the base class factory methods public.

A spot of googling revealed a way to force the static constructor for a type to run. Because our base class takes the derived type as one of its generic type parameters we know what it is [3] and can therefore access its metadata. The solution I opted for was to add a static constructor to the base class that forces the derived class static constructor to run.

static ValueObject()
{
  System.Runtime.CompilerServices.RuntimeHelpers
    .RunClassConstructor(typeof(TDerived).TypeHandle);
}

Now when the unit test runs it references the base class (via the static FromSafeValue() method) so the base class static constructor gets invoked. This therefore causes the derived class static constructor to be run manually, which then registers the derived class factory method with the base class. Now when the FromSafeValue() static method executes it accesses the Creator property and finds the callback method registered and available for it to use.

This feels like a rather dirty hack and I wonder if there is a cleaner way to ensure the derived class’s static constructor is called in a timely manner without adding a pass-through method on the derived type. The only other solutions I can think of all involve using reflection to invoke the methods directly, which is pretty much back to where I started out, albeit with a different take on what would be called via reflection.

 

[1] Actually I got an assertion failure because I had asserted that the Creator property was set before using it, but the outcome was essentially the same - the test failed due to the factory method not being registered.

[2] This is just the name of the class (or struct) within its namespace, e.g. namespace X { class Z; }

[3] This technique of passing the derived type to its generic base class is known as the Curiously Recurring Template Pattern (CRTP) in C++ circles but I’ve not heard it called CRGP (Curiously Recurring Generic Pattern) in the C# world.

Thursday, 20 November 2014

My Favourite Quotes

There are a number of quotes, both directly and indirectly related to programming, that I seem to trot out with alarming regularity. I guess that I always feel the need to somehow “show my workings” whenever I’m in a discussion about a program’s design or style as a means of backing up my argument. It feels as though my argument is not worth much by itself, but if I can pull out a quote from one of the “Big Guns” then maybe my argument will carry a bit more weight. I suppose that the quotes below are the ones foremost in my mind when writing software and therefore I suppose could be considered as my guiding principles.

The one I’ve probably quoted most is a pearl from John Carmack. It seems that since switching from C++ to C# the relevance of this quote has grown in orders of magnitude as I struggle with codebases that are massively overcomplicated through the excessive use of design patterns and Silver Bullet frameworks. The lack of free functions and a common fear of the “static” keyword (See “Static - A Force for Good and Evil”) only adds to the bloat of classes.

“Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function.”

Along similar lines, and also recently given quite an airing for the same reasons, is one from Brian Kernighan. I’m quite happy to admit that I fear complexity. In my recent post “AOP: Attributes vs Functional Composition” I made it quite clear that I’d prefer to choose something I understood, and therefore could control, over a large black box that did not feel easy to comprehend, even if it cost a little more to use. This is not a case of the “Not Invented Here” syndrome, but one of feeling in control of the technology choices. Entropy will always win in the end, but I’ll try hard to keep it at bay.

“Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”

It’s somewhat overused these days but the whole “teach a man to fish” aphorism is definitely one I try and adhere to. However I prefer a slightly different variation that emphasises a more team-wide approach to the notion of advancement. Instead of teaching a single person to fish, I’d hope that by setting a good example I’m teaching the entire team to fish instead.

“a rising tide lifts all boats”

I used the next quote in my The Art of Code semi-lightning talk which I gave at the ACCU 2014 conference. The ability to read and reason about code is hugely important if we are ever going to stand a chance of maintaining it in the future. The book this quote by Hal Abelson comes from (Structure and Interpretation of Computer Programs) may be very old but it lays the groundwork for much of what we know about writing good code even today.

“Programs must be written for people to read, and only incidentally for machines to execute.”

The obvious one to pick from Sir Tony Hoare would be the classic “premature optimisation” quote, but I think most people are aware of that, even if they’re not entirely cognisant of the context surrounding its birth. No, my favourite quote from him is once again around the topic of writing simple code, and I think it follows on nicely from the previous one as it suggests the hoped-for outcome of writing clearly readable code.

“There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.”

By now the trend should be fairly obvious, but I have another simplicity-related quote that I think applies to the converse side of the complexity problem - refactoring legacy code. Whilst the earlier quotes seem to me to be most appropriate when writing new code, this one from Antoine de Saint-Exupéry feels like the perfect reminder on how to remove that accidental complexity that ended up in the solution whilst moving “from red to green”.

“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.”

Wednesday, 19 November 2014

TFS Commit Triggering Wrong Gated Build

This post is very much of the “weird stuff I’ve run across” variety, so don’t expect any in-depth analysis or solution. I’m just documenting it purely in case someone else runs into a similar problem and then they won’t feel like they’re going bonkers either...

The other evening I wasted the best part of a couple of hours trying to track down why my TFS gated build was failing when pushing even the simplest change. I had started with a much bigger commit but in the process of thinking I was going mad I shelved my changes and tried pushing a “whitespace only” change that I knew couldn’t possibly be the cause of the unit test failures the gated build was experiencing.

What I eventually noticed was that the codebase had some commits from someone else just prior to me but I couldn’t see their gated builds in the TFS online builds page for the project. This suggested that either they were pushing changes and bypassing the gated build, or something funky was going on.

I finally worked out what was causing the build to fail (it was in their commit) and so I fixed that first. Then in the morning I started the investigation as to why my colleague’s commits did not appear to be going through the gated build. He was quite sure that he was being forced to go through the gated build process and even showed me “his builds” in the Visual Studio TFS builds page to prove it. But as we looked a little more closely together I noticed that the builds listed were for an entirely different TFS project!

Our TFS repo has a number of different projects, each for a different service, that have their own Main branch, Visual Studio .sln and gated build process [1]. Essentially he was committing changes in one part of the source tree (i.e. one project) but TFS was then kicking off a gated build for a solution in a different part of the source tree (i.e. an entirely different project). And because the project that was being built was stable, every build was just rebuilding the same code again and again and therefore succeeding. TFS was then perfectly happy to merge whatever shelveset was attached to the gated build because it knew the build had succeeded, despite the fact that the build and merge operated on different TFS projects!

My gut reaction was that it was probably something to do with workspace mappings. He had a single workspace mapping right at the TFS root, whereas I was using git-tfs and another colleague had mapped their workspace at the project level. He removed the files in the workspace (but did not delete the mapping itself) [2], then fetched the latest code again and all was good. Hence it appears that something was cached locally somewhere in the workspace that was causing this to happen.

As I was writing this up and reflecting on the history of this project I realised that it was born from a copy-and-paste of an existing project - the very project that was being built by accident. Basically the service was split into two, and that operation was done by the person whose commits were causing the problem, which all makes sense now.

What I find slightly worrying about this whole issue is that essentially you can circumvent the gated build process by doing something client-side. Whilst I hope that no one in my team would ever consider resorting to such Machiavellian tactics just to get their code integrated it does raise some interesting questions about the architecture of TFS and/or the way we’ve got it configured.

 

[1] That’s not wholly true, in case it matters. Each separately deployable component has its own TFS project and gated build, but there are also projects at the same level in the tree that do not have a “Main” branch or a gated build at all. I think most projects also share the same XAML build definition too, with only the path to the source code differing.

[2] To quote Ripley from Aliens: “I say we take off and nuke the entire site from orbit. It's the only way to be sure.”

Tuesday, 18 November 2014

Don’t Pass Factories, Pass Workers

A common malaise in the world of software codebases is the over-reliance on factories. I came across some code recently that looked something like this:

public class OrderController
{
  public OrderController(IProductFinder products, 
           ICustomerFinder customers, 
           IEventPublisherFactory publisherFactory)
  { . . . }

  public void CreateOrder(Order order)
  {
    var products = _products.Find(. . .);
    var customer = _customers.Find(. . .);
    . . .
    PublishCreateOrderEvent(. . .);
  }

  private void PublishCreateOrderEvent(. . .)
  {
    var @event = new Event(. . .);
    . . .
    var publisher = _publisherFactory.GetPublisher();
    publisher.Publish(@event);
  }
}

Excessive Coupling

This bit of code highlights the coupling created by this approach. The “business logic” is now burdened with knowing about more types than it really needs to. It has a dependency on the factory, the worker produced by the factory and the types used by the worker to achieve its job. In contrast look at the other two abstractions we have passed in - the one to find the products and the other to find the customer. In both these cases we provided the logic with a minimal type that can do exactly the job the logic needs it to - namely to look up data. Behind the scenes the concrete implementations of these two might be looking up something in a local dictionary or making a request across the intranet; we care not either way [1].

Refactoring to a Worker

All the logic really needs is something it can call on to send an event on its behalf. It doesn’t want to be burdened with how that happens, only that it does happen. It also doesn’t care how the resources that it needs to achieve it are managed, only that they are. What the logic needs is another worker, not a factory to give it a worker that it then has to manage as well as all its other responsibilities.

The first refactoring I would do would be to create an abstraction that encompasses the sending of the event, and then pass that to the logic instead. Essentially the PublishCreateOrderEvent() should be lifted out into a separate class, and an interface passed to provide access to that behaviour:

public interface IOrderEventPublisher
{
  void PublishCreateEvent(. . .);
}

internal class MsgQueueOrderEventPublisher :
                             IOrderEventPublisher
{
  public MsgQueueOrderEventPublisher(
           IEventPublisherFactory publisherFactory)
  { . . . }

  public void PublishCreateEvent(. . .)
  {
    // Implementation moved from
    // OrderController.PublishCreateOrderEvent()
    . . .
  }
}

This simplifies the class with the business logic to just this:

public class OrderController
{
  public OrderController(IProductFinder products, 
           ICustomerFinder customers, 
           IOrderEventPublisher publisher)
  { . . . }

  public void CreateOrder(Order order)
  {
    var products = _products.Find(. . .);
    var customer = _customers.Find(. . .);
    . . .
    _publisher.PublishCreateEvent(. . .);
  }
}

This is a very simple change, but in the process we have reduced the coupling between the business logic and the event publisher quite significantly. The logic no longer knows anything about any factory - only a worker that can perform its bidding - and it doesn’t know anything about what it takes to perform the action either. And we haven’t “robbed Peter to pay Paul” by pushing the responsibility to the publisher, we have put something in between the logic and publisher instead. It’s probably a zero-sum game in terms of the lines of code required to perform the action, but that’s not where the win is, at least not initially.

The call on the factory to GetPublisher() is kind of a violation of the Tell, Don’t Ask principle. Instead of asking (the factory) for an object we can use to publish messages, and then building the messages ourselves, it is preferable to tell someone else (the worker) to do it for us.

Easier Testing/Mocking

A natural by-product of this refactoring is that the logic is then easier to unit test because we have reduced the number of dependencies. Doing that reduces the burden we place on having to mock those dependencies to get the code under test. Hence before our refactoring we would have had to mock the factory, the resultant publisher and any other “complex” dependent types used by it. Now we only need to mock the worker itself.
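As an illustrative sketch, a hand-rolled fake is now all the test needs (no particular mocking framework is assumed here, and the fake finders are hypothetical stand-ins for the other two dependencies):

internal class FakeOrderEventPublisher : IOrderEventPublisher
{
  public bool EventPublished { get; private set; }

  public void PublishCreateEvent(. . .)
  {
    EventPublished = true;
  }
}

[Test]
public void creating_an_order_publishes_an_event()
{
  var publisher = new FakeOrderEventPublisher();
  var controller = new OrderController(new FakeProductFinder(),
                                       new FakeCustomerFinder(),
                                       publisher);

  controller.CreateOrder(new Order(. . .));

  Assert.That(publisher.EventPublished, Is.True);
}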

My suspicion is that getting into this situation in the first place can often be the result of not doing any testing, only doing high-level testing (e.g. acceptance tests/system tests) or doing test-later unit testing; i.e. the production code is always written first. This is where using a test-first approach to low-level testing (e.g. TDD via unit tests) should have driven out the simplified design in the first place instead of needing to refactor it after-the-fact.

One of the benefits of doing a lot of refactoring on code (See “Relentless Refactoring”) is that over time you begin to see these patterns without even needing to write the tests first. Eventually the pain of getting code like this under test accumulates and the natural outcomes start to stick and become design patterns. At this point your first order approximation would likely be to never directly pass a factory to a collaborator, you would always look to raise the level of abstraction and keep the consumer’s logic simple.

That doesn’t mean you never need to pass around a “classic” factory type, but in my experience they should be few and far between [2].

Use Before Reuse

Once we have hidden the use of the factory away behind a much nicer abstraction we can tackle the second smell - do we even need the factory in the first place? Presumably the reason we have the factory is because we need to “acquire” a more low-level publisher object that talks to the underlying 3rd party API. But do we need to manufacture that object on-the-fly or could we just create one up front and pass it to our new abstraction directly via the constructor? Obviously the answer is “it depends”, but if you can design away the factory, or at least keep it outside “at the edges”, you may find it provides very little value in the end and can get swallowed up as an implementation detail of its sole user.
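For example, a hypothetical composition root could create the low-level publisher once at start-up and hand it straight in, with the worker’s constructor amended to take the publisher rather than the factory (all the names here are illustrative):

// Wiring at application start-up: the low-level publisher is
// created once, up front, and the factory disappears entirely.
var lowLevelPublisher = new MsgQueuePublisher(. . .);
var orderPublisher = new MsgQueueOrderEventPublisher(lowLevelPublisher);
var controller = new OrderController(products, customers, orderPublisher);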

The reason I suggest that factories are a smell is because they hint at premature thoughts about code reuse. Rather than just creating the abstraction we need at the time and then refactoring to generalise when the need arises, if the need ever arises, we immediately start thinking about how to generalise it up front. A factory type always gives the allure of a nice simple pluggable component, which it can be for connection/resource pools, but it has a habit of breeding other factories as attempts are made to generalise it further and further.

One of the best articles on premature reuse is “Simplicity before generality, use before reuse” by Kevlin Henney in “97 Things Every Software Architect Should Know”.

 

[1] The Network is Evil is a good phrase to keep in mind as hiding a network request is not a good idea, per se. The point here is that the interface is simple, irrespective of whether we make the call synchronously or asynchronously.

[2] The way Generics work in C# means you can’t create objects as easily as you can with templates in C++ and so passing around delegates, which are essentially a form of anonymous factory method, is not so uncommon. They could just be to support unit testing to allow you to mock what might otherwise be in an internal factory method.

Saturday, 15 November 2014

The Hardware Cycle

I’m writing this on a Windows XP netbook that is now well out of extended life support as far as Microsoft is concerned. I’m still happily using Microsoft Office 2002 to read and send email and the same version of Word to write my articles. This blog post is being written with Windows Live Writer which is considerably newer (2009) but still more than adequately does the job. At Agile on the Beach 2014 back in September, I gave my Test-Driven SQL talk and coding demo on my 7 year old Dell laptop, which also runs XP (and SQL Server Express 2008). But all that technology is pretty new compared to what else we own.

I live in a house that is well over 150 years old. My turntable, CD player, amplifier, speakers etc. all date from the late 80’s and early 90’s. My first wide-screen TV, a 30 inch Sony beast from the mid 90’s, is still going strong as the output for our various games consoles - GameCube, Dreamcast, Atari 2600, etc. Even our car, a 7-seater MPV that lugs a family of 6 around constantly, has just had its 8th birthday and that actually has parts that suffer from real wear-and-tear. Our last washing machine and tumble dryer, two other devices which are in constant use in a big family, also lasted over 10 years before recently giving up the ghost.

Of course we do own newer things; we have to. My wife has a laptop for her work which runs Windows 8 and the kids watch Netflix on the iPad and Virgin Media Tivo box. And yet I know that all those things will still be around for some considerable time, unlike our mobile phones. Yes, I do still have a Nokia 6020i in my possession for those occasional camping trips where electricity is scarce and a battery that lasts 10 days on standby is most welcome. No, it’s the smart phones which just don’t seem to last, we appear to have acquired a drawer full of phones (and incompatible chargers).

My own HTC Desire S is now just over 3 years old. It was fine for the first year or so but slowly, over time, each app update sucks more storage space so that you have to start removing other apps to make room for the ones you want to keep running. And the apps you do keep running get more and more sluggish over time as the new features, presumably aimed at the newer phones, cause the device to grind. Support for the phone’s OS only seemed to last 2 years at the most (Android 2.3.5). My eldest daughter’s HTC Wildfire which is of a similar age is all but useless now.

As a professional programmer I feel obligated to be using the latest kit and tools, and yet as I get older everywhere I look I just see more Silver Bullets and realise that the biggest barrier to me delivering is not the tools or the technology, but the people - it’s not knowing “how” to build, but knowing “what” to build that’s hard. For the first 5 years or so of my career I knew what each new Intel CPU offered, what sort of RAM was best, and then came the jaw-dropping 3D video cards. As a gamer Tom’s Hardware Guide was a prominent influence on my life.

Now that I’m married with 4 children my priorities have naturally changed. I am beginning to become more aware of the sustainability issues around the constant need to upgrade software and hardware, and the general declining attitude towards “fixing things” that modern society has. Sadly WD40 and Gaffer tape cannot fix most things today and somehow it’s become cheaper to dump stuff and buy a new one than to fix the old one. The second-hand dishwasher we inherited a few years back started leaking recently and the callout charge alone, just to see if it might be a trivial problem or a lost cause, was more than buying a new one.

In contrast, though, the likes of YouTube and 3D printers have put some of this power back into the hands of consumers. A few years back I came home from work to find my wife’s head buried in the oven. The cooker was broken and my wife found a video on YouTube for how to replace the heating element for the exact model we had. So she decided to buy the spare part and do it herself. It took her a little longer than the expert in the 10 minute video, but she was beaming with a real sense of accomplishment at having fixed it herself, and we saved quite a few quid in the process.

I consider this aversion to filling up landfills or dumping electronics on poorer countries one of the better attributes I inherited from my father [1]. He was well ahead of the game when it came to recycling; as I suspect many of that generation who lived through the Second World War are. We still have brown paper bags he used to keep seeds in that date back to the 1970’s, where each year he would write the yield for the crop (he owned an allotment) and then reuse the same bags the following year. The scrap paper we scribbled on as children was waste paper from his office. The front wall of the house where he grew up as a child was built by him using spoiled bricks that he collected from a builders merchant on his way home from school. I’m not in that league, but I certainly try hard to question our family’s behaviour and try to minimise any waste.

I’m sure some economist out there would no doubt point out that keeping an old house, car, electronics, etc. is actually worse for the environment because they are less efficient than the newer models. When was the last time you went a week before charging your mobile phone? For me there is an air of irrelevance to the argument about the overall carbon footprint of these approaches, it’s more about the general attitude of being in such a disposable society. I’m sure one day mobile phone technology will plateau, just as the desktop PC has [2], but I doubt I’ll be going 7 days between charges ever again.

 

[1] See “So Long and Thanks For All the Onions”.

[2] Our current home PC, which was originally bought to replace a 7 year old monster that wasn’t up to playing Supreme Commander, is still going strong 6 years later. It has 4 cores, 4 GB RAM, a decent 3D video card and runs 32-bit Vista. It still suffices for all the games the older kids can throw at it, which is pretty much the latest fad on Steam. The younger ones are more interested in Minecraft or the iPad.

Thursday, 13 November 2014

Minor Technology Choices

There is an interesting development on my current project involving a minor technology choice that I’m keen to see play out because it wouldn’t be my preferred option. What makes it particularly interesting is that the team is staffed mostly by freelancers and so training is not in scope per se for those not already familiar with the choice. Will it be embraced by all, by some, only supported by its chooser, or left to rot?

Past Experience

We are creating a web API using C# and ASP.Net MVC 4, which will be my 4th web API in about 18 months [1]. For 2 of the previous 3 projects we created a demo site as a way of showing our stakeholders how we were spending their money, and to act as a way of exploring the API in our client’s shoes to drive out new requirements. These were very simple web sites, just some basic, form-style pages that allowed you to explore the RESTful API without having to manually crank REST calls in a browser extension (e.g. Advanced Rest Client, Postman, etc.).

Naturally this is because the project stakeholders, unlike us, were not developers. In fact they were often middle managers and so clearly had no desire to learn about manually crafting HTTP requests and staring at raw JSON responses - it was the behaviour (i.e. the “journey”) they were interested in. Initially the first demo site was built client-side using a large dollop of JavaScript, but we ran into problems [2], and so another team member put together a simple Razor (ASP.Net) based web site that suited us better. This was then adopted on the next major project and would have been the default choice for me based purely on familiarity.

Back to JavaScript

This time around we appear to be going back to JavaScript with the ASP.Net demo service only really acting as a server for the static JavaScript content. The reasoning, which I actually think is pretty sound, is that it allows us to drive the deliverable (the web API) from the demo site itself, instead of via a C# proxy hosted by the demo site [3]. By using client-side AJAX calls and tools like Fiddler we can even use it directly as a debugging tool which means what we’ve built is really a custom REST client with a more business friendly UI. This all sounds eminently sensible.

Skills Gap

My main concern, and I had voiced this internally already, is that as a team our core skills are in C# development, not client-side JavaScript. Whilst you can argue that skills like JavaScript, jQuery, Knockout, Angular, etc. are essential for modern day UI development you should remember that the demo site is not a deliverable in itself; we are building it to aid our development process. As such it has far less value than the web API itself.

The same was true for the C#, Razor based web site, which most of us had not used before either. The difference of course is that JavaScript is a very different proposition. Experience on the first attempt my other team had at using it was not good - we ended up wasting time sorting out JavaScript foibles, such as incompatibilities with IE9 (which the client uses), instead of delivering useful features. The demo site essentially becomes more of a burden than a useful tool. With the C#/Razor approach we had no such problems (after adding the meta tag in the template for the IE9 mode) which meant making features demonstrable actually became fun again, and allowed the site to become more valuable once more.

The Right Tool

I’m not suggesting that Razor in itself was the best choice, for all I know the WinForms approach may have been equally successful. The same could be true for the JavaScript approach, perhaps we were not using the right framework(s) there either [4]? The point is that ancillary technology choices can be more important than the core ones. For the production code you have a definite reason to be using that technology and therefore feel obligated to put the effort into learning it inside and out. But with something optional you could quite easily deny its existence and just let the person who picked it support it instead. I don’t think anyone would be that brazen about it; what is more likely is that only the bare minimum will be done and because there are no tests it’s easy to get away without ensuring that part of the codebase remains in a good state. Either that or the instigator of the technology will be forever called upon to support its use.

I’ve been on the other end of this quandary many times. In the past I’ve wanted to introduce D, Monad (now PowerShell), F#, IronPython, etc. to a non-essential part of the codebase to see whether it might be a useful fit (e.g. build scripts or support tools initially). However I’ve only wanted to do it with the backing of the team because I know that as a freelancer my time will be limited and the codebase will live on long after I’ve moved on. I’ve worked before on a system where there was a single Perl script that was used in production for one minor task and none of the current team knew anything about Perl. In essence it sits there like a ticking bomb waiting to go off, and no one has any interest in supporting it either.

As I said at the beginning I’m keen to see how this plays out. After picking up books on JavaScript and jQuery first time around I’m not exactly enamoured at the prospect, but I also know that there is no time like the present to learn new stuff, and learning new stuff is important in helping you think about problems in different ways.

 

[1] My background has mostly been traditional C++ based distributed services.

[2] When someone suggests adding unit tests to your “disposable” code you know you’ve gone too far.

[3] This proxy is the same one used to drive the acceptance tests and so it didn’t cost anything extra to build.

[4] The alarming regularity with which new “fads” seem to appear in the JavaScript world makes me even more uncomfortable. Maybe it’s not really as volatile as it appears on the likes of Twitter, but the constant warnings I get at work from web sites about me using an “out of date browser” don’t exactly inspire me with confidence (See “We Don’t Use IE6 Out of Choice”).

Wednesday, 12 November 2014

UnitOfWorkXxxServiceFacadeDecorator

I’ve seen people in the Java community joke about class names that try and encompass as many design patterns as possible in a single name, but I never actually believed developers really did that. This blog post stands as a testament to the fact that they do. The title of this blog post is the name of a class I’ve just seen. And then proudly deleted, along with a bunch of other classes with equally vacant names.

I’ll leave the rest of this diatribe to The Codist (aka Andrew Wulf) who captured this very nicely in “I’m sick of GOF Design Patterns” [1].

 

[1] Technically speaking the Unit of Work design pattern is from Fowler’s Patterns of Enterprise Application Architecture (PoEAA).

Tuesday, 11 November 2014

Relentless Refactoring

During the first decade of my programming career refactoring wasn’t “a thing” so to speak [1]. Obviously the ideas behind writing maintainable code and how to keep it in tip top shape were there, but mostly as an undercurrent. As a general rule “if it ain’t broke, don’t fix it” was a much stronger message and so you rarely touched anything you weren’t supposed to. Instead you spent time as an “apprentice” slowly learning all the foibles of the codebase so that you could avoid the obvious traps. You certainly weren’t bold enough to suggest changing something to reflect a new understanding. One of the major reasons for this line of reasoning was almost certainly the distinct lack of automated tests watching your back.

No Refactoring

The most memorable example I have of this era is a method called IsDataReadyToSend(). The class was part of an HTTP tunnelling ISAPI extension which supported both real-time data through abuse of the HTTP protocol and later a more traditional polling method. At some point this method, which was named as a query-style method (which generally implies no side-effects), suddenly gained new behaviour - it sent data too in certain circumstances!

I didn’t write this method. I had to badger the original author, after being bitten by the cognitive dissonance for the umpteenth time, to change it to better reflect the current behaviour. Of course in retrospect there was such a strong odour that it needed more than a name change to fix the real design problem, but we were already haemorrhaging time as it was due to HTTP weirdness.

It seems funny now to think that I didn’t just go in and change this myself; but you didn’t do that. There was an air of disrespect if you went in and changed “somebody else’s” code, especially when that person was still actively working on the codebase, even if you had worked with them for years.

Invisible Refactoring

As I became aware of the more formal notion of refactoring around the mid 2000’s I still found myself in an uncomfortable place. It was hard to even convince “The Powers That Be” that fixing code problems highlighted by static code analysis tools was a worthy pursuit, even though they could be latent bugs just waiting to appear (See “In The Toolbox – Static Code Analysis”). If they didn’t see the value in that, then changing the structure of the code just to support a new behaviour would be seen as rewriting, and that was never going to get buy-in as “just another cost” associated with making the change. The perceived risks were almost certainly still down to a lack of automated testing.

Deciding to do “the right thing” will only cause you pain if you do not have some degree of acceptance from the project manager. My blog posts from a few years ago pretty much say it all in their titles about where I was at that time: “Refactoring – Do You Tell Your Boss” and “I Know the Cost but Not the Value”. These were written some time later on the back of a culture shift that appeared to go from “refactoring is tolerated” to “stop with all this refactoring and just deliver some features, now”, despite the fact that we were hardly doing any and there was so much we obviously didn’t understand about what we were building.

Prior to that project I had already started to see the benefits when I took 3 days out to do nothing but rearrange the header files in a legacy C++ system just so that I could make a tiny change to one of the lowest layers [2]. Up to that point it was impossible to test even a tiny change without having to connect to a database and the message queuing middleware! The change I had to make was going to be tortuous unless I could make some internal changes and factor out some of the singletons.

Deciding to embark on such a pursuit, which I did not believe would take 3 whole days when I started, was incredibly hard. I knew that I was going to have to lie to my manager about my progress and that felt deeply uncomfortable, but I knew it would pay huge dividends in the end. And it did. Once I could get unit tests around the code I was going to change I found 5 bugs in the original implementation. My change then became trivial and I could unit test the hell out of it. This was incredibly significant because testing at the system scale could easily have taken well over a week as it was right in the guts of the system and would have required a number of services to be running. My code change integrated first time and I was spared the thought of trying to debug it in-situ.

That 3 days of hard slog also paved the way for unit testing to become a viable proposition on the infrastructure aspects of the codebase. Sadly the way COM was overused meant that there was still a long way to go before the rest of the stack could be isolated (See “Don’t Rewrite, Refactor”).

Visible Refactoring

After hiding what I had been doing for a few years it was nice to finally be open about it. The project that had taken a turn for the worse slowly recovered once it had gone live and the technical debt was in desperate need of being repaid. However this was really about architectural refactoring rather than refactoring in the small. Even so the effects on the design of not doing it were so apparent that it became tolerated again and so we could openly talk about changing the design to iterate it towards providing the basis for some of the features that we knew were important and still on the backlog. This is far less desirable than just doing the work when it’s actually required, but at least it meant we could make forward progress without being made to feel too guilty.

Relentless Refactoring

The last few projects I’ve worked on have taken refactoring to a new level (for me); it’s no longer something to feel guilty about, it’s now just part of the cost of implementing a feature. As our understanding of the problem grows so the design gets refined. If the tests start to contain too many superfluous details they get changed. The build process gets changed. Everything is up for change, with the only barrier to refactoring being our own personal desire to balance the need to deliver real functionality, and therefore direct value first, without letting the codebase drift too far into debt. When everything “just works” the speed at which you can start to turn features around becomes immense.

By being transparent about what we’re doing our clients are completely aware of what we are up to. More importantly they have the ability to question our motives and to channel our efforts appropriately should the schedule dictate more important matters. I don’t have any empirical evidence that what we’re doing ensures that we deliver better quality software, but it sure feels far more friction-free than most other codebases I’ve ever worked on, and I don’t believe that is solely down to the quality of the people.

Today I find myself readily tweaking the names of interfaces, classes, methods, test code, etc. all in an attempt to ensure that our codebase most accurately reflects the problem as we understand it today. My most recent post on this topic was “Refactoring Or Re-Factoring” which shows how I believe the message behind this technique has changed over time. In my new world order a method like IsDataReadyToSend() just wouldn’t be given the time to gain a foothold [3] because I now know what a mental drain that can be and quite frankly I have far more important things my customer would like me doing.

 

[1] This was the mid 1990’s so it probably was “a thing” to the likes of Beck, Fowler, etc. but it wasn’t a mainstream practice.

[2] This was largely an exercise in replacing #includes of concrete classes with forward declarations, but also the introduction of pure abstract base classes (aka interfaces) to allow some key dependencies to be mocked out, e.g. the database and message queue.

[3] Unless of course it’s made its way into a published interface.

Thursday, 6 November 2014

AOP: Attributes vs Functional Composition

A colleague was looking at the code in an ASP.NET controller class I had written for marshalling requests [1] and raised a question about why we weren’t using attributes. The code looked something like this:

public class ThingController : ApiController
{
  public HttpResponseMessage Get(string id)
  {
    return Request.HandleGet(() =>
    {
        return _service.FetchThing(id);
    });
  }
}

. . .

public static class RequestHandlers
{
  public static HttpResponseMessage HandleGet<T>(
                     this HttpRequestMessage request,
                     Func<T> operation)
  {
    try
    {
      // Do other stuff
      . . .
        T content = operation();
        return request.FormatResponse(content);
    }
    catch(. . .)
    {
      // Translate various known exceptions, etc.
    }
    . . . 
  }
}

The HandleGet() method effectively wraps the internal call with a Big Outer Try Block and does some other stuff to help ensure that the response is suitably formatted in the face of an exception. At some point it will undoubtedly instrument the internal call and add some extra nuggets of logging so that we can see the thread ID, request causality ID, etc. I’ve used this technique for a long time and wrote about it in more detail a few years back in “Execute Around Method - The Subsystem Boundary Workhorse”.

As my colleague rightly pointed out the behaviour of the HandleXxx() methods was really a spot of Aspect Orientated Programming (AOP), something I also noted myself at the end of that blog post. The question was whether it would be better to lift all that out and use a set of attributes on the method instead and then wire them up into the HTTP pipeline? I guess it would look something like this:

public class ThingController : ApiController
{
  [TranslateErrors][InstrumentCall][LogCall]
  public HttpResponseMessage Get(string id)
  {
      return _service.FetchThing(id);
  }
}

I wasn’t convinced; I don’t personally believe this makes the code considerably better. And if anything it makes it harder to test because all that logic has been lifted out of the call chain and pushed up the call stack where it’s now become part of the infrastructure, which I guess is the point. Yes, you can unit test the implementation of the attribute’s behaviour just as you can the HandleXxx() methods, but you can’t easily test it in-situ. Whilst testing happy paths via acceptance tests is easy, testing recovery from failure scenarios is often much harder and is where I would look to use component-level tests with fault-injecting mocks.
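By way of example, here is a hedged sketch of the kind of component-level test I mean, using a hand-rolled fault-injecting stub (the service interface, exception type and controller constructor are all assumptions for the purposes of illustration):

internal class FaultingThingService : IThingService
{
  public Thing FetchThing(string id)
  {
    // Inject the failure we expect HandleGet() to translate.
    throw new ThingNotFoundException(id);
  }
}

[Test]
public void missing_thing_is_translated_to_a_404()
{
  var controller = new ThingController(new FaultingThingService());
  controller.Request = new HttpRequestMessage();
  controller.Request.SetConfiguration(new HttpConfiguration());

  var response = controller.Get("unknown-id");

  Assert.That(response.StatusCode,
              Is.EqualTo(HttpStatusCode.NotFound));
}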

Any argument around forgetting to wrap the internal call with HandleXxx() can largely be dismissed as you could just as easily forget to add the attributes. Perhaps if the behaviour was mandatory so that you didn’t even have to do that on any API entry point it would be beneficial, but testing any non-happy path that shouldn’t return a 500 would highlight the missing HandleXxx() mistake right away. If it’s a genuine concern that there are so many entry points that one could be forgotten, I would suggest that the smell of a lack of Separation of Concerns was more prominent.

Most of the services I’ve worked on in recent years are small (by design) and so only have a handful of operations to implement. Those services might be fronted by ASP.Net, WCF, a message queue or even a classic socket listener, and the pattern remains pretty much the same across all of them. Of course, whilst the pattern is similar, one might argue that it’s not idiomatic ASP.Net.

My main argument though is that I’ve watched in amusement as developers tie themselves in knots and try to debug services made all the more complex by frameworks that allow all manner of things to be injected here and there with very little diagnostic support for when things go wrong. Although composing the functions might add a little extra cognitive load to the handler when you aren’t interested in the tangential concerns, it’s always there as an explicit reminder that servicing a request has a lot of potentially hidden costs around it. If you ever needed to profile those concerns (which I have in the past) it is trivial to host the class in a noddy test harness and profile it the same as any other code. This is an aspect of code reuse and low coupling that I personally value highly.

I’m sure once again that I’m just being a luddite and should embrace more of the underlying technology. But as I get older I find I have a lower tolerance for wrestling with some complex tool when I know there is another simple solution available. This quote from Brian Kernighan always sits in the back of my mind:

“Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?”

I also know we can always refactor to a better position in the future without much effort, if the need arises, because the tests will already be in place - their absence is often what makes this kind of refactoring so time consuming. This makes me wonder whether that’s what’s causing me to be cautious - confidence in my own tests?

 

[1] To me the ASP.Net controller classes are purely for marshalling data into the underlying service call and marshalling the data back out again for the response [2]. Actually most of that is handled by the framework so it’s really just exception translation and mapping responses, e.g. translating a missing resource to a 404. Whilst all the logic could go in the controller I find Visual Studio web projects very clunky in comparison to consuming simple class libraries and console applications / services.

[2] As I wrote that I wondered why the same is not true about using model binders to aid in the validation of the input arguments. I concluded that the validation logic is in the domain types, not the binder, which is merely an adapter. If required it would be fairly trivial to replace the typed parameters with strings and invoke the conversion logic in the handler body. In fact this is exactly how it was done in the first place.