Wednesday, 18 September 2013

Virtual Methods in C# Are a Design Smell

I came to C# after having spent 15 years writing applications and services in C++. In C++ there is no first class concept for “interfaces” in the same way that C# and Java has. Consequently they were emulated by creating a class that only contained pure virtual functions:-

class IConnection
{
  virtual void Open() = 0;
  virtual void Send(Message message) = 0;
  virtual void Close() = 0;
};

C# != C++

In C# however the interface is a first class concept and naturally when I started using C# I automatically made the implementation of my interface methods virtual [1]:-

interface IConnection
{
  void Open();
  void Send(Message message);
  void Close();
}

public class Connection : IConnection
{
  public virtual void Open()
  { }
  public virtual void Send(Message message)
  { }
  public virtual void Close()
  { }
}

Of course I very quickly discovered though that interfaces in C# do not rely on v-tables in the same was as C++ and that an interface method is altogether another concept over and above what virtual methods give you. Throw in Explicit Interface Implementation and you can seriously mess with the Liskov Substitution Principle (the L in SOLID) [2].

Once I understood that interface methods didn’t need to be virtual I also realised that most of the classes I create, and have been creating, are not actually polymorphic. In C++ they tend to look that way because it’s the only mechanism you have, whereas in C# it seems much clearer because of the more formal distinction between implementation inheritance and just implementing an interface. Putting aside my initial fumblings with C# I think I can safely count on one hand (maybe two) the number of times I’ve declared a method “virtual” in C#.

Testability

A common reason to mark a method as virtual is so you can override it for testing. The last time I remember creating a virtual method was for this exact purpose. I had a class I wanted to unit test but it ran “tasks” on the CLR thread pool which made it non-deterministic and so I factored out the code that scheduled the tasks into a separate protected virtual method:-

public class Dispatcher
{
  . . .
  public void DispatchTasks()
  {
    . . .
    foreach (var task in tasks)
        Execute(task);
    . . .
  }

  protected virtual void Execute(MyTask task)
  {
    ThreadPool.QueueUserWorkItem((o) =>
    {
      task.Execute();
    });
  }
  . . .
}

This allowed me to derive a test class and just override the Execute() method to make the asynchronous call synchronous by using the same thread:-

public class TestDispatcher : Dispatcher
{
  protected override void Execute(MyTask task)
  {
    task.Execute();
  }
}

Favour Composition Over Inheritance

Looking back, what I probably missed was a need to separate some concerns. The dispatcher class was responsible for not only finding the work to do, it was also responsible for queuing and scheduling it too. My need to mock out the latter concern to test the former should have been a smell I picked up on. In many cases where I’ve seen virtual methods used since it’s been to mock out a call to an external dependency, such as a remote service. This can just as easily be tackled by factoring out the remote call into a separate class with a suitable interface, which is then used to decouple the caller and callee. This is how you might apply that refactoring to my earlier example:-

public interface IScheduler
{
  void Execute(MyTask task);
}

public class ThreadPoolScheduler : IScheduler
{
  public void Execute(MyTask task)
  {
    ThreadPool.QueueUserWorkItem((o) =>
    {
      task.Execute();
    });
  }
}

public class Dispatcher
{
  public Dispatcher()
    : this(new ThreadPoolScheduler())
  {  }

  public Dispatcher(IScheduler scheduler)
  {
    _scheduler = scheduler;
  }

  public void DispatchTasks()
  {
    . . .
    foreach (var task in tasks)
        _scheduler.Execute(task);
    . . .
  }
  . . .
  private IScheduler _scheduler;
}

I could have forced the client of my Dispatcher class to provide me with a scheduler instance, but most of the time they would only be providing me with what I chose as the default anyway (there is only 1 implementation) so all I’m doing is making it harder to use. Internally the class only relies on the interface, not the concrete type, and so if the default eventually becomes unacceptable I can remove the default ctor and lean on the compiler to fix things up.

Too Much Abstraction?

One argument against this kind of refactoring is that it falls foul of Too Much Abstraction. In this example I believe it adds significant value - making the class testable - and it adds little burden on the client too. In general I’ve worked on codebases where there is too little abstraction rather than too much. Where I have seen excessive layers of abstraction occur it’s been down to Big Design Up-Front because code that is built piecemeal usually has abstractions created out of a practical need rather than idle speculation.

 

[1] Personally I blame COM which had #define interface struct so that you could use the “interface keyword” in your C++ code.

[2] Thanks goes to Ian Shimmings for showing me where using Explicit Interface Implementation goes against LSP for what looks like an acceptable reason - Fluent Interfaces.

No comments:

Post a Comment