Thursday, 2 June 2016

My Kingdom for a C# Typedef

Two of the things I’ve missed most from C++ when writing C# are const and typedef. Okay, so there are a number of other things I’ve missed too (and vice-versa) but these are the ones I miss most regularly.

Generic Types

It’s quite common to see a C# codebase littered with long type names when generics are involved, e.g.

private Dictionary<string, string> headers;

public Constructor()
{
  headers = new Dictionary<string, string>();
}

Dictionary<string, string> FilterHeaders(…)
{
  return something.Where(…).ToDictionary(…);
}

The overreliance on fundamental types in a codebase is a common smell known as Primitive Obsession. Like all code smells it’s just a hint that something may be out of line, but often it’s good enough to serve a purpose until we learn more about the problem we’re trying to solve (See “Refactoring Or Re-Factoring”).

The problem with type declarations like Dictionary<string, string> is that, as their very name suggests, they are too generic. There are many different ways that we can use a dictionary that maps one string to another and so it’s not as obvious which of the many uses applies in this particular instance.

Naturally there are other clues to guide us, such as the names of the class members or variables (e.g. headers) and the noun used in the method name (e.g. FilterHeaders), but it’s not always so clear cut.

The larger problem is usually the amount of duplication that can occur both in the production code and tests as a result of using a generic type. When the time comes to finally introduce the real abstraction you were probably thinking of, the generic type declaration is then littered all over the place. Duplication like this is becoming less of an issue these days with liberal use of var and good refactoring tools like ReSharper, but there is still an excess of visual noise in the code that distracts the eye.

Type Aliases

In C++ and Go (amongst others) you can help with this by creating an alias for the type that can then be used in it’s place:

typedef std::map<std::string, std::string> headers_t;

headers_t headers; 

There is a special form of using declaration in C# that actually comes quite close to this:

using Headers = Dictionary<string, string>;

private Headers headers;

public Constructor()
{
  headers = new Headers();
}

Headers FilterHeaders(…)
{
  return something.Where(…).ToDictionary(…);
}

However it suffers from a number of shortcomings. The first is that it’s clearly not a very well known construct – I hardly ever see C# developers using it. But that’s just an education problem and the reason I’m writing this, right?

The second is that by leveraging the “using” keyword it seems to mess with tools like ReSharper that want to help manage all your namespace imports for you. Instead of keeping all the imports at the top whilst keeping alias style using’s separately below it just lumps them all together, or sees the manual ones later and just appends new imports after them. Essentially they don’t see a difference between these two different use cases for “using”.

A third problem with this case of “using” though is that you cannot compose them. This means you can’t use one alias inside another. For example, imagine that you wanted to provide a cache of a user’s full name (first & last) against some arbitrary user ID, you wouldn’t be able to do this:

using FullName = Pair<string, string>;
using NameCache = Dictionary<int, FullName>;

It’s not unusual to want to compose dictionaries with other dictionaries or lists and that’s the point at which the complexity of the type declaration really takes off.

Finally, and sadly the most disappointing aspect, is that the scope of the alias is limited to just the source file where it’s defined. This means that you have to redefine the alias wherever it is used if you want to be truly consistent. Consequently I’ve tended to stick with only using this technique for the implementation details of a class where I know the scope is limited.

Inheritance

One alternative to the above technique that I’ve seen used is to create a new class that inherits from the generic type:

public class Headers : Dictionary<string, string>
{ }

This is usually used with collections and can be a friction-free approach in C# due to the way that the object-initialization syntax relies on the mutable Add() method:

var header = new Headers
{
  { “location”, “http://chrisoldwood.com” },
};

However if you wanted to use the traditional constructor syntax you’d have to expose all the base type constructors yourself manually on the derived type so that they can chain through to the base.

What makes me most uncomfortable about this approach though is that it uses inheritance, and that has a habit of leading some developers down the wrong path. By appearing to create a new type it’s tempting to start adding new behaviours to it. Sometimes these behaviours will require invariants which can then be violated by using the type through it’s base class (i.e. violate the Liskov Substitution Principle).

Domain Types

Earlier on I suggested that this is often just a stepping stone on the way to creating a proper domain type. The example I gave above was using a Pair to store a person’s first and last name:

using FullName = Pair<string, string>;

Whilst this might be palatable for an implementation detail it’s unlikely to be the kind of thing you want to expose in a published interface as the abstraction soon leaks:

var firstName = fullName.First;
var lastName = fullName.Second;

Personally I find it doesn’t take long for some generics like Pair and Tuple to leave a nasty taste and so I quickly move on to creating a custom type for the job (See “Primitive Domain Types - Too Much Like Hard Work?”):

public class FullName
{
  public string FirstName { get; }
  public string LastName { get; }
}

This is another one of those areas where C# is playing catch-up with the likes of F#. Creating simple immutable data types like these are the bread-and-butter of F#’s Record type; you can do it with anonymous types in C# but not easily with named types.

Strongly-Typed Aliases

One other aside before I sign off about typedefs is that in C, C++, and the “using” approach in C#, they are not strongly-typed types, they are just simple aliases. Consequently you can freely mix-and-match the alias with the underlying type:

var headers = new Headers();
Dictionary<string, string> dictionary = headers;

In C++ you can use templates to workaround this [1] but Go goes one step further and makes the aliases more strongly-typed. This means that there is no implicit conversion even when the fundamental type is the same, which makes them just a little bit more like a real type. Once again F# with its notion of units of measure also scores well on this front.

[1] The first article I remember about this was Matthew Wilson’s “True Typedefs”, but the state of the art has possibly moved on since then. The Boost template metaprogramming book also provides a more extensive discussion on the topic.

4 comments:

  1. Since you brought F# up I think it covers you on every single one of your points - but particularly in making the whole thing moot by making it so easy to define your domain types in the first place!
    That said you'd usually keep the generic collection types - nothing wrong with those as a rule.

    Also C++ has the using syntax for type aliases too, now (still not strongly typed, though).

    ReplyDelete
  2. I've pretty much given up trying to do anything sensible in C# now. The type system is just too weak and it's way too much effort to create proper domain types especially when you consider equality.
    The concept of Domain types seems alien to most of the C# people I've worked with and I met with quite a lot of resistance when I mentioned trying to move away from primitive types on a recent code review.

    As Phil says F# makes it really easy to define domain types which also provide proper equality for free. I now write all my code in F# where possible and only use C# at work where nobody really cares about quality anyway.

    ReplyDelete
  3. When you google C# typedef, you discover that most people are horrid programmers ... they blabber about how they love to make things explicit and your IDE can you with typing (on the keyboard), totally oblivious to issues of encapsulation and DRY. This article is refreshingly different ... though you miss some of the awful C# restrictions, e.g., you can't inherit from arrays, and `using` doesn't allow forward referencing. I've got a grid that is a Cell[,] but no way will C# allow me a simple

    alias Grid = Cell[,];

    I could wrap the Cell[,] in a Grid class, but that means a double indirection on every access and a bunch of pointless boilerplate, just because the language designers are authoritarians with ridiculous ideas about forcing their misconceived views on programmers. Another example of this is that you cannot reuse a name in a nested construct, which is a constant source of aggravation when liberally using lambda expressions, and results in bugs, accidentally using the natural outer name instead of the contrived inner name.

    ReplyDelete
  4. You can take some of the pain away by using a general composition class using the "Curiously Recurring Template Pattern" .. see my answer at https://stackoverflow.com/a/51295920/460084

    ReplyDelete