Saturday, 6 July 2013

The Application-Wide Diagnostic Log - the One Acceptable Global Object?

Like many of my peers I have a major dislike for the Singleton pattern. Also like them this has probably come about from attempting to isolate code to write (unit) tests. Singletons can make that really hard. However, there are only two [1] things wrong with them:-

  1. Access is through a global variable which means it’s hard to see all the dependencies a piece of functionality relies upon
  2. There can only be one instance, ever and its lifetime is largely out of your control

Of course over time developers have managed to work around the second deficiency by adding hacks to allow you to switch the implementation during testing and to manually control the lifetime (an absolute must for a COM object). Even so we’re still left with the global variable problem.

Parameterise From Above

The obvious answer to global variables is to just pass them into functions as parameters instead [2]. A common complaint is that there are often too many to pass around, like database connections and remote service proxies. This leads to code such as this below where you’re stitching together components inside the logic:-

public List<Bug> FindOpenBugsForUser(string login)
{
    var connection = Connection.Instance;
    var dal = new BugDataMapper
(connection);

    var bugs = dal.SelectBugs(login);

    return bugs.Where((bug) => bug.IsOpen)).ToList();
}

The alternative is to not only pass something in, but also to make sure that the something is a higher-level abstraction. That is, define an interface for the logical service that this represents, and then pass that in from above:-

public interface IBugService
{
    List<Bug> SelectBugs(string login);
}

public List<Bug> FindOpenBugsForUser(string login,
                              IBugService bugService)
{
    var bugs = bugService.SelectBugs(login);

    return bugs.Where((bug) => bug.IsOpen)).ToList();
}

. . .

public BugService : IBugService
{
    private Connection _connection;
    . . .
}

By adhering to the Separation of Concerns and crating higher-level abstractions you can mitigate the most common complaint against passing everything around by parameter.

Diagnostic What?

Good. So hopefully that opening paragraph will convince you what I’m going to suggest a bit later doesn’t mean I’m totally cuckoo.

Although I have a dislike of logging per-se (a post for another day) I still do it [3]. I’ve even written about some of the things you’ll want to consider when you’re using logging as a diagnostic tool in “Diagnostic & Support User Interfaces”. And that is specifically the kind of logging I’m referring to here - log messages designed to aid in the support of an application or system. What I’m definitely not talking about is custom logs generated to provide some specific report on one particular feature within an application (e.g. a data import exception report). Those sort of reports are quite often a functional requirement and so probably require testing in their own right.

Nope, what I’m talking about here is the application-wide log. This is commonly used to persist diagnostic events (to a file, the event log, etc) to help the support and development teams understand what is happening inside the code. In a suite of applications, or a system comprised of many components there is a desire to keep things uniform; this applies both to the code used to generate those events and also to the (persistent) format in which its rendered.

Desirable Properties

Given that one of the main bugbears of a singleton logger is how it makes testing hard, let’s take a test-first approach to deciding what kind of behaviour we’d need from any design we end up choosing:-

  1. It must, by default have no impact on unit testing. I should not need to do anything to write a test for existing code or be able to add logging to existing code without it breaking the build
  2. Unless of course I actually want to sense the logging is happening, such as in a non-functional test like bootstrap code or where the Execute Around Method pattern has been used to hide an internal mechanism
  3. When I do come to mock the logger I should only need to implement a simple interface and have no other dependencies on any of the production logging code (or 3rd party logging library), except perhaps for simple types like a “severity” enumeration
  4. If I do need more control over how my code writes log messages, such as by enabling extra logging on-the-fly, I should still be able to write to the application-wide log through the appropriate mechanism

Basically what I’m suggesting is that we employ the Pareto Principle to diagnostic logging - 80 percent of our code accesses the log in the easiest way possible and minimises the noise in the code, whilst the other 20% does its own thing whilst fitting in with the conventions laid out earlier. Of course in real life it’s more like 99% / 1%. In fact in the 20 years I’ve been programming professionally I can count on one finger the number of times I’ve needed to do custom logging. It was in an ISAPI extension and we needed to offload the writes for performance reasons. Of course I’ve only worked on about a dozen or so different “products” in that time so it’s not exactly a sound statistical measurement - your mileage will undoubtedly vary.

Logger Interface + Static Facade

And so we finally come to the “pattern” I’ve seen used and have continued to use myself in the absence of anything better. Essentially it boils down to a static facade which provides the level of convenience for the vast majority of code, combined with an interface used to decouple the actual log sink(s) implementation from the caller.

The static facade looks like this:-

public static class Log
{
  // The application-side log.
  public ILogger ApplicationLog { get; set; }

  // Write an error message
  public void Error(string format, params object[]
                                              args)
  {
    if (ApplicationLog != null)
      ApplicationLog.Error(format, args);
  }
  . . .
  // Other helpers for logging warnings, etc. 
}

What should be fairly obvious is that our first requirement is met by virtue of there being no logger configured by default. The “!= null” checks in the forwarding methods also means that any attempt to log when nothing is configured is benign. That’s our core unit testing scenarios catered for.

The public access to the actual logger implementation (the ApplicationLog property) also helps us in the odd cases where we need to mock the logger. It also aids our fourth requirement where we might want more control as we can just pass the real logger in from above. Extra “!= null” checks would be needed in the client code and so one alternative would be to create an Adapter that forwards the logging calls to the static facade, thereby letting the facade handle the presence or absence of a logger implementation.

The logger interface is essentially just a non-static version of the same API:-

public class ILogger
{
  // Write an error message
  void Error(string format, params object[] args);
  . . .
  // Other helpers for logging warnings, etc. 
}

That pretty much sorts out our third requirement. I can easily create a Null Object implementation, although with the checks that’s fairly pointless unless you want to avoid them. I can also easily mock it by storing the messages in a List<string> which makes testing a doddle.

public class LoggerFake
{
  public List<string> Messages = new List<string>();

  // Write an error message
  public void Error(string format, params object[]
                                               args)
  {
    var message = String.Format(format, args);

    Messages.Add(message);
  }
  . . .
}

Creating the Logger

So, if the default behaviour is not to create a logger, you might be wondering where it gets created. Why, in Main() of course. As I suggested in “Assume Failure by Default”, Main() is the place to bootstrap your process by invoking any factories to create and associate your various higher-level abstractions. You might not actually be in control of the real Main(), unless you put a little legwork in, such as with WCF hosting. In which case your Main() might be an Initialise() method that is invoked instead by whatever your container process is.

Conclusion

Well, there it is. By this point you’re probably pumped and shouting obscenities at the screen claiming I’m an imbecile. For one thing I’ve got a public property on the facade that surely just any old developer could change? That’s true, but one of the major problems with the Singleton pattern stems from its attempt to try and protect developers from themselves - engineering is all about trade-offs. In my view anything that makes testing easier is generally A Good Thing, even if it opens other back doors that could be abused. Just so long as they’re hard to abuse.

As I see it diagnostic logging just is a different beast. On the one hand it has the ability to cut across so much of the codebase, and yet its non-functional nature means that if it doesn’t get executed you’re no worse off. This is in stark contrast to something like a database connection or service proxy where it has a direct impact on the functional behaviour of the software.

Perhaps I am barking up the wrong tree? But so far this one concession to global variables has, I believe, been worth putting up with. Perhaps like the much derided GOTO they do still have a place, but only as long as you go in with your eyes wide open.

 

[1] I’m assuming you know that these also happen to be the two defining characteristics of the Singleton pattern.

[2] The PFA Papers by Kevlin Henney is an excellent read if you want a more serious critique of the whole affair.

[3] Much as I agree with what Steve Freeman has to say on the subject, I can also see why he says “I won’t go through the whole discussion here but, somehow, I don’t think I’m winning this one”. I just still can’t get my head totally around this idea.

4 comments:

  1. Nope I dont buy it.

    I would offer Encapsulated Context to the mix, you can freely ad a Facade if you want.

    http://www.allankelly.net/patterns/encapsulatedcontext.html

    ReplyDelete
  2. I've currently settled on a mix of a log target that can be passed around, and a TLS mechanism that can locate the current log target for the current thread. The log target is captured into a function local logger, so all logging in a function involves no further look-up.

    A default log target is created on program start-up - which just logs fatal error messages to the console only. Scoped log targets can then be created by anyone at any point and anything within that scope gets it (whether passed it or looking up in TLS). This makes it easy to test/ mock. It also allows us to bridge to a host logging system when running in a certain context - with no dependencies on that system at the lowest levels.

    So we appear to have our cake and eat it.
    It also performs pretty well.

    Of course some of the design decisions here were based on it being a large legacy system. Starting from scratch I usually try to use PfA throughout - usually via some form of context object (as Allan suggests). It's doable but I've not had opportunity to do it at sufficient scale to completely commit to it just yet. I still find context objects tend to become a little brittle - but perhaps I'm just not refined enough in my usage yet.

    ReplyDelete
  3. Encapsulated Context (EC) would be the pattern of choice, except that it feels more like an anti-pattern to me to be passing the same context to *every* single method throughout the codebase. Even if you don't pass it up front, the ripples when you do need to could be huge.

    The pattern description also suggests to me that logging is a functional requirement there, whereas I'm specially targetting the non-functional diagnostic logging. Because it's non-functional that also means it doesn't stop the code being used in any other environment, which EC lists as a benefit.

    To date I cannot think of a time when EC (as applied specifically to diagnostic logging) would have been benefitial, it could only have added a lot of unnecessary noise, IMHO.

    ReplyDelete
  4. I've vacillated on this over the years.

    As I say, where I am currently is thinking that logging is something where the benefits of the global access may outweigh the downsides - if you can minimise the downsides.

    In my case having the ability to still swap out the log target for a mock one for testing is key.
    It's also thread safe (I don't admit this to everyone but we actually have a thread safe phoenix singleton template - taking into account all the issues with DLCP in C++).

    So I'm fairly happy with the compromises. For me the jury is still out on using context objects for this sort of thing. I've made it work on the small scale but even then not been convinced it's worth it when it gets too fine grained.

    ReplyDelete