Tuesday, 20 December 2011

Diagnostic & Support User Interfaces

When people talk about user interfaces they are invariably talking about the face of a product or system that the end-users see and/or interact with. But within a complex system there are other user interfaces too that are not seen or interacted with by end-users at all; instead it is fellow developers and, more importantly, support staff that get to see this less salubrious side.

In a small, well-established team of developers there will naturally be a lot of cross-fertilisation, meaning that they will be well versed in significant portions of the system. In contrast, a larger team will have its skills spread across different disciplines, and so from a support perspective any individual’s knowledge of the system as a whole starts to approach that of a support team - which is far less. In my experience support teams do not get the luxury of devoting themselves to learning one system inside out; instead they invariably have to support a whole bunch of in-house systems written by different teams, using different technologies and having different ideas about what is required to make a product or system “supportable”.

So what kinds of interfaces am I referring to? Well, the most traditional are status monitoring pages and custom admin tools; the kind of utilities built specifically for the purpose of administering and supporting the system. Due to the time constraints imposed and the lack of direct business value they are usually not given any real TLC but are just thrown together, presumably under the misguided assumption that they will never really be used except by their authors or close colleagues. But these scripts and command line tools are not the only faces of the system you’ll see when doing support duties; before that you’ll probably be faced with hunting through the custom format log files, which may or may not be easily consumable. Stretching the analogy somewhat further you could encompass configuration files and the database schema as interfaces of sorts. They’re not graphical in themselves, but careful thought about their structure can ensure that off-the-shelf tools stand more chance of making interaction a productive experience.

Anyway, here is my list of things to think about. I’ve not done any “production” UI programming for a few years and it’s been even longer since I read About Face (the 1st edition no less) so I’ll no doubt contradict some modern wisdom along the way. Your target audience may not be novice computer users, but they do have the clock against them and they won’t thank you for making an already difficult situation any harder.

Command-Line Tools

Server-side developers aren’t renowned for their user interface programming. After all, that’s why you have UX designers and developers whose major skills lie in the area of UI programming. But that doesn’t mean we have to be sloppy about the interfaces for the scripts and tools we do create. I’m not suggesting for a moment that we spend hours gold-plating our utilities, but I am proposing that you adhere to some of the common conventions that have been established over the years:-

Provide a help switch (-?, -h, --help) so that the user can see what the options are. Documentation, if you can even find it for a utility, can easily get out of step with the tool usage and so you need to be able to see which switch it is that you’re apparently missing, or that is no longer relevant for the task at hand. Even if you’re au fait with the tool you might just want to double-check its usage before going ahead and wiping out last year’s data.

Provide a version switch (-v, --version) so the user can check its compatibility, or if it’s become out of date. Many tools are installed on the application servers or alongside client binaries, but sometimes those tools are just “passed around” and then become available on some desktop share without you realising. You can even dump the version out on every invocation as the first line; this can be particularly useful if you’re capturing stdout when running it as a scheduled task.
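
By way of illustration, a minimal C# sketch of that first-line banner (the class name here is purely illustrative) might be:-

using System;
using System.Reflection;

public static class VersionBanner
{
  // Writes the tool name and version as the first line of output so any
  // captured stdout (e.g. from a scheduled task) records exactly what ran.
  public static void Write()
  {
    var name = Assembly.GetExecutingAssembly().GetName();
    Console.WriteLine("{0} v{1}", name.Name, name.Version);
  }
}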

Provide long switch names (--switch-name) by default and use them in documentation so it is obvious what the example command is doing. Short switch names are for advanced users and so should be considered the optional case. Positional parameters may save the user some typing but when you use concepts like integer based surrogate keys it’s not exactly obvious what an example may be suggesting:-

C:\> DC 1234 2001-02-03 1

compared to:-

C:\> CustomerAdmin delete --customer-id 1234
  --date 2001-02-03 --force

Use a consistent switch format. I’ve had to use utilities where one developer has made the switch names case-sensitive and another case-insensitive. Sometimes they use ‘/’ as the switch marker and other times it’s a ‘-‘. Often the case-sensitivity issue is down to lazy programming - using == or Equals() instead of the correct method. If you’re on Windows try and support both styles as even Microsoft’s developers aren’t consistent. Better yet download a 3rd party library that lets you use either style as there are plenty to choose from[#].
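
As a sketch of the comparison I mean, the following hypothetical helper treats the ‘/’, ‘-’ and ‘--’ styles and either casing as equivalent:-

using System;

public static class SwitchMatcher
{
  // Treats "/force", "-force" and "--FORCE" as the same switch instead of
  // relying on a case-sensitive == comparison.
  public static bool Matches(string argument, string switchName)
  {
    var name = argument.TrimStart('/', '-');
    return string.Equals(name, switchName,
                         StringComparison.OrdinalIgnoreCase);
  }
}

Matches("/Force", "force") and Matches("--force", "force") would then both return true.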

Return a meaningful result code. The convention (on Windows) is that zero means success and non-zero means failure or partial success. Some tools do provide more detailed result codes, but fundamentally you should assume that anyone trying to automate your tool on Windows is going to write something along these lines as a first-order approximation:-

CustomerAdmin.exe --do-something-funky
if errorlevel 1 exit /b 1

Sadly, 0 is also the default result code that most frameworks will return for an unhandled exception which really makes the whole process rather messy and means that you need to get a Big Outer Try Block in place in main() as part of your bootstrap code.
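
A sketch of the kind of bootstrap I mean, mapping an unhandled exception onto a non-zero exit code, might look like this (RunTool is a stand-in for the tool’s real entry point):-

using System;

public static class Program
{
  // The Big Outer Try Block: anything that escapes the real work is
  // reported on stderr and turned into a failure code for the caller.
  public static int Main(string[] args)
  {
    try
    {
      RunTool(args);
      return 0;
    }
    catch (Exception e)
    {
      Console.Error.WriteLine("ERROR: {0}", e.Message);
      return 1;
    }
  }

  static void RunTool(string[] args)
  {
    // . . . the tool's real work goes here . . .
  }
}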

Support a read-only/test/what-if mode. Most of my support tools have grown out of test harnesses and other development tools and so I’ve naturally added switches that allow you to run them against production in a benign mode so that you can see what would happen. PowerShell has taken this idea forward formally with its common -WhatIf switch. When you’re not 100% sure of what the expected action is or how many items it will act upon this can be a life saver.
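
In a hand-rolled tool the same idea can be as simple as a flag that guards the destructive step; a sketch with invented names:-

using System;

public static class CustomerPurger
{
  // In --what-if mode we only report what would be done; the destructive
  // step is skipped entirely.
  public static void Purge(int customerId, bool whatIf)
  {
    if (whatIf)
    {
      Console.WriteLine("Would delete customer {0}", customerId);
      return;
    }

    Console.WriteLine("Deleting customer {0}", customerId);
    Delete(customerId);
  }

  static void Delete(int customerId)
  {
    // . . . the real, destructive work goes here . . .
  }
}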

Most of the same comments apply to scripts, but they have the added advantage that they are deployed in source code form and so you have the ability to see the logic first hand. Of course when you’re in the thick of it you need to be able to trust the tool to do what it says it will.

Logging

When faced with a production issue, log files will often become the first port of call after you’ve established which process it is that has failed. But log files[+] often serve two different masters and the latter one can get forgotten once the development work has finished. During development it is probably the developer that spends the most time poring over the output and therefore they have a habit of dumping every last bit of detail out. After the feature is complete though the main consumer then becomes the support team. What they don’t want to see is reams of output in which the proverbial wood is lost for the trees.

Make sure your log messages make sense. This may seem obvious but spelling mistakes, whilst cute in a time of stability, are a distraction when you’re up to your eyes in it. Log messages should form a narrative that leads you up to the moment of interest, and that narrative should be written with the reader in mind - most likely a support engineer, someone who doesn’t know the code intimately. By all means be succinct, but ensure you’re not being ambiguous either.

Use the console output for high-level commentary, warnings and errors only. If extra output is required it should be enabled by an extra --verbose switch (which can easily be added temporarily under automation) or by having a companion log file with more detail. I tend to favour the second approach so that the detail is always there if needed but so that it’s not in your face the majority of the time. Remember we’re talking about in-house stuff here and so you have the luxury of controlling the environment in ways a 3rd party can often only dream of.

Keep to a low number of severity levels. If you have a large number of severity levels developers will not be able to log messages consistently. I recently saw a tweet that suggested you use only the following levels - FYI, WTF & OMG. If you put yourself in the support person’s shoes that’s actually a pretty good scale. Once the issue is triaged and the development team are involved the TRACE or DEBUG level output suddenly becomes of more use, but until then keep it simple.

Write the messages as text. A simple line-based text format may seem very old-fashioned, but there are a plethora of tools out there designed to slice-and-dice text based output any way you want, e.g. grep, awk & sed. Plus you have modern GUI based Tail programs that support highlighting to really make those important messages shine, e.g. BareTail & LogExpert. Layer on top something like LogParser, which supports a SQL-like syntax and many formats out of the box, and you’re in good shape to triage quickly. Also don’t underestimate the importance of Notepad as a tool of last resort.

Make the log message format simple. If you’re writing a separate log file include at least the date & time, and preferably the process ID (PID). If the tool is multi-threaded then the thread ID (TID) will be invaluable as well, along with the highest precision[*] of time (e.g. milliseconds or better) to spot potential race conditions. I’ve found the following format to be very easy to scan by eye and parse by tool as the left-hand side is of fixed width (the PID & TID are padded appropriately):-

<ISO date & time> <PID> <TID> <severity> <message...>

2001-01-01 01:02:03.456 1234 3456 INF Starting...
2001-01-01 01:02:04.123 1234 5678 ERR Stuff broke...
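
A sketch of a helper that produces that layout (the field widths are arbitrary here and the managed thread ID stands in for the native TID) might be:-

using System;
using System.Diagnostics;
using System.Threading;

public static class LogLine
{
  // Fixed-width left-hand side: ISO date & time, padded PID & TID and a
  // 3-letter severity, followed by the free-format message.
  public static string Format(string severity, string message)
  {
    return string.Format("{0:yyyy-MM-dd HH:mm:ss.fff} {1,5} {2,5} {3} {4}",
                         DateTime.Now,
                         Process.GetCurrentProcess().Id,
                         Thread.CurrentThread.ManagedThreadId,
                         severity,
                         message);
  }
}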

Enclose string values in some form of quotes or braces to make empty strings more visible. As I posted very recently, empty strings are a form of Null Object and so they have a habit of being silently propagated around before finally causing odd behaviour. Personally I like to enclose strings in single quotes because I often have to use those values in SQL queries and so then it’s already quoted for me. As an example I once spent far longer than necessary investigating an issue because the log message said this:-

Valuing trade ...

instead of:-

Valuing trade ''...

Include all relevant context, preferably automatically. Messages that just tell you something bad happened are useless without the context in which the operation is taking place. There is nearly always some higher-level context, such as a request ID or customer name, that you can associate with the PID or TID to help you narrow down the data set. Debugging a complex or lengthy process just to recover the set of inputs, only to find that the data is dodgy, is not a good use of time and costs the business money, especially if the relevant variables are in scope at the point of writing the error message![^]
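
As a contrived illustration (the names are invented, but see the footnote below for the real-world inspiration), the difference between a useless message and a useful one is often just the variables already in scope:-

using System;

public static class Correlator
{
  public static void ReportFailure(string baseCurrency, string quoteCurrency)
  {
    // Bad: "Failed to correlate pairs" tells the reader nothing.
    // Better: include the inputs that were already to hand.
    Console.Error.WriteLine("ERR Failed to correlate pairs '{0}' and '{1}'",
                            baseCurrency, quoteCurrency);
  }
}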

Don’t spam the log, write a summary. If you have a process that deals with particularly dirty data, or you have a service that is temperamental, try not to log every single indiscretion as a warning or error. Instead try and batch them up, perhaps by item count or time window, and write a single message that gives a summary of the issue. At a high level you’ll just be interested in seeing whether the problem is getting worse, so using a tolerance is even better: once you have an idea of what “normal operation” is you can save the scary messages for when they’re really needed.
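
A sketch of that batching idea, counting the indiscretions and writing a single summary per time window (the window length is plucked out of the air), could be:-

using System;

public class WarningSummariser
{
  private readonly TimeSpan _window = TimeSpan.FromMinutes(5);
  private DateTime _windowStart = DateTime.Now;
  private int _count;

  // Count each indiscretion silently and only write one summary message
  // per window, rather than spamming the log with every occurrence.
  public void Record(string description)
  {
    ++_count;

    if (DateTime.Now - _windowStart < _window)
      return;

    Console.WriteLine("WRN {0} occurrence(s) of '{1}' in the last {2} minutes",
                      _count, description, _window.TotalMinutes);
    _count = 0;
    _windowStart = DateTime.Now;
  }
}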

One of the reviews I do during system testing is to look at the log output. I try and put myself in the position of the support engineer and read the log messages to see if they make sense to me (I often only have a vague appreciation of the process in question). The thing I’m most looking out for is log spam, which is usually solved by downgrading messages from, say, INFO to TRACE so that the main output is all about the big picture. It should contain virtually no warnings or errors, to avoid support staff becoming desensitised by noise, and when they are present they should hopefully be backed by more detailed diagnostics somewhere else.

 

[*] Don’t be fooled into thinking that because a datetime value can be formatted with a milliseconds part that it is actually that accurate. IIRC 16-bit Windows used to have a resolution of 50 ms whilst the NT lineage was better at 10 ms. If you wanted better accuracy than that you needed to resort to the Performance Counters API.

[#] My own free C++ Core library has a reasonably decent parser should you want to tap into the 3rd party option.

[+] I have mixed sentiments about the use of log files. On the one hand I feel they should largely be unnecessary because we should be able to anticipate, test for and handle most types of failure gracefully. But I’m also aware that in-house systems rarely have that level of robustness as a requirement and so we’re often forced into using them as a defensive measure against inadequate test environments and the other in-house and external systems we depend on.

[^] I clearly remember spending a few hours at the weekend debugging a financial process that was failing with the message “Failed to correlate pairs”. What annoyed me most was that the developer had the two failing currency names in variables in scope but had not bothered to include them in the message! I fixed the error message first thing Monday morning...

Sunday, 11 December 2011

New SQL Tools - SS-Unit, SS-Cop & sql2doxygen

At my current client I’ve spent far more time working with SQL (and specifically SQL Server) than ever before; but that is probably quite apparent from the bias of blog posts over the last year or so. With any new venture you soon find yourself drawing on your past experiences and in this case I’ve been looking for some of the equivalent tools in the SQL world that I’d normally use with C# & C++. But I’ve found them hard to come by. It’s possible I’m looking in the wrong places, but it’s not as if there is a stream of vendor adverts trying to distract me either, which gives me the feeling it’s not worthy of a prolonged hunt.

There is another reason why I may have not searched overly hard, and that’s because I find it more fun to build something myself - it’s a great learning exercise. Mind you, that’s not the right answer for my client, where we have both money and far more important things to be doing than building tooling - assuming, that is, we can find the right alternatives[*]. What I’m releasing below are my clean-room[+] versions of some of the tools we knocked up to get ourselves going.

Naturally, as these begin to surpass what we lashed up, they become 3rd party candidates themselves to take over the same duties, and my client wins too because it no longer has to consider supporting the code internally. I’ve always made the source code available for my stuff so that others have a chance to tinker or fix bugs and these are no exception. Of course, given that they are T-SQL based (or PowerShell in the case of sql2doxygen), it would be pretty hard for me to keep it secret anyway!

SS-Unit - SQL Server Unit

This is a pure T-SQL based unit testing framework. You write your unit tests, in T-SQL, in the familiar xUnit style and then execute them by invoking the SS-Unit test runner (a stored procedure). The framework has support for both the test and fixture level SetUp and TearDown helpers which are more prevalent in SQL tests because of the static data dependencies you often need to satisfy. It also distinguishes between being run interactively, in something like SQL Server Management Studio (SSMS), and being run in batch mode with, say, SQLCMD so that it can be integrated into your Continuous Integration system for automated SQL unit testing.

The .zip file package, which includes a trivial example database and the unit tests for the framework itself, is available on my web site here on the SQL section page. There is also an online copy of the manual if you just want to have a nosy.

SS-Cop - SQL Server Schema Cop

Earlier this year I posted about “DbCop”, a StyleCop/FxCop like tool that a team-mate had put together for use on the project. That generated quite a bit of interest and a few people asked if there was any chance of it being released. As I point out in a footnote[+], that isn’t really an option, so instead here is my take on the idea. Clearly FxCop does some pretty deep analysis and I’m not suggesting this is going to provide anything of that magnitude. But there are a number of checks we can do to ensure style conventions are being adhered to and common mistakes, like forgetting a primary key, are avoided. I have a few other new ideas too, such as trying to detect when a foreign key might be missing.

The .zip package is available from the same SQL page as SS-Unit and also contains an online copy of the manual. Not surprisingly the unit tests for SS-Cop are written using SS-Unit and so you could consider it a more realistic example of how to use SS-Unit - SS-Cop was written using a Test-Driven Development (TDD) approach.

sql2doxygen - SQL Doxygen Input Filter

Funnily enough the chap that wrote DbCop was the same person that introduced me to Doxygen - a tool for generating API style documentation from source code. What attracted me was the fact that it could generate useful documentation even when no effort had been made by the developer! Better still, with only a small change in your comment style Doxygen could handle so much more - no horrendous tags or mark-up like you get with JavaDoc or the standard C# offering. There are still a bunch of special tags if you need them, but they don’t detract from the comments, which is exactly what you need when you’re actually trying to read them to understand the source code.

Anyway, Doxygen supports many programming languages out of the box, but sadly not SQL. However it does provide you with a mechanism for supporting any language as long as you can transform the source files into something Doxygen understands; these are called input filters and are configured with the INPUT_FILTER setting. This is the command line of a process or batch file to execute for each source file and in the case of sql2doxygen it’s a PowerShell script that turns T-SQL code into C.

The sql2doxygen .zip package contains the Doxygen filter script and a manual describing what T-SQL styles the script understands. It is of course available from the same page I’ve mentioned twice already above. Both SS-Unit and SS-Cop are documented with Doxygen using the sql2doxygen filter and so examples of the output are available online here and here.

 

[*] These tools are not deployed into production and so if we picked, say, an open source project that went stale and became unusable, or bought a tool from a lesser-known vendor that went bust, the direct impact on the business would be none. There is of course an indirect impact because we use the tools to help us deliver faster and with better quality, and that would be compromised.

[+] Although the stuff we knocked up at my client’s is not the kind of thing I could imagine they would ever try and turn into a product (IT is not their business for starters) the code is clearly their property. I also wouldn’t know where to start trying to get permission to release it. Then there is the whole Fred Brooks approach which is to just throw it away anyway and learn from your mistakes instead. And this is the approach I have decided to take here.

Monday, 28 November 2011

Null String Reference vs Empty String Value

One of the first things I found myself in two minds about when making the switch from C++ to C# was what to use to represent an empty string - a null reference or an empty string value (i.e. String.Empty).

C++ - Value Type Heaven

In C & C++ everything is a value type by default and you tend to avoid dynamically allocated memory as a matter of course, especially for primitive types like an int. This leads to the common technique of using a special value to represent NULL instead:-

int pos = -1;

Because of the way strings in C are represented (a pointer to an array of characters) you can use a NULL pointer here instead, which chalks one up for the null reference choice:-

int pos = -1;
const char* name = NULL;

But in C++ you have a proper value type for strings - std::string - so you don’t have to delve into the murky waters of memory management[*]. Internally it still uses dynamic memory management[+] to some degree and so taking the hit twice (i.e. const std::string*) just so you can continue to use a NULL pointer seems criminal when you can just use an empty string instead:-

int pos = -1;
std::string name("");

I guess that evens things up and makes the use of a special value consistent across all value types once again; just so long as you don’t need to distinguish between an empty string value and a “no” string value. But hey, that’s no different to the same problem with -1 being a valid integer value in the problem domain. The C++ standard uses this technique too (e.g. string::npos) so it can’t be all bad…

If you’re less concerned about performance, or special values are a problem, you can happily adopt the use of the Swiss-army knife that is shared_ptr<T>, or the slightly more discoverable boost::optional<T> to handle the null-ness consistently as an attribute of the type:-

boost::shared_ptr<int> pos;
boost::optional<std::string> name;

if (name)
{
  std::string value = *name;
  . . .

C# - Reference Type Heaven

At this point in my transition from C++ to C# I’m still firmly in the empty string camp. My introduction to reference-based garbage-collected languages, where the String type is implemented as a reference type but has value-type semantics, does nothing to make the question any easier to answer. Throw the idea of “boxing values” into the equation (in C# at least) and about the only thing you can say is that performance is probably not going to matter as much as you’re used to, so you should just let go. But you’ll want something a little less raw than this for primitive types:-

object pos = null;
string name = null;

if (pos != null)
{
  int value = (int)pos;
  . . .

If you’re using anything other than an archaic[$] version of C# then you have the generic type Nullable<T> at your disposal. This, with its syntactic sugar provided by the C# compiler, provides a very nice way of making all value types support null:-

int? pos;
string name;

if (pos.HasValue)
{
  int value = pos.Value;
  . . .

So that pretty much suggests we rejoin the “null reference” brigade. Surely that’s case closed, right?

Null References Are Evil

Passing null objects around is a nasty habit. It forces you to write null-checks everywhere and in nearly all cases the de-facto position of not passing nulls can be safely observed, with the runtime picking up any unexpected violations. If you’re into ASSERTs and Code Contracts you can annotate your code to fail faster so that the source of the erroneous input is highlighted at the interface boundary rather than as some ugly NullReferenceException later. In those cases where you feel forcing someone to handle a null reference is inappropriate, or even impossible, you can always adopt the Null Object pattern to keep their code simple.

In a sense an empty string is the embodiment of the Null Object pattern. The very name of the static string method IsNullOrEmpty() suggests that no one is quite sure what they should do and so they’ll just cover all bases instead. One of the first extension methods I wrote was IsEmpty() because I wanted to show my successor that I was nailing my colours to the mast and not “just allowing nulls because I implemented using a method that happens to accept either”.
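
For what it’s worth, the extension method amounts to little more than this sketch:-

public static class StringExtensions
{
  // Deliberately says nothing about null: unlike IsNullOrEmpty() the
  // caller is expected to have ruled a null reference out already.
  public static bool IsEmpty(this string value)
  {
    return (value.Length == 0);
  }
}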

Sorry, No Answers Here… Move Along

Sadly the only consistency I’ve managed to achieve so far is in and around the Data Access Layer where it seems sensible to map the notion of NULL in the database sense to a null reference. The Nullable<T> generic fits nicely in with that model too.

But in the rest of the codebase I find myself sticking to the empty string pattern. This is probably because most of the APIs I use that return string data also tend to return empty strings rather than null references. Is this just Defensive Programming in action? Do library vendors return empty strings in preference to null references because their clients have a habit of not checking properly? If so, perhaps my indecision is just making it harder for them to come to a definitive answer…

 

[*] Not that you would anyway when you have the scoped/shared_ptr<T> family of templates to do the heavy lifting and RAII runs through your veins.

[+] Short string optimisations notwithstanding.

[$] I know I should know better than to say that given there are people still knee deep in Visual C++ 6 (aka VC98)!

Wednesday, 23 November 2011

I Know the Cost but Not the Value

[I wrote the vast majority of this post “the morning after the night before”. I hope that 9 months later I’ve managed to finish it in a more coherent manner than I started it…]

Back in February I made a rare appearance at the eXtreme Tuesday Club and got to talk to some of the movers and shakers in the Agile world. Professionally speaking I’ve mostly been on the outside of the shop looking in through the window wondering what all the commotion is inside. By attending XTC I have been hoping to piece together what I’ve seen and read about the whole “movement” and how it works in practice. As I documented in an earlier post “Refactoring – Do You Tell Your Boss?” I’ve long been struggling with the notion of Technical Debt and how I can help it get sold correctly to those who get to decide when we draw on it and when it’s paid back.

That night, after probably a few too many Staropramen, I finally got to enter into a discussion with Chris Matts on the subject and what I can’t decide is whether we were actually in agreement or, more likely, that I didn’t manage to explain correctly what I believe my relationship is with regards to identifying Cost and Value; in short, I can identify the former (Cost), but not the latter (Value). So maybe my understanding of what cost and value are is wrong and that’s why I didn’t quite get what he was saying (although I can’t discount the effects of the alcohol either). This post is therefore an opportunity for me to set out what I (and, I’m sure, some of my colleagues) perceive to be our place in the pecking order and the way this seems to work. I guess I’m hoping this will either provide food for thought, a safe place for other misguided individuals, or a forum in which those in the know can educate the rest of us journeymen...

From the Trenches

The following example is based on a real problem, and at the time I tried to focus on what I perceived to be the value in the fixes so that the costs could be presented in a fair way and therefore an informed[*] choice would then be made.

The problem arose because the semantics of the data from an upstream system changed such that we were not processing as much data as we should. The problem was not immediately identified because the system was relatively new and so many upstream changes had been experienced that it wasn’t until the dust started to settle that the smaller issues were investigated fully.

The right thing to do would be to fix the root problem and lean on the test infrastructure to ensure no regressions occurred in the process. As always time was perceived to be a factor (along with a sprinkling of politics) and so a solution that involved virtually no existing code changes was also proposed (aka a workaround). From a functional point of view they both provide the same outcome, but the latter would clearly incur debt that would need to be repaid at some point in the future.

Cost vs Value

It’s conceivable that the latter could be pushed into production sooner because there is notionally less chance of another break occurring. In terms of development time there wasn’t much to choose between them and so the raw costs were pretty similar, but in my mind there was a significant difference in value.

Fixing the actual problem clearly has direct value to the business, but what is the difference in value between it being fixed tomorrow, whilst incurring a small amount of debt, and it being fixed in 3 days’ time with no additional debt? That is not something I can answer. I can explain that taking on the debt increases the risk of potential disruption to subsequent deliveries, but only they are in a position to quantify what a slippage in schedule would cost them.

Perhaps I’m expected to be an expert in both the technology and the business? Good luck with that. It feels to me that I provide more value to my customer by trying to excel at being a developer so that I can provide more accurate estimates, which are very tangible, than trying to understand the business too in the hope of aiding them to quantify the more woolly notion of value. But that’s just the age old argument of Generalists vs Specialists isn’t it? Or maybe that’s the point I’m missing - that the act of trying to quantify the value has value in itself? If so, am I still the right person to be involved in doing that?

I’m clearly just starting out on the agile journey and have so much more to read, mull over and discuss. This week saw XP Day which would have been the perfect opportunity to further my understanding but I guess I’ll have to settle for smaller bites at the eXtreme Tuesday Club instead - if I could only just remember to go!

Epilogue

The workaround cited in the example above is finally going to be removed some 10 months later because it is hopelessly incompatible with another change going in. Although I can’t think of a single production issue caused by the mental disconnect this workaround created, I do know of a few important test runs the business explicitly requested that were spoilt because the workaround was never correctly invoked and so the test results were useless. Each one of these runs took a day to set up and execute and so the costs to the development team have definitely been higher. I wonder how the business costs have balanced out?

 

[*] Just writing the word “informed” makes me smile. In the world of security there is the Dancing Pigs problem which highlights human nature and our desire for “shiny things”. Why should I expect my customer to ever choose “have it done well” over “have it done tomorrow” when there are Dancing Pigs constantly on offer?

Tuesday, 22 November 2011

Cookbook Style Programming

Changing a tyre on a car is reasonably simple. In fact if I wasn’t sure how to do it I could probably find a video on YouTube to help me work through it. Does that now make me a mechanic though? Clearly not. If I follow the right example I might even do it safely and have many happy hours of motoring whilst I get my tyre fixed. But, if I follow an overly simplistic tutorial I might not put the spare on properly and cause an imbalance, which, at best will cause unnecessary wear on the tyre and at worst an accident. Either way the time lapse between cause and effect could be considerable.

If you’re already thinking this is a rehash of my earlier post “The Dying Art of RTFM”, it’s not intended to be. In some respects it’s the middle ground between fumbling in the dark and conscious incompetence...

The Crowbar Effect

The StackOverflow web site has provided us with a means to search for answers to many reoccurring problems, and that’s A Good Thing. But is it also fostering a culture that believes the answer to every problem can be achieved just by stitching together bunches of answers? If the answer to your question always has a green tick and lots of votes then surely it’s backed by the gurus and so is AAA grade; what can possibly be wrong with advice like that?

while (!finished)
{
  var answer = Google_problem(); 
  answer.Paste();
}

At what point can you not take an answer at face value and instead need to delve into it to understand what you’re really getting yourself into?

For example, say I need to do some database type stuff and although I’ve dabbled a bit and have a basic understanding of tables and queries I don’t have any experience with an ORM (Object Relational Mapper). From what I keep reading an ORM (alongside an IoC container) is a must-have these days for any “enterprise” sized project and so I’ll choose “Super Duper ORM” because it’s free and has some good press.

So I plumb it in and start developing features backed by lots of automated unit & integration tests. Life is good. Then we hit UAT and data volumes start to grow and things start creaking here and there. A quick bit of Googling suggests we’re probably pulling way too much data over the wire and should be using Lazy Loading instead. So we turn it on. Life is good again. But is it? Really?

Are You Qualified to Make That Decision?

Programmers are problem solvers and therefore it’s natural for them to want to see a problem through to the end, hopefully being the one to fix it on the way. But fixing the problem at hand should be done in a considered manner that weighs up any trade-offs; lest the proverbial “free-lunch” rears its ugly head. I’m not suggesting that you have to analyse every possible angle because quite often you can’t, in which case one of the trade-offs may well be the very fact that you cannot know what they are; but at least that should be a conscious decision. Where possible there should also be some record of that decision, such as a comment in the code or configuration file, or maybe a task in the bug database to revisit the decision at a later date when more facts are available. If you hit a problem and spent any time investigating it you can be pretty sure that your successor will too; but at least you’ll have given them a head start[*].

One common source of irritation is time-outs. When a piece of code fails due to a timeout the natural answer seems to be to just bump the timeout up to some ridiculous value without regard to the consequences. If the code is cut-and-pasted from an example it may well have an INFINITE timeout which could make things really interesting further down the line. For some reason configuration changes do not appear to require the same degree of consideration as a code change and yet they can cause just as much head-scratching as you try to work out why one part of the system ended up dealing with an issue that should have surfaced elsewhere.

Lifting the Bonnet

I’ll freely admit that I’m the kind of person who likes to know what’s going on behind the scenes. If I’m going to buy into some 3rd party product I want to know exactly how much control I have over it because when, not if, a problem surfaces I’m going to need to fix it or work around it. And that is unlikely to be achievable without giving up something in return, even if it’s psychological[+] rather than financial.

However I accept that not everyone is like that and some prefer to learn only enough to get them through the current problem and onto the next one; junior programmers and part-time developers don’t know any better so they’re [partly] excused. In some environments where time-to-market is everything that attitude is probably even highly desirable, but that’s not the environment I’ve ended up working in.

It’s Just a Lack of Testing...

It’s true that the side-effects of many of the problems caused by this kind of mentality could probably be observed during system testing - but only so long as your system tests actively invoke some of the more exceptional behaviour and/or are able to generate the kinds of excessive loads you’re hoping to cope with. If you don’t have the kind of infrastructure available in DEV and/or UAT for serious performance testing, and let’s face it most bean counters would see it as the perfect excuse to eliminate “waste”, you need to put even more thought into what you’re doing - not less.

 

[*] To some people this is the very definition of job security. Personally I just try hard not to end up on The Daily WTF.

[+] No one likes working on a basket-case system where you don’t actually do any development because you spend your entire time doing support. I have a mental note of various systems my colleagues have assured me I should avoid working on at all costs...

Saturday, 19 November 2011

PowerShell & .Net - Building Systems as Toolkits

I first came across COM back in the mid ‘90s when it was more commonly known by the moniker OLE2[*]. I was doing this in C, which, along with the gazillion interfaces you had to implement for even a simple UI control, just made the whole thing hard. I had 3 attempts at reading Kraig Brockschmidt’s mighty tome Inside OLE 2 and even then I still completely missed the underlying simplicity of the Component Object Model itself! In fact it took me the better part of 10 years before I really started to appreciate what it was all about[+]. Not surprisingly that mostly came about with a change in jobs where it wasn’t C++ and native code across the board.

COM & VBScript

Working in a bigger team with varied jobs and skillsets taught me to appreciate the different roles languages and technologies play. The particular system I was working on made heavy use of COM internally, purely to allow the VB based front-end and C++ based back-ends to communicate. Sadly the architecture seemed upside down in many parts of the native code, and so instead of writing C++ that used STL algorithms against STL containers, with COM providing a layer on top for interop, you ended up seeing manual ‘for’ loops that iterated over COM collections of COM objects that just wrapped the underlying C++ type. COM had somehow managed to reach into the heart of the system.

Buried inside that architecture though was a bunch of small components just waiting to be given another life - a life where the programmer interacting with them isn’t necessarily one of the main development team but perhaps a tester, administrator or support analyst trying to diagnose a problem. You would think that automated testing would be a given in such a “component rich” environment, but unfortunately no. The apparent free-lunch that is the ability to turn COM into DCOM meant the external services were woven tightly into the client code - breaking these dependencies was going to be hard[$].

One example of where componentisation via COM would have been beneficial was when I produced a simple tool to read the compressed file of trades (it was in a custom format) and filter it based on various criteria, such as trade ID, counterparty, TOP 10, etc. The tool was written in C++, just like the underlying file API, and so the only team members that could work on it were effectively C++ developers even though the changes were nearly always about changing the filtering options; due to its role as a testing/support tool any change requests would be well down the priority list.

What we needed was a component that maintained the abstraction - a container of trades - but could be exposed, via COM, so that the consumption of the container contents could just as easily be scripted; such as with VBScript, which is a more familiar tool to technical non-development staff. By providing the non-developers with building blocks we could free up the developers to concentrate on the core functionality. Sadly, the additional cost of exposing that functionality via COM purely for non-production reasons is probably seen as too high, even if the indirect benefit may be an architecture that lends itself better to automated testing, which is a far more laudable cause.

PowerShell & .Net

If you swap C#/.Net for C++/COM and PowerShell for VBScript in the example above you find a far more compelling prospect. The fact that .Net underpins PowerShell means that it has full access to every abstraction we write in C#, F#, etc. On my current project this has caused me to question the motives for even providing a C# based .exe stub, because all it does is parse the command line and invoke a bootstrap method for the logging, configuration etc. All of this can be, and in fact has been, done in some of the tactical fixes that have been deployed.

The knock-on effects of automated testing right up from the unit, through integration to system level mean that you naturally write small loosely-coupled components that can be invoked by a test runner. You then compose these components into ever bigger ones, but the invoking stub, whether a test harness or production process, rarely changes because all it does is invoke factories and stitch together a bunch of service objects before calling the “logical” equivalent of main.
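
To make that concrete, the sort of stub I have in mind amounts to little more than the following (the types are invented for illustration) - exactly the kind of glue a few lines of PowerShell could provide instead:-

using System;

public interface IService
{
  void Run();
}

public class NightlyBatch : IService
{
  public void Run()
  {
    Console.WriteLine("doing the real, unit-tested work...");
  }
}

public static class Program
{
  // The stub does nothing but stitch the service objects together and
  // invoke the "logical" main; there is no behaviour here worth keeping
  // in compiled code.
  public static int Main(string[] args)
  {
    IService service = new NightlyBatch(); // composition root, illustrative only
    service.Run();
    return 0;
  }
}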

Systems as Toolkits

Another way of building a system, then, is not so much as a bunch of discrete processes and services but more as a toolkit from which you can stitch together the final behaviour in a more dynamic fashion. It is this new level of dynamism that excites me most, because much of the work in the kind of batch processing systems I’ve worked on recently has focused on the ETL (Extract/Transform/Load) and reporting parts of the picture.

The systems have various data stores that are based around the database and file-system that need to be abstracted behind a facade to protect the internals. This is much harder in the file-system case because it’s too easy to go round the outside and manipulate files and folders directly. By making it easier to invoke the facade you remove the objections around not using it and so retain more control on the implementation. I see a common trend of moving from the raw file-system to NoSQL style stores that would always require any ad-hoc workarounds to be cleaned up. This approach provides you with a route to refactoring your way out of any technical debt because you can just replace blobs of script code with shiny new unit-tested components that are then slotted in place.

On the system I’m currently working with the majority of issues seem to have related to problems caused by upstream data. I plan to say more about this issue in a separate post, but it strikes me that the place where you need the utmost flexibility is in the validation and handling of external data. My current project manager likens this to “corrective optics” a la the Hubble Space Telescope. In an ideal world these issues would never arise, or would at least come out in testing, but the corporate world of software development is far from ideal.

Maybe I’m the one wearing rose tinted spectacles though. Visually I’m seeing a dirty great pipeline, much like a shell command line with lots of greps, sorts, seds & awks. The core developers are focused on delivering a robust toolkit whilst the ancillary developers & support staff plug them together to meet the daily needs of the business. There is clearly a fine line between the two worlds and refactoring is an absolute must if the system is to maintain firm foundations so that the temptation to continually build on the new layers of sand can be avoided. Sadly this is where politics steps in and I step out.

s/PowerShell/IronXxx/g

Although I’ve discussed using PowerShell on the scripting language side the same applies to any of the dynamic .Net based languages such as IronPython and IronRuby. As I wrote back in “Where’s the PowerShell/Python/IYFSLH*?” I have reservations about using the latter for production code because there doesn’t appear to be the long-term commitment and critical mass of developers to give you confidence that it’s a good bet for the next 10 years. Even PowerShell is a relative newcomer on the corporate stage and probably has more penetration into the sysadmin space than the development community which still makes it a tricky call.

The one I’m really keeping my eye on though is F#. It’s another language whose blog I’ve followed since its early days and I even went to a BCS talk by its inventor Don Syme back in 2009. It provides some clear advantages over PowerShell, such as its async support, and now that Microsoft has put its weight behind F# and shipped it as part of the Visual Studio suite you feel it has staying power. Sadly its functional nature may keep it out of those hands we’re most interested in freeing.

I’ve already done various bits of refactoring on my current system to make it more amenable for use within PowerShell and I intend to investigate using the language to replace some of the system-level test consoles, which are nothing but glue anyway. What I suspect will be the driver for a move to a more hybrid model is the need to automate further system-level tests, particularly of the regression variety. The experience gained here can then feed back into the main system development cycle to act as living examples.

 

[*] In some literature it stood for something - Object Linking & Embedding, and in others it just was the letters OLE. What was the final outcome, acronym or word?

[+] Of course by then .Net had started to take over the world and [D]COM was in decline. This always happens. Just as I really start to get a grip on a technology and finally begin to understand it the rest of the world has moved on to bigger and better things…

[$] One of the final emails I wrote (although some would call it a diatribe) was about how the way COM had been used within the system was the biggest barrier to moving to a modern test driven development process. Unit testing was certainly going to be impossible until the business logic was separated from the interop code. Yes, you can automate tests involving COM, but why force your developers to be problem domain experts and COM experts? Especially when the pool of talent for this technology is shrinking fast.

Tuesday, 15 November 2011

Merging Visual Studio Setup Projects - At My Wix End

When the Windows Installer first appeared around a decade ago (or even longer?) there was very little tooling around. Microsoft did the usual thing and added some simple support to Visual Studio for it (the .vdproj project type); presumably keeping it simple for fear of raising the ire of another established bunch of software vendors - the InstallShield crowd. In the intervening years it appears to have gained a few extra features, but nothing outstanding. As a server-side guy who has very modest requirements I probably wouldn’t notice where the bells and whistles have been added anyway. All I know is that none of the issues I have come across since day one have ever gone away...

Detected Dependencies Only Acknowledges .DLLs

In a modern project where you have a few .exes and a whole raft of .dlls it’s useful that you only have to add the .exes and Visual Studio will find and add all those .dll dependencies for you. But who doesn’t ship the .pdb files as well*? This means half the files are added as dependencies and the other half I have to go and add manually. In fact I find it easier to just exclude all the detected dependencies and just manually add both the .dll and .pdb; at least then I can see them listed together as a pair in the UI and know I haven’t forgotten anything.

Debug & Release Builds

In the native world there has long been a tradition of having at least two build types - one with debug code in and one optimised for performance. The first is only shipped to testers or may be used by a customer to help diagnose a problem, whereas the latter is what you normally ship to customers on release. The debug build is not just the .exe, but also the entire chain of dependencies, as there may be debug/release specific link-time requirements. But the Setup Project doesn’t understand this unless you make it part of your solution and tie it right into your build. Even then any third party dependencies you manually add cause much gnashing of teeth as you try and make the “exclude” flag and any subsequently detected dependencies play nicely together with the build specific settings.

Merging .vdproj Files

But, by far my biggest gripe has always been how merge-unfriendly .vdproj files are. On the face of it, being a text file, you would expect it to be easy to merge changes after branching. That would be so if Visual Studio stopped re-generating the component GUIDs for no apparent reason, or kept the order of things in the file consistent. Even with “move block detection” enabled in WinMerge often all you can see is a sea of moved blocks. One common (and very understandable) mistake developers make is to open the project without the binaries built and add a new file, which can cause all the detected dependencies to go AWOL. None of this seems logical and yet time and time again I find myself manually merging the changes back to trunk because the check-in history is impenetrable. Thank goodness we put effort into our check-in comments.

WiX to the Rescue

WiX itself has been around for many years now and I’ve been keen to try it out and see if it allows me to solve the common problems listed above. Once again for expediency my current project started out with a VS Setup Project, but this time with a definite eye on trying out WiX the moment we got some breathing space or the merging just got too painful. That moment finally arrived and I’m glad we switched; I’m just gutted that I didn’t do the research earlier because it’s an absolute doddle to use! For server-side use where you’re just plonking a bunch of files into a folder and adding a few registry keys it almost couldn’t be easier:-

<?xml version="1.0"?>
<Wix xmlns="http://schemas.microsoft.com/wix/2006/wi">
  <Product Id="12345678-1234-. . ."
           Name="vdproj2wix"
           Language="1033"
           Version="1.0.0"
           Manufacturer="Chris Oldwood"
           UpgradeCode="87654321-4321-. . .">

    <Package Compressed="yes"/>

    <Media Id="1" Cabinet="product.cab" EmbedCab="yes"/>

    <Directory Name="SourceDir" Id="TARGETDIR">
      <Directory Name="ProgramFilesFolder" Id="ProgramFilesFolder">
        <Directory Name="Chris Oldwood" Id="_1">
          <Directory Name="vdproj2wix" Id="_2">
            <Component Id="_1" Guid="12341234-. . .">
              <File Source="vdproj2wix.ps1"/>
              <File Source="vdproj2wix.html"/>
            </Component>
          </Directory>
        </Directory>
      </Directory>
    </Directory>

    <Feature Id="_1" Level="1">
      <ComponentRef Id="_1"/>
    </Feature>

  </Product>
</Wix>

The format of the .wxs file is beautifully succinct, easy to diff and merge, and yet it supports many advanced features such as #includes and variables (for injecting build numbers and controlling build types). I’ve only been using it recently and so can’t say what versions 1 & 2 were like but I can’t believe it’s changed that radically. Either way I reckon they’ve got it right now.

Turning a .wxs file into an .msi couldn’t be simpler either which also makes it trivial to integrate into your build process:-

candle vdproj2wix.wxs
if errorlevel 1 exit /b 1

light vdproj2wix.wixobj
if errorlevel 1 exit /b 1

My only gripe (and it is very minor) is with the tool naming. Yes it’s called WiX and so calling the (c)ompiler Candle and the (l)inker Light is cute the first few times but now the add-ons feel the need to carry on the joke which just makes you go WTF? instead.

So does it fix my earlier complaints? Well, so far, yes. I’m doing no more manual shenanigans than I used to and I have something that diffs & merges very nicely. It’s also trivial to inject the build number compared with the ugly VBScript hack I’ve used in the past to modify the actual .msi because the VS way only supports a 3-part version number.

Converting Existing .vdproj Files to .wxs Files

Although as I said earlier the requirements of my current project were very modest, I decided to see if I could write a simple script to transform the essence (i.e. the GUIDs & file list) of our .vdproj files into a boiler-plate .wxs file. Not surprisingly this turned out to be pretty simple and so I thought I would put together a clean-room version on my web site for others to use in future - vdproj2wix. Naturally I learnt a lot more about the .wxs file format in the process** and it also gave me another excuse to learn more about PowerShell.

If you’re now thinking that this entire post was just an excuse for a shameless plug of a tiny inconsequential script you’d be right - sort of. I also got to vent some long standing anger too which is a bonus.

So long .vdproj file, I can’t say I’m going to miss you...

* OK, in a .Net project there is less need to ship the .pdbs but in a native application they (or some other debug equivalent such as .dbg files) are pretty essential if you ever need to deal with a production issue.

** Curiously the examples in the tutorial that is linked to from the WiX web site have <File> elements with many redundant attributes. You actually only need the Source attribute as a bare minimum.

Sunday, 9 October 2011

C#/SQL Integration Testing With NUnit

Just over 18 months ago I wrote a post about Integration Testing using NUnit, and at the end I stated that I would cover database integration testing in a separate post. Well here it is. Finally!

The first thing to clear up is what scope of testing we’re covering here because it seems that many teams test their database entirely through their Data Access Layer. Now, I’m not saying that’s wrong, but in “You Write Your SQL Unit Tests in SQL?” I describe the reasons why I believe that it’s not the only approach and that in fact there is great value in doing it separately. So, in this post I’m starting out with a fundamental assumption that you’ve already unit tested both your SQL and C# code and that what we’re doing here is building on that level of trust.

Testing what then?

If we’re not testing the behaviour of our SQL or C# code then you may rightly ask what exactly are we testing? Well, as the name Integration Test implies we’re focusing on the interaction between our C# and SQL code. What sits between these two worlds is the client-side database API (which itself is probably layered) and the database server - a lot of code. Not only that but these two worlds have different ideas of how numbers and strings are represented and errors handled.

There are two kinds of common errors that can only come out at this level of testing (and higher) - schema changes and data type mismatches. The former is essentially a breaking change in the database’s public interface, such as the renaming or re-ordering of parameters and result set columns; the breakage can be quite subtle sometimes[*]. A draconian change policy would try to ensure this never happens, but then that also closes the door to refactoring the database schema too. The second common error revolves around data representation, with floating-point numbers being the prime example as databases often have a far more flexible way of defining the scale and precision of non-integral type numbers. Unexpected trailing whitespace caused by the use of a char(n) instead of varchar(n) type can cause surprises too.

These effects are made all the more apparent when you have separate SQL and C# developers because their timescales may differ, along with their code commits, and that can cause feature-level integration testing to be passed up in favour of going straight to system-level testing.

Schema only, no data

It is a prerequisite that before you can run your C# tests against your database you ensure that it is in a known empty state. This means that you cannot just restore a production database, unless you also empty it of data. The alternative is to build it from scratch before running the entire test suite. And that is what we do.

However, because this can take some time we optimise the process. The Continuous Integration build first builds the database and runs the set of SQL tests, then it builds the C# code and runs those tests too. Finally it re-uses the SQL unit test database to run the integration tests, and because this is one single build & testing sequence the SQL/C# source code is guaranteed to be consistent (assuming the check-ins are themselves atomic and consistent at feature-level).

For our day-to-day development we optimise the process even further by creating a one-off backup of the current production schema and then restoring that each time we want to run the integration tests (after having applied any new patches). This restore & patch approach takes just minutes compared to the length of time it takes to create the database from scratch.

The test framework

The tests themselves are written almost exactly as you would write any other in NUnit, but we have our own helpers which are exposed via a base class to work around some of the problems inherent in using a unit testing framework to do other sorts of testing. For example the biggest problem is that you have the large dependency that is The Database to somehow configure for use within the test, and for this we use an environment variable. I generally dislike any sort of configuration like this[+], but it’s a necessary evil, particularly when you can’t just assume that every developer (and the build machine) will be using a local copy of SQL Server Express. By default we all have a PERSONAL_DATABASE variable configured that we use day-to-day and which can be discovered even if testing through Visual Studio/Resharper. If the need arises a batch file can always be used to redirect the tests to another instance with little fuss.

The skeleton of a test looks something like this:-

[TestFixture, Category("Database")]
public class CustomerDatabaseTests : DatabaseTestBase
{
  [Test]
  public void Add_ShouldInsertCustomer()
  {
    using (var connection = Connection)
    {
      . . .
    }
  }
}

As you can see we use a separate category for the database tests so that you only run them when you know your “personal database” is correctly built. The fixture derives from the DatabaseTestBase class as that is how all the helpers are exposed, such as the Connection property that reads the connection string stored in the PERSONAL_DATABASE variable and serves up a freshly opened connection for it.

One design constraint this implies is that we use Parameterise From Above, rather than ugly Singletons, to get our connection passed to the code under test. This is a good design practice anyway and we exploit it further by ensuring that we control the outer transaction that surrounds it. This is made possible because we also have our own thin wrapper around the underlying .Net database classes. Controlling the real transaction means we can undo all changes made in the test by simply rolling it back no matter which way the test ends (e.g. by intercepting Commit()).
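
For illustration, a minimal sketch of the DatabaseTestBase class might look something like this. The PERSONAL_DATABASE variable is the one described above, but the use of SqlConnection is purely an assumption for the sketch and the transaction-intercepting wrapper is elided entirely:-

using System;
using System.Data;
using System.Data.SqlClient;

// A minimal sketch only: the real base class also wraps the connection so
// that the outer transaction can be rolled back whatever the test does.
public abstract class DatabaseTestBase
{
  protected static IDbConnection Connection
  {
    get
    {
      var connectionString =
            Environment.GetEnvironmentVariable("PERSONAL_DATABASE");

      var connection = new SqlConnection(connectionString);
      connection.Open();

      return connection;
    }
  }
}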

Sadly the need to manually scope the connection with a using() requires a level of discipline when writing the test as otherwise the effects can leak across tests. The alternative, which has yet to be implemented by us, is to use that old stalwart Execute Around Method so that we can add as many smarts as we need in the helper method to ensure the test is as isolated as possible. This would make the test look like this instead:-

[Test]
public void Add_ShouldInsertCustomer()
{
  Execute
  (
    (connection) =>
    {
      . . .
    }
  );
}
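
The Execute() helper itself would need to be little more than the following outline. This is only a sketch - in practice the thin wrapper mentioned earlier would also associate the transaction with any commands run against the connection:-

// Outline of the Execute Around Method helper on the test base class. The
// connection and transaction are scoped here so that a forgotten using()
// in a test can no longer leak changes into the next one.
protected static void Execute(Action<IDbConnection> test)
{
  using (var connection = Connection)
  using (var transaction = connection.BeginTransaction())
  {
    try
    {
      test(connection);
    }
    finally
    {
      transaction.Rollback();
    }
  }
}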

Arranging

The first part of any test “dance” involves the arrangement of the dependent data and objects - which is hopefully minimal. With databases though it is common practice to use referential integrity to keep the data sane, and so that may mean you have to add some static data to avoid falling foul of it; it’s either that or drop the constraints which are there exactly to help find bugs.

NUnit supports both fixture and test level SetUp() methods to help keep them readable by factoring out common code. As a general rule we have used the fixture level setup for static (or reference) data that is orthogonal to the test and then used the other method or the test itself for data that is directly relevant to it. You often find yourself inserting the same static data time and again and so you have a choice of whether to create a C# based helper class or create some helper stored procedures in a separate schema so they can be shared across both the SQL and C# worlds, e.g.

[TestFixtureSetUp]
public void FixtureSetUp()
{
  using (var connection = Connection) 
  { 
    connection.Execute("exec test.InsertCustomerAndDetails 1, 'Bob', 'London', . . .;");
  }
}
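
The C# flavoured alternative is just a helper class that hides the same insert. The sketch below is purely hypothetical - the table and column names are invented and the detail rows are elided just as they are in the stored procedure above:-

using System.Data;

// Hypothetical C# helper for the same reference data; in reality it would
// also populate the related detail tables.
public static class TestData
{
  public static void InsertCustomerAndDetails(IDbConnection connection,
                                              int id, string name, string city)
  {
    using (var command = connection.CreateCommand())
    {
      command.CommandText = "INSERT INTO Customer(Id, Name, City) "
                          + "VALUES(@id, @name, @city)";

      AddParameter(command, "@id", id);
      AddParameter(command, "@name", name);
      AddParameter(command, "@city", city);

      command.ExecuteNonQuery();
    }
  }

  private static void AddParameter(IDbCommand command, string name, object value)
  {
    var parameter = command.CreateParameter();
    parameter.ParameterName = name;
    parameter.Value = value;
    command.Parameters.Add(parameter);
  }
}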

Acting

The second part of the dance - acting - was pretty much covered in the overview of the framework as you just need to get your test-controlled connection into the code under test.

Asserting

The final step is to verify the outcome by writing a bunch of asserts. If you’re just reading data then you can use the same style of setup shown above and then use the normal features of NUnit that you’d use to compare in-memory based sequences. But if you’re writing data then it’s a slightly more complicated affair and this is another area where a bunch of helpers are required. We still use the basic Assert mechanism, but invoke custom methods to query aspects of the data we expect to have written, e.g.

Assert.That(RowCount("Customer"), Is.EqualTo(1));
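
These query helpers (RowCount() here, and the QueryResult() used a little further down) are nothing clever. A rough sketch follows; note that the real ones execute within the test’s own connection and transaction, whereas a fresh connection is shown here purely to keep the fragment self-contained:-

// Rough sketch of the query helpers exposed by the test base class. The
// naive string formatting is only tolerable because this is trusted test code.
protected static int RowCount(string table)
{
  return (int)QueryResult("SELECT count(*) FROM " + table);
}

protected static object QueryResult(string query)
{
  using (var connection = Connection)
  using (var command = connection.CreateCommand())
  {
    command.CommandText = query;

    return command.ExecuteScalar();
  }
}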

Depending on the system you’re building you may be able to round-trip the data and get two tests for the price of one, as long as any caching is disabled between the write/read calls:-

{
  var expected = new Customer(1, . . .);

  dataMapper.InsertCustomer(expected);
  var actual  = dataMapper.FindCustomer(expected.Id);

  Assert.That(actual.Name, Is.EqualTo(expected.Name));
}

However testing more than one thing at once is generally frowned upon, and rightly so because you don’t know whether the read or the write action failed. But you can also end up duplicating a lot of SQL in your test code if you don’t leverage your own API and that creates a test maintenance burden instead. If you’re purely writing data then you may have little choice but to write some form of query:-

{
  . . .
  string query = String.Format("SELECT count(*) FROM Customer " +
                               "WHERE Id={0} AND Name='{1}'",
                               expected.Id, expected.Name);

  Assert.That(RowCount("Customer"), Is.EqualTo(1));
  Assert.That(QueryResult(query), Is.EqualTo(1));
}

The NUnit constraint model allows you to build your own more fluent style of interface so that you can avoid hand cranking SQL if you prefer:-

Assert.That(Table(“Customer”).HasRow.Where(“Id”).Equals(expected.Id).And(“Name”).Equals(expected.Name).Go());
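
Under the hood that kind of fluent helper doesn’t need to be anything grand either. What follows is purely an illustrative sketch - it reuses the QueryResult() helper sketched earlier and assumes a hypothetical Table() factory method on the base class:-

using System;
using System.Collections.Generic;
using System.Text;

// Purely illustrative sketch of the fluent row-matching helper; Go() builds
// a count query from the collected criteria and returns true if exactly one
// row matches, so the whole chain can be passed straight to Assert.That().
public class TableExpectation
{
  private readonly string m_table;
  private readonly Func<string, object> m_runQuery;
  private readonly List<string> m_columns = new List<string>();
  private readonly List<object> m_values = new List<object>();

  public TableExpectation(string table, Func<string, object> runQuery)
  {
    m_table = table;
    m_runQuery = runQuery;
  }

  public TableExpectation HasRow
  {
    get { return this; }
  }

  public TableExpectation Where(string column)
  {
    m_columns.Add(column);
    return this;
  }

  public TableExpectation And(string column)
  {
    return Where(column);
  }

  // 'new' because we are deliberately hiding object.Equals() so that the
  // chain reads naturally.
  public new TableExpectation Equals(object value)
  {
    m_values.Add(value);
    return this;
  }

  public bool Go()
  {
    var criteria = new StringBuilder();

    for (var i = 0; i != m_columns.Count; ++i)
    {
      if (i != 0)
        criteria.Append(" AND ");

      // Naive quoting - again only tolerable in trusted test code.
      criteria.AppendFormat("{0}='{1}'", m_columns[i], m_values[i]);
    }

    var query = String.Format("SELECT count(*) FROM {0} WHERE {1}",
                              m_table, criteria);

    return (int)m_runQuery(query) == 1;
  }
}

// Hypothetical factory method on the test base class.
protected TableExpectation Table(string name)
{
  return new TableExpectation(name, QueryResult);
}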

One of the benefits of reusing the SQL unit test database is that you’ll be DBO, and as such you’ll be able to exploit your god-like status to get access to the internals and write this sort of test, even if you’ve carefully constructed a tight interface to the database. It may feel painful writing out such long-winded queries but if you’re trying to ensure you maintain high fidelity of your data through all the layers, is there any alternative?

 

[*] The perfect example of a subtle breaking change comes with SQL Server and OLE-DB. In SQL Server any procedure parameter can have a default value, but with OLE-DB defaulted parameters have[**] to be at the end of the argument list. A SQL developer could quite rightly add a new parameter anywhere in the argument list and provide a default value to ensure the interface appears to remain unchanged, but if it’s not at the end an OLE-DB based query would then fail.

[**] This may have been sorted now, but it was definitely still the case a few years ago.

[+] A new developer should be able to just install the version control client software, pull the source code (and preferably tools) locally and be able to build the software and run the tests out-of-the-box.

Wednesday, 5 October 2011

Unit Testing File-System Dependent Code

Way back last year, before being distracted by my impending ACCU conference talk, I wrote a post about integration testing using NUnit. At the time I was still in two minds about whether or not it was worth the effort trying to mock the file-system API, especially given that you often have some extra layer of code between you and the file-system to actually read and parse the file, e.g. an XML reader. The alternatives seem to be either to focus on writing integration tests that actually do touch the file-system (which is reasonably quick and reliable as dependencies go) or to inject more abstractions to create other seams through which to mock, thereby allowing you to write unit tests that get you close enough but not all the way down to the bottom.

Of course if you’re creating some sort of data persistence component, such as the aforementioned XML reader/writer, then you probably have a vested interest in mocking to the max as you will be directly accessing the file-system API and so there would be a good ROI in doing so. What I’m looking at here is the code that lightly touches the file-system API to provide higher-level behaviour around which files to read/write or recovers from known common error scenarios.

Impossible or hard to write test cases

The main incentive I have found for making the effort of mocking the file-system API is in writing tests for cases that are either impossible or very hard to write as automated integration/system tests. One classic example is running out of disk space - filling your disk drive in the SetUp() helper method is just not a realistic proposition. Using a very small RAM disk may be a more plausible alternative, but what you’re really likely to want to test is that you are catching an out-of-disk-space exception and then performing some contingent action. The same can apply to “access denied”[*] type errors, and in both cases you should be able to get away with simulating the error by throwing when the code under test tries to open the file for reading/writing rather than when it actually tries to pull/push bytes from/to the file (this assumes you’re doing simple synchronous I/O).

The reason this makes life easier is that the file Open() method can be a static method and that saves you having to mock the actual File object. It was whilst discussing this kind of mocking with my new team-mate Tim Barrass that we made some of the existing API mocks I had written much simpler. Whereas I had gone for the classic facade, interface and factory based implementation without thinking about it, Tim pointed out that we could just implement the facade with a bunch of delegates that default to the real implementation[+]:-

using System;

namespace My.IO
{

public class File
{
  public static bool Exists(string path)
  {
    return Impl.Exists(path);
  }

  public static File Open(string path, . . .)
  {
    return Impl.Open(path, . . .);
  }

  . . .

  public static class Impl
  {
    public static Func<string, bool> Exists = System.IO.File.Exists;
    public static Func<string, . . ., File> Open = System.IO.File.Open;
    . . .
  }
}

}

The default implementation just forwards the call to the real API, whereas a test can replace the implementation as it wishes, e.g. [#]

{
  File.Impl.Exists = (path) =>
  {
    return (path == @"C:\Temp\Test.txt");
  };
}

{
  File.Impl.Open = (path, . . .) =>
  {
    throw new UnauthorizedAccessException();
  };
}

This is a pretty low cost solution to build and may well suffice if you only have this kind of restricted usage. You can easily add a simple File mock by using memory based streams if you just need to simulate simple text or binary files, but after that you’re getting into more specialised API territory.
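
For example, assuming the facade also exposed an OpenRead() style delegate returning a Stream (it doesn’t in the snippet above - this is purely hypothetical), a test could serve up the “file” contents from memory:-

{
  // Hypothetical OpenRead delegate on the facade; the test supplies the
  // file contents from an in-memory stream instead of touching the disk.
  File.Impl.OpenRead = (path) =>
  {
    var contents = Encoding.UTF8.GetBytes("some test data");

    return new MemoryStream(contents);
  };
}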

Replacing the file-system with strings

So what about the case where you are using a 3rd party component to provide simple serialization duties? I’m not talking about large complex data graphs here like a word document, but the simpler text formats like .ini files, .csv files or the <appSettings> section of .config files. If you’ve decided to leverage someone else’s work instead of writing your own parser it’s better if the parser exposes its behaviour through interfaces, but not all do. This is especially true in C++ where there are no formal interfaces as such and concrete types or templates are the norm.

However many text file parsers also support the ability to parse data stored as an in-memory string. You can exploit this in your testing by introducing a static facade (like that above) that encapsulates the code used to invoke the parser so that it can be redirected to load an “in-memory” file instead. This allows you to avoid the performance and dependency costs of touching the actual file-system whilst remaining in full control of the test.

using System;
using System.Xml;

namespace My.IO
{

public class XmlDocumentLoader
{
  public static XmlDocument Load(string path)
  {
    return Impl.Load(path);
  }

  public static XmlDocument LoadFromFile(string path)
  {
    // Load document via the file-system.
    . . .
  }

  public static XmlDocument LoadFromBuffer(string document)
  {
    // Load document from an in-memory buffer.
    . . .
  }

  public static class Impl
  {
    public static Func<string, XmlDocument> Load = LoadFromFile;
  }
}

}

... and here is an example test:-

{
  XmlDocumentLoader.Impl.Load = (path) =>
  {
    string testDocument = "<config>. . .";

    return XmlDocumentLoader.LoadFromBuffer(testDocument);
  };
}

Strictly speaking this fails Kevlin’s definition of a unit test because of the “boundary of trust” that we have crossed (into the parser), but we do control the test input and we should be able to rely on a parser giving consistent performance and results for a consistent small in-memory input and so we’re pretty close. Fundamentally it’s deterministic and isolated and most importantly of all it’s automatable.

With a more complex component like an XML parser it may take a fair amount of work to mock even the tiny subset of features you actually use; but that in itself may be a design smell.

Reinventing the wheel

The use of static facades is often frowned upon exactly because it isn’t possible to mock them with the non-industrial strength mocking frameworks. I’m a little surprised that the mocking frameworks focus all their attention on the mechanics of providing automated mocks of existing interfaces rather than providing some additional common facades that can be used to simplify mocking in those notoriously hard to reach areas, such as the file-system and process spawning. Perhaps these ideas just haven’t got any legs or I’m not looking hard enough. Or maybe we’re all just waiting for someone else to do it...

 

[*] Depending on the context an access denied could be an indication of a systemic failure that should cause alarm bells to go off or it could just be a transient error because you’re enumerating a file-system that is outside your control.

[+] As he was refactoring the lambda used for the initial implementation the notion of “method groups” suddenly came into focus for me. I’d seen the term and thought I roughly knew what it was about, but I still felt smug when I suggested it was a method group a split second before Resharper suggested the same. Another +1 for Resharper, this time as a teaching aid.

[#] I doubt you’d ever really do a case-sensitive path name comparison, but hopefully you get the point.

Monday, 3 October 2011

Your Task Bar Can Hold More Than 7 Windows Open, but Can Your Brain?

Here is something that never fails to amuse me when watching other users[#] - the number of windows open on their desktop. The task bar is an interesting idea that has somehow managed to stand the test of time. Personally I miss the old 16-bit Windows desktop with all those neat little apps that had animated minimised icons like Coffee Mug, Cigarette, Bit Recycler and Tiny Elvis[+]. Exactly what metaphor is the task bar aiming at? Whatever it is, it isn’t working because users are still moaning about the fact that they can’t see which icon is the one they’re looking for. And grouping similar icons is just a band aid that was only needed in the first place to overcome the ridiculous one-window-per-web-page model of everyone’s most hated web browser - IE6.

Here’s the thing though. You do know that you can close those windows, don’t you? You may have heard of The Window Tax, but that was an historical event and anyway it was based on the number of windows you had, not how many times you opened them. Making the task bar bigger or stretching it across multiple desktops still doesn’t make it any easier because the fundamental limitation is in your short term memory, not the size of your desktop or how many monitors you can wire up. A long time ago (but in the same galaxy) I went to university where I was taught that the brain can only store 7 “chunks” of short term information (give or take a couple of extra slots) and so you can open as many windows as you like, but you’ll never remember why you opened them all and that’s why the task bar looks cluttered and confusing.

My wife complains that my short term memory is more limited than that of a goldfish, but that’s not true. It’s just that it gets flushed whenever it encounters shiny things, and unfortunately the world is full of shiny things. Therefore I have a policy of closing windows the moment I believe I don’t need them anymore. This also has the jolly useful side-effect of meaning I don’t fall foul of those niggling reliability problems in applications caused by long-term use, and I’ve never had the pleasure of (unintentionally) exhausting the Windows desktop heap.

I know these days everyone’s big on recycling and so shouldn’t I be leaving a few spare Explorer and IE windows around rather than expending fresh carbon on firing up new instances whenever the need arises? I don’t think so because most mainstream applications start up pretty quickly these days[~] and companies like Microsoft spend more effort on improving the warm start-up rather than the cold start-up time. Also by judicious use of favourites, network places, desktop icons, etc. I can navigate to the 80% of places I use most often within a few clicks which probably balances out the time lost “tool tipping” all the task bar icons looking for an existing window that I can reuse...

...Oh, wait, so that’s what the Search window in the Start menu is for - finding things on your task bar!

 

[#] And by users I mostly mean other developers. And my wife. And kids.

[+] I wonder what Toggle Booleans are up to these days?

[~] The one exception I have recently come across is the “Issue Management System” that my current client uses. This literally takes minutes to start up and is a truly painful tool to use, not least because every issue I’ve tried raising is met with an error message box that fills the screen and contains more SQL than our entire database!