Saturday 12 February 2011

Friends Don’t Let Friends Use Varargs

Whilst writing my previous post about being clever with strings and varargs functions I was reminded of another time that I had a run-in with them. Luckily I wasn’t the protagonist this time but a colleague instead. He decided that he wanted to make it easier to add an arbitrary set of strings to a particular class via the ctor. In modern day C++ you might do it via an iterator pair like this:-

{
  const char* words[] = { “cat”, “mat”, “dog” }

  Thingy thingy(words, words + ARRAYSIZE(words));
  . . .
}

However this was many years ago and the project was MFC based so the use of STL idioms were very thin on the ground[$].

Another less obvious choice might be to define a vector, add them to that and define a ctor that takes a vector of strings. But nope, my colleague decided that he could avoid the ugly array definition and performance hit of the vector approach by declaring a varargs ctor like so:-

class Thingy
{
public:
  Thingy(const char* word1, ...);
};

This meant that you could provide an arbitrary parameter list of strings and then terminate it with a NULL parameter:-

{
  Thingy things(“cat”, “mat”, “dog”, NULL);
  . . .
}

Naturally I suggested you don’t mess with varargs and that this was an accident waiting to happen[#]. I did agree that it was not an uncommon technique to define an array of items and use a NULL terminator at the end and that what he was effectively doing was this:-

{
  const char* words[] = { “cat”, “mat”, “dog”, NULL }

  Thingy thingy(words);
  . . .
}

At the time we had a definite need to support up to three arguments which were actually all CStrings. I’m sure I suggested writing three ctor overloads that then call a common internal helper method instead for now:-

class Thingy
{
public:
  Thingy(CString arg1);
  Thingy(CString arg1, CString arg2);
  Thingy(const CString& arg1, . . . arg2, . . . arg3);
};

But no, that was far too much work, his way was simpler and, after all, we were all experienced programmers so we don’t make those kind of silly mistakes…

Sadly I can’t remember when the first bug showed up or why it remained hidden for so long[+], but one did suddenly appear and guess what was missing? Yup, the NULL terminator. Of course now I had to check for any other occurrences of malformed ctor calls and I soon discovered one more. Clearly someone was not aware of this need to add a terminator so I did a little Software Archaeology and rooted around in our VCS to find out who needed a little educating… But of course, it was my esteemed colleague, the very same person who suggested that we wouldn’t make that kind of mistake!

So remember kids – Just Say No to Vargs Functions. Unless of course you’re doing Template Meta-Programming in which case you’ll be [ab]using them for a different reason.

 

[$] Due to it’s age MFC has its own containers and single iterator type and doesn’t adhere to the STL idioms that all all know and love now. Consequently if you spent your years knee deep in MFC code you could very easily to be unaware of them.

[#] Of course this was many years before I became aware of my own misfortunes with varargs function and anyway I’d read all the books by Scott Meyers, Herb Sutter, et al so that made me the fountain of all knowledge, right?

[+] It just goes to show that even with debugging techniques like the compiler using guard blocks and filling uninitialized stack space with a non-zero pattern it’s still possible for the stack to end up correctly formed by luck and not by judgement. That’s especially true when the stack variable is a ‘bool’.

Wednesday 9 February 2011

A Not-So-Clever-Now String Implementation

Back around 1995 when I first started using C++ commercially[+] I picked up an excellent book by the late Paul DiLascia titled “Windows++”. It was about writing Windows applications in C++ and the book developed a framework along similar lines to the then fledgling MFC. I decided that writing my own C++ framework would be a great way to learn C++ and I (somewhat foolishly) decided that I would use a similar code style to MFC so that I could easily reuse any of my framework code in my day job (after making it production ready of course).

Reinventing the Wheel

The heart of any C++ class library back then was the String class. C++ was still evolving and my work involved using MSVC 1.52c (aka Visual C++ 1.5) which had no template[^] support so naturally I created my own ‘classic’ class that managed an ANSI char buffer:-

class CString
{
private: 
  int   bufsize;  
  char* buffer;      
  . . .
};

One of the problems with this class was that it was too easy to screw up[*] when using the printf() family of functions because it had a different memory layout to a plain char*, so if you forgot to invoke the moral equivalent of c_str() you could end up crashing the process because the bufsize member would be interpreted as a string pointer:-


  CString str = “chris”; 
  printf(“hello %s”, str); // BOOM!
}

Swapping the two members around would probably work for the single-string case above, but not for N arguments as the trailing bufsize member would become the next string:-


  CString first = “chris”, last = “oldwood”; 
  printf(“hello %s %s”, first, last); // Still BOOM!
}

Now, I felt that giving up the beauty of the printf() family was too much; I had just spent the last few years learning C and so I had the various format specifiers committed to memory. Windows also had its own variation, wsprintf(), that supported a few other useful extensions such as near/far pointers and the ANSI/Unicode build dependent variants of “%c” and “%s”.

Tempted by the Dark Side

One day whilst debugging, I stepped into some code that passed a MFC string directly into a printf() call and realised that it hadn’t gone bang. As I stepped into the printf() call I realised that the reason it had worked was because an MFC CString object had the same stack footprint as a raw char* pointer. When an object is passed into a varargs style function the object is passed by value, which in the case of a CString is effectively a single pointer. Also because the class only has a pointer based member it has no extra padding or alignment requirements – at least not on the x86 targets I was interested in.

I thought this was rather clever and set about changing my implementation so that it had similar behaviour. It hadn’t registered at first but I later realised the similarity with the implementation of the COM BSTR type which also stores the string length in an area preceding the string contents. And so I switched my implementation to this:-

class CString
{
private: 
  class CStringBuffer 
  {  
      int   bufsize; 
      char buffer[0]; 
  }; 

  char* string;  // Really CStringBuffer.buffer 
  . . .
};

So now I allocate a single chunk of memory for the header (CStringBuffer) + contents (N * sizeof(char)), but instead of storing a pointer to the CStringBuffer header as a member, I slide along the memory block and store a pointer to the first character of the string, i.e. &CStringBuffer.buffer[0]. I used an interesting Microsoft extension- the zero element array. This beast used to be quite common in the Windows headers and is a useful way to create a placeholder for a variable length array that follows a structure in a contiguous piece of memory. The alternative is the standards compliant “char buffer[1]”, but then that makes your sizeof() arithmetic a little more complicated as you have padding to account for along with the extra byte in the array. Of course to access the header structure without littering the code with casts I wrapped the necessaries in an inline method:-

CStringBuffer* CString::GetBuffer()

  return ((CStringBuffer*)string)-1;
}

All very ugly and switching from a C-style cast to a proper reinterpret_cast<> doesn’t make it any less unpleasant. Still, I happily lived in ignorant bliss for well over a decade until I decided to make a conscious effort to drop my CString class in favour of the standard string type std::string – at least for new libraries that is. The conduit was reading Matthew Wilson’s Imperfect C++ and coming across his STLsoft libraries. Suddenly I found a new impetus to deal with the impedance mismatch between MFC/ATL and Standard C++ (something I believe his new book may well be tackling).

A New Hope

The switchover went swimmingly well with my new Core and XML libraries showing the way forward whilst the old WCL, NCL and MDBL libraries became the epitome of the word ‘'legacy’. Where possible I added extra overloads to allow std:string’s to be used naturally but by-and-large the worlds were kept apart and the cost of the occasional implicit[#] conversion from a CString to a std::string was left for the profiler to find, should it ever show up.

Then I decided that where possible all new code, even in my legacy libraries, should adopt the proper string type and that I would slowly migrate code across. So I started making deeper and deeper refactorings (backed by my unit tests – where they existed) and still things looked good. Visual C++ 10 (aka VS2010) arrived on the scene and so a port was required to ensure my codebase was compatible and much to my surprise a unit test in my ANSI to Unicode conversion helper functions choked on an ASSERT inside the STL. The code was wrong, no doubt about it, but why hadn’t I see it before as I usually use STLport to keep me honest? I did a build with STLport 5.2 and it also whinged! So I fixed my iterator dereferencing bug and ran the entire suite to make sure I hadn’t entered into the murky waters of Undefined Behaviour anywhere else by accident and again it when BANG! One of my COM library tests was now failing! Running the test under the debugger revealed a more serious problem as the MS debug runtime started complaining about other matters…

Undefined Behaviour Strikes Back

It took a couple of debugging attempts but then I winced as I spotted my error. I had switched over a couple of helper functions from returning a CString to a std::string and the result of that function was in turn being passed to a varargs logging function like so:-

std::string SomeClass::toString()

  return . . .;
}
. . .
void Elsewhere::DoSomething()

  LOG(“Blah, blah, blah %s”, something.toString()); 
  . . .
}

Yup, there was no call to c_str()!

Then the true cost of my foolish design choice started to dawn on me as I realised that there was no obvious way to catch the problem at compile time because varargs functions are effectively type-less once you extend into the ‘variable’ part of the parameter list. Yes, it might be possible to do something with templates and macros but that would be non-trivial. Add in the fact that I have used varargs style functions to do all my string formatting, often in my exception handling, and you can guarantee that there are plenty of time bombs in the shape of rarely executed exception handlers just waiting to go off.

Another New Hope

Maybe it was fate, but I had always been meaning to get my Core library and its unit test suite to build and run under more than just Visual C++. So I started investigating the alternatives such as GCC and Digital Mars (see “GCC for the Visually C++ Impaired”) and soon I had both Core and XML working under GCC 3.x and 4.x via the Code::Blocks IDE – albeit only with the standard warning level.

After being distracted for some months I finally got back to playing with GCC as I had always intended on compiling my code with the highest warning settings available to see how much static code analysis GCC can do over-and-above what VC++ already does – an awful lot it turns out (a topic for another day). And, guess what one of the behaviours that it singles out? Yup, the passing of non-primitive types to printf() style functions! I’ve yet to dig deeper and see if it is varargs functions in general or whether it’s functions like printf(). Either way it spots code like this:-


  std::string text(“. . .”); 
  printf(“%s”, text);
}

Which is great because I now have a clear path forward for deprecating my CString type safely:-

  1. Add a c_str() method to my CString type to provide compatibility with std::string
  2. Start compiling[$] more of my codebase with GCC 4.x to identify and fix those bugs by appending a call to .c_str()
  3. When a suitable refactoring moment arises switch code from my CString type to the standard string type
  4. Test, test and test some more…

Epilogue

I often wonder whether it’s still worth maintaining my 15 year old C++ codebase of tools. Many of them were made obsolete years ago and so I have stopped fixing up the really old applications but the libraries live on – kept alive by each new utility that I write. I could start again from scratch but you don’t get that luxury in the real world so I’m inclined to try and refactor my way out. This way it remains a learning exercise – albeit one about refactoring legacy code – which sadly is what a lot of commercial C++ projects have become.

 

[+] I first used it at university some years earlier and at that time you used the ‘overload’ keyword to mark overloaded functions. I managed to get G++ (nay GCC) working on my Atari 1040 STE (with twin floppy drives) but it was not a pleasant experience. I already thought C was too bloated in comparison to Assembly Language and C++ was therefore even more pointless. Sooo naive…

[^] Not that I knew what a template was back then of course. I never really started to understand templates until the turn of the millennium and Template Meta-programming is still largely theory to me – that’s why there’s the Boost libraries!

[*] If you were lucky it was in a common code path and you got a UAE/GPF there and there; but often you weren’t and it became a time bomb if the code path was in an error handler that was called infrequently.

[#] Yes, I added a conversion operator to ‘const char*’ to avoid having two overloads on every text API method or littering my code with .c_str() style warts. Choose your poison…

[$] Sadly, although my Core and XML libraries link and I can therefore run the unit tests, I can’t do the same for my WCL, COM or MDBL libraries or projects dependent on them (i.e. everything). But that’s ok because it’s the static code analysis I’m after anyway.