Sunday 29 November 2009

From SourceSafe to ClearCase to Subversion

My latest contract sees me working with a different Version Control System, this time Subversion. Quite how I’ve managed to avoid learning anything about Subversion, given its prevalence within the Open Source community, is somewhat astounding. What’s more amusing is that it seems I’m joining the party just as everyone else is leaving - the cool kids are all at Git, Mercurial & Bazaar - Subversion seems to be so last year. Mind you, I’ve never used CVS either, so I have no idea of the problems Subversion was supposed to solve, and from what I understand it’s still growing within the Corporate sector, no doubt because it has reached a certain level of maturity and sits right between the other two VCSs mentioned in the title.

My Background

I started out working at a company that used PVCS, and although I was vaguely aware of branching, merging and some of the other SCM activities, it didn't really sink in as my head was swimming with so many other new Software Engineering concepts. It wasn’t until a few years into contracting, when I started using Microsoft’s Visual SourceSafe in a much larger team, that some of the concepts started to make sense. In the late 90’s I began working at a company that used their own in-house VCS. Unsurprisingly this was fairly simple and didn’t support even basic features like branching, merging and labelling. I (some would say foolishly) introduced them to SourceSafe, which due to the small team size was perfectly adequate, and it served the team well - I think it still does over a decade later. In contrast, my next change of company took me to a very large corporation, for which ClearCase was the tool of choice. It was using this mighty piece of Enterprise software that the concepts of baselines, branching, merging, labelling etc. really took shape, as my team was often working on 5 or 6 major development streams simultaneously. Yes, it’s slow, but its object model and flexibility were so far in advance of SourceSafe’s that I couldn’t see why anyone would need anything else. Of course I wasn’t paying the licensing fees either :-) Still, I know a fair few people who loathe ClearCase (especially UCM), and now that I’m working with something more middle-of-the-road like Subversion, I can see why…

Some Comparisons

There are probably hundreds of differences between these 3 products, and I’ve only spent a few weeks with Subversion, but that’s been enough to give me an idea of how the run-of-the-mill tasks work. So this list is really only the major points that have caught my attention, perhaps because they are radically different to the others.

1: Tools

The word visual in “Visual SourceSafe” pretty much sums up the target audience of VSS. It is integrated into Visual Studio and has a separate Explorer-like tool for manipulating the repository outside VS. There are some command line tools, but personally I’ve found them of little use; however I’ve never done any really serious automation with them to be fair. The ease with which you can do a recursive import makes light work of adding 3rd party libraries and is the one feature sorely missing from ClearCase.

That said, ClearCase is nearly all things to all [wo]men. It has a GUI based Explorer-like tool which makes working with local copies very simple. It also has a plethora of other GUI and command line tools that allow you to manipulate and mine data such as version histories, merge paths etc, with some effort. Naturally it’s a complex product, but when you get your head around the object model and the query syntax it can perform some impressive feats – albeit slowly :-)

Being Open Source, and knowing the Movement’s general attitude towards power and command-lines, I wasn’t entirely sure what to expect from Subversion. I was faintly aware of TortoiseSVN but have had bad experiences with poorly written Shell Extensions* in the past - I disabled the ClearCase one due to similar issues. As it turns out it’s been pretty solid, and it offers a very different way of working with the repository from an Explorer-like view. Not that I need to use TortoiseSVN that much, because AnkhSVN provides excellent Visual Studio integration.

2: Branching & Merging

Although I’ve worked with VSS for well over a decade and I’m aware that it supports branching, I’ve never actually done it in earnest. The main problem with SourceSafe is that it doesn’t support versioning of directories. This means that although you can label files and revert changes, you can’t do the same with folders, and you can’t use a label to recreate a previous release if there have been any deletions. The way I use VSS for my personal work, and how I’ve used it in the past within teams, is to effectively make all changes on the trunk – no feature or release branches, period. Trying to grok branching using VSS as a tool is never going to get you far, and I guess that’s why we never did it :-) Merging was limited to those occasions when more than one developer had edited the same file on the trunk simultaneously, which was rare, and the changes hardly ever overlapped.

ClearCase, for me, follows the most logical model. Each file/folder (or ‘Element’ as ClearCase calls it) can have many versions on many branches. What distinguishes it from the typical Subversion model though is that each file in your workspace can be selected from multiple different branches. When you define your workspace you are effectively selecting the version for each file and folder, using branches as a common grouping mechanism, i.e. the branch is a selection mechanism within the workspace, not the defining characteristic of the workspace, although that is often the desired effect. However, it’s not branches, but labels, that provide the most common baseline definition for a development stream – and those labels could tag versions from many branches. Subversion talks about cheap copies when branching, but ClearCase avoids even those cheap copies. Of course there is a trade-off, and I guess it lies in the workspace definition.

I didn’t get the Subversion model at first – just creating a folder with the name of the branch under another folder called ‘branches’ and then copying the files into that folder - it sounds bizarre after using ClearCase. The magic of course is in the implementation: the copies are effectively just links to revisions of the original files. Labelling is a similar affair; it’s just a convention that you don’t modify the files after copying. The consequence seems to be that merging is conceptually more complex, because you have to do two different actions depending on whether you’re refreshing your branch from the trunk or integrating back again – ClearCase doesn’t care either way.

3: Local Workspaces

Unsurprisingly SourceSafe again has the simplest model, where you map folders in the repository to folders in your local file-system. Usually you just map the root and leave the subfolders to follow the same hierarchy, but if you’re using branches I guess you can map each branch to the same local folder for convenience. I don’t believe that you can have multiple workspaces on the same portion of the repository, because the mapping is stored in a configuration file which is itself in the repository. SourceSafe also marks all files read-only by default, meaning that you have to “check-out” a file before editing. In single-editor mode this locks the file, but in multi-editor mode it just remembers the version so that it can merge the file on “check-in” if required.

ClearCase requires a similar check-out/check-in pattern to SourceSafe and write-protects files as well. The consequence of this model though is that files you edit without ClearCase’s knowledge are deemed to be “Hijacked”, and you have to resolve the issue either by checking the file out or reverting the changes. Unlike VSS though, you can have as many workspaces as you like, all with different configurations determined by a “Profile”. In reality, to ensure consistent views across developers and the build machine, you configure a Master Profile for each Branch (or Development Stream), and all your developers use that. There are two big problems with ClearCase views – the speed of updating and the lack of atomic commits when checking in multiple files.

Once again Subversion departs from the other two, and I’m not sure yet whether I like it or not. A checkout in Subversion is the creation of the entire workspace, and all files are writable by default. This means that Subversion determines what’s changed by just looking for files that have been modified. This removes the whole check-out palaver, but at the expense of not being warned when someone else has edited a file and you need to refresh your workspace. My team has been checking-in and updating as frequently as possible, which minimises this, but it has caused problems when AnkhSVN updates a Visual Studio project file and a merge conflict causes it not to load. The other thing I miss is the Explorer-like tool. The TortoiseSVN shell extension is nicer than the command line for many tasks, but it doesn’t feel quite as integrated as the VSS and ClearCase tools.

Conclusions

I use VSS at home for my personal archive (it has over 12 years of history in it) and it only has to support a team size of 1, which it just about copes with. But as I begin to understand more about The Software Development Process I want to try out techniques at home that I use at work, and for that something like Subversion (or more likely the modern cool tools like Git or Mercurial) would be a far better fit. And of course they’re free. But I still look back with fondness on my time using ClearCase, a behemoth of a version control system, but one which provides huge possibilities and great satisfaction when you finally ‘crack it’.

[*I’ve tried some SourceSafe shell extensions in the past and they’ve often succeeded in locking up Explorer entirely when there are transient network issues with a LAN based repository. I’ve also experienced the same thing with ClearCase, often when the license server was misbehaving.]

Sunday 22 November 2009

C# + C++ + C++/CLI – Context Switch Hell

During the last week I’ve been doing some interop development at work. As you might expect from a financial institution, the core analytics are written in C++, which is awkward for my team as our system is entirely C# based. Fortunately I got the chance to decide how we were going to communicate with the C++ library. Although new to C#, I am aware of the various interop mechanisms available and went through them one-by-one to see which was the best fit. The C++ library only exposes a couple of functions, and those take only an argument or two, in the form of a custom type that’s not a million miles away from a DataSet. Luckily there was already a Wrapper Facade available for this type, so it really just came down to what was the easiest way to marshal this structure and deal with any exceptions.

C API

C is the interop language - every language provides a C binding so it’s always a pretty safe bet. The P/Invoke mechanism built into the CLR was designed to allow interop with the underlying C based Windows Operating System, so there is a huge amount of support for performing this kind of interop declaratively. Unfortunately I was calling a C++ API and I didn’t fancy the highly fragile option of binding to mangled function names, plus I still had to pick a C compatible type to serialize the data into. This meant either convincing the owners to expose (and more importantly maintain) a C API themselves, or writing a C based shim myself.
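
Purely for illustration, this is roughly what the P/Invoke route would have looked like had we gone that way - the DLL name, the exported function and the string-based data format below are all invented:

using System.Runtime.InteropServices;
using System.Text;

internal static class NativeAnalytics
{
    // Hypothetical flat C export; the real library exposes C++ functions, not this:
    //   extern "C" int CalculatePrices(const char* request, char* result, int resultSize);
    [DllImport("Analytics.dll", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
    internal static extern int CalculatePrices(string request, StringBuilder result, int resultSize);
}

The caller would pre-size the StringBuilder so the native side has a buffer to write into, e.g. new StringBuilder(64 * 1024), and pass its Capacity as the final argument.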

COM Inproc Server

Another common technique is to provide a COM based facade, perhaps via a Dual interface. If I were exposing a set of C++ objects this might be a good fit, but as it’s just a couple of free functions it seems like overkill. Plus the marshalling cost would have been significant, as I would have marshalled the structure as a BSTR to get it over the COM interface and then constructed another copy on the C++ side. Given the volume of data being passed and the additional interop layer created by the Runtime Callable Wrapper, this felt like a premature pessimisation, and a lot more work.

C++/CLI

The last choice I considered was in a similar vein to the C based DLL, but using C++/CLI instead. I had a little play with the Managed Extensions when they first appeared and the code looked pretty ugly then, but as we are using VS2008 now I had the opportunity to use the far more palatable C++/CLI instead. You can freely mix-and-match native C++ and managed C++, even within catch blocks, so this looked pretty enticing on the exception handling front.

Stick With What You Know

As far as I could see it was a toss-up between exposing the C++ code via extern “C” functions, marshalling the data as C-style strings and having to learn the P/Invoke syntax, or learning the managed C++ syntax and creating a mixed-mode DLL. I really liked the idea of being able to mix native and managed C++ in the same translation unit, as this would allow me to write one set of try/catch blocks to handle any native or managed errors, which would simplify the code. Also I felt that, given the performance bias of C++, I would have far more opportunities to pass the large amounts of data around efficiently, with RAII to back me up.

A Tale of ., ::, ^, -> & ()’s

The following few days’ coding was somewhat amusing as I context switched constantly between the three slightly different C based syntaxes of C#, C++ and C++/CLI. Each is so incredibly similar to the others that I don’t think I managed to write a single line of code 100% correctly first time :-). The following is a list of some of the subtle differences that I ran into (there’s a short side-by-side snippet after the list):-

  • C# uses the single keyword ‘foreach’ whereas in C++/CLI it’s separated into two as ‘for each’.
  • C++ and C# use the ‘new’ keyword whereas C++/CLI uses ‘gcnew’.
  • In C++ and C++/CLI you can invoke a no argument ctor without parenthesis, but in C# you have to provide them.
  • When qualifying types with the enclosing class or namespace you use ‘::’ in C++ & C++/CLI, but just a ‘.’ in C#
  • C# uses a ‘.’ to invoke methods, but C++/CLI follows C++ and uses the ‘->’ pointer syntax. This is made worse by the use of the term ‘reference type’ to refer to a type that you invoke with pointer, not reference syntax in C++/CLI!
  • Forgetting the ^ either on the collection contained type, or the collection type itself.
  • The empty reference type is called ‘null’ in C#, ‘nullptr’ in C++/CLI and ‘NULL’ in plain C++.
  • Conditional compilation uses ‘#ifdef’ or ‘#if defined’ in C++ and C++/CLI whereas C# uses the simpler ‘#if’
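
To make a few of those concrete, here is a trivial made-up fragment written in C#, with what I believe to be the C++/CLI equivalent alongside in comments:

using System;
using System.Collections.Generic;

class SyntaxReminder
{
    static void Main()
    {
        List<string> names = new List<string>();   // List<String^>^ names = gcnew List<String^>();
        names.Add("hello");                        // names->Add("hello");

        foreach (string name in names)             // for each (String^ name in names)
            Console.WriteLine(name);               //     Console::WriteLine(name);

        object missing = null;                     // Object^ missing = nullptr;
    }
}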

It only amounted to a couple of hundred lines of code in total across all three languages, but I felt like I spent more time staring at the compiler output window than the text editor…

Thursday 12 November 2009

Adjusting to the C#/.Net Ecosystem

So I’ve been on The Dark Side (as my C++ biased ex-work colleagues call it) for a little under 2 months now. Most of the first few weeks were spent going over the potential architecture and doing a little up-front modelling to decide what areas, if any, needed prototyping. Now I’m finally into the real thrust of C# development, and as a consequence various new names have entered my consciousness. The old stalwarts of the C++ world such as Andrei Alexandrescu, Scott Meyers and Herb Sutter have been consigned “To Tape” to make room for the new kids such as Jon Skeet (and Tony the Pony, natch), Bill Wagner, Udi Dahan and Ayende Rahien. The first two are authors of books I was advised to seek out, as they seem to be treading the same path as Scott Meyers and Herb Sutter, but in the C# space. Udi is someone whose articles I have read on previous occasions (such as in MSDN Magazine) and who works in a not too dissimilar field. Finally, Ayende has a blog that seems well read; it was his opinions on Unit Testing that caught my eye during the height of the Duct Tape Programmer catfight. I’m hoping his blog will give me further leads and a good background in the popular tools, technologies and sites that a long-standing C# dev will have already acquired.

Although C++ (and Java, which I’ve also dabbled with on occasion) is pretty similar to C#, there are a number of gaffes I’ve made, and continue to make, right out of the stalls. The most common so far is forgetting to actually allocate the object and consequently being assaulted with a null reference exception the moment I run a unit test:-

Dictionary<string, string> configuration;

In C++ this would work, but in C# I have to remember to do this:-

Dictionary<string, string> configuration = new Dictionary<string, string>;

Well, not quite, because as the compiler will then remind me, I have to append parenthesis, even though I’m not providing any arguments:-

Dictionary<string, string> configuration = new Dictionary<string, string>();

I seem to have been fortunate enough to have started C# at v3.0, with Generics already well established. I don’t envy those poor souls who had to cope with the v1.0 collections. Anyone, such as myself, who had the misfortune to use MFC (or some of the other big frameworks) before the days of a C++ compiler that supported Templates will feel their pain. Actually there are probably many in the embedded C++ world who are still suffering…

Nice though Generics are (and I have only used the collections so far), what I really, really miss is being able to typedef an instantiation* to avoid repeating myself (DRY - Don’t Repeat Yourself). Of course, in C++ the template type names pervade the code base because of the iterator model, and not using typedefs is asking for impenetrable code – especially once you have nested containers or introduce std::pair.
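
The closest thing I’ve found so far is a using alias directive, which can name a closed constructed type, though it is per-file and can’t alias an open generic, so it’s not really a typedef. Something along these lines (the alias and example class are made up):

// A file-scoped alias for a closed constructed type; the target has to be fully
// qualified and the alias has to be repeated in every file that wants it.
using Configuration = System.Collections.Generic.Dictionary<string, string>;

class Example
{
    static void Main()
    {
        Configuration configuration = new Configuration();
        configuration["greeting"] = "hello";
    }
}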

The other main area that I’ve been getting used to is the tool chain. Naturally we’re using Visual Studio 2008, which I’m somewhat comfortable with having used it since v1.0; in fact I still use the Visual C++ 1.0 key bindings (although they call them Visual C++ 2.0 in the UI). For unit testing we’re using NUnit, which is a pleasure after many years of macro abuse. The same goes for the mocking framework – Rhino Mocks. I’ve only ever hand-rolled mocks before. Well, technically most of them were just stubs, because hand-rolling is so tedious. There is a list of other stuff which I’ve yet to get my teeth into that completes the full development process, such as CruiseControl, FxCop, StyleCop, NCover etc. When it comes to C# (and Java is probably the same) there is a plethora of tools out there, many of them Open Source, and trying to compare them and make an informed choice about what to pick for the long road ahead is difficult. The main barrier to those kinds of tools in the C++ world has historically been cost, as a recent comment to an earlier post of mine will testify.
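
For the record, this is the general shape of test I’ve been writing - a minimal sketch with the interface and class names invented, assuming NUnit and the Rhino Mocks AAA-style syntax:

using NUnit.Framework;
using Rhino.Mocks;

public interface IPriceSource
{
    decimal GetPrice(string symbol);
}

public class Portfolio
{
    private readonly IPriceSource m_prices;

    public Portfolio(IPriceSource prices) { m_prices = prices; }

    public decimal Value(string symbol, int quantity)
    {
        return m_prices.GetPrice(symbol) * quantity;
    }
}

[TestFixture]
public class PortfolioTests
{
    [Test]
    public void ValueUsesThePriceSource()
    {
        // Arrange: stub out the dependency instead of hand-rolling a fake.
        IPriceSource prices = MockRepository.GenerateStub<IPriceSource>();
        prices.Stub(p => p.GetPrice("VOD.L")).Return(1.5m);

        // Act.
        decimal value = new Portfolio(prices).Value("VOD.L", 100);

        // Assert.
        Assert.AreEqual(150m, value);
    }
}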

Refactoring is something that C# lends itself to very nicely, and although I’ve only briefly played with these features in Visual Studio 2008 (and have a Resharper license waiting for me) I feel I’m going to enjoy using them. I’m holding off on Resharper until I know how to do things myself first, as not every client uses the same tools and Visual Studio itself may satisfy most of my needs. One prevalent C# practice that is frowned upon in C++ is the use of ‘using’ directives** to bring entire namespaces into scope, and Resharper appears to promote this practice. The reason is no doubt the different use of namespaces in C#. They tend to be much longer (unlike std and boost) and partitioned more finely (again unlike std), so the chance of a conflict is presumably far less. It still feels dirty.

Although I’ve only scratched the surface of the language and tools so far, I’m already beginning to feel at home.

 

*I know they aren’t called instantiations in C# and that the compilation model is different to C++, but the name for what a generic is where the closed types are known escapes me at present. Sorry Jon, all I can remember from your book at the moment is the names Open and Closed as referring to the types at definition and instantiation.

**That’s the C++ name, I’m not sure of the terminology of using declarations and directives in C# yet. I shouldn’t have brought my 3G dongle with me!

Monday 9 November 2009

Standard Method Name Verb Semantics

Something that I’ve always taken for granted when programming is that certain verbs imply certain behaviours. I thought it would be useful to find a set of these on the web and link to it on our development wiki, as although I think it is common sense to native speakers, that is not true of team members for whom English is a second language. However I’m struggling to track one down. I’ve seen plenty of posts about the format of interface, class, method and variable names, but not one that attempts to list the expected semantics of the really common verbs we use like get, create and find. To date the only resource I’ve come across is the PowerShell documentation, as that team has always made a concerted effort to standardise both the format of their commands (verb followed by noun) and the semantics of the most common verbs.

One other useful resource I discovered was on Ward Cunningham's Wiki. There is a fair bit of discussion about naming conventions (across the board, not just methods), culminating in a concept known as Language Oriented Programming. The Wiki also attempts to establish a lexicon of common terms based on some existing naming conventions, such as the ‘str’ prefix used in the C runtime library for the narrow-character string manipulation functions. However there is a major bias towards abbreviations in the pages I read, which is not what I’m looking for.

So, until I find that resource I’m going to present my own list of common method name verbs and describe what connotations each one conjures up in my mind:-

Create & Delete

These should be easy ones :-) Creation is about making new things, commonly objects, and Create methods are often used to implement the Factory Method or Abstract Factory patterns. The objects they create may be fully constructed, but more often they are only default initialised and the caller has to perform further actions on them. A common example would be a factory function for creating database connections, where the connection still needs to be opened after creation. Common alternatives would be Allocate and Construct.

IDatabaseConnection connection = DatabaseFactory.CreateConnection(connectionString);

If Create is how we bring objects into this world, then Delete is the Grim Reaper. Deletion is final and absolute, no trace should be left of the object afterwards. Any attempt to use something after deletion must surely be a logical error. Common alternatives are Dispose and Destroy.

directory.DeleteFile(filename);

Acquire & Release

Using Create in the name of a method, based on the above definition, could be seen to be leaking the abstraction. Acquire has more of a vagueness about it, and could be interpreted as “I’ll give you something you can play with; I may create it or I may reuse an existing object, but either way you shouldn’t care”. The implication is that the object may be in a more complete state of initialisation. Using the previous example, the database connection would already have been opened. Object pools are the prime example here.

IDatabaseConnection connection = ConnectionPool.AcquireConnection();

If you’ve acquired something, there is a feeling of only partial ownership. It’s not really yours to do with as you wish. This means that you’re not in a position to delete it, the best you can do is let go. Releasing suggests that the object may live on with another unrelated owner, or perhaps it’s going to be returned to storage to await a new owner at some point later. COM interfaces are the canonical example here as memory allocation is outside the remit of the caller.

connection.Release();

Open & Close

The use of Open and Close on an object is an incredibly common one, generally with real resources like files and database connections. They are very explicit actions, like Create and Delete, and also leak the abstraction to some degree. I find they are useful when you have a heavy resource that needs careful management, but you don’t feel you can hide that responsibility from the caller. Like Acquire and Release, they are slightly more woolly, but whereas Acquire and Release are about construction, Open and Close are actions an object performs (maybe on itself).

connection.Open();

connection.Close();

I have rarely seen these two verbs used alongside a noun in OO; even helper free functions would use overloading on the type, because the noun would only repeat the type.

Find & Get

When I go shopping I often have two things in mind, there is stuff I would like and stuff I need. The stuff I would like is optional, or maybe I can choose from various brands and sizes, e.g. cereal*. In contrast the stuff I need is mandatory, maybe something doesn’t work without it, e.g. petrol. Find is analogous to the first case. It says “I think you might have this, and if you don’t I can do something else”, in short, you know that it could fail and you are willing to handle that failure responsibly. One common alternative verb is Lookup.

Image image = cache.Find(url);

Get is like buying petrol from an oil refinery. You absolutely know that they must have some, and if they don’t then they had better raise an exception, because if the refinery is out you don’t stand much chance of getting any from a petrol station either. Get is sometimes used for construction, such as in the composite verb GetOrCreate. I don’t personally like this use; to me, Acquire is far more pleasing.

String name = product.GetName();

Get has unfortunately become such an overused verb because of its association with Getters and Setters. This has the side effect of people becoming blind to it so that property accessors read more naturally, e.g. filename.getLength() is subconsciously read as just filename.length(). I think most people these days would be surprised to find a Get method doing much more than returning a private member.
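
To make the Find/Get distinction concrete, here is a minimal sketch in C# (the cache class and its internals are invented): Find reports failure through its return value, while Get treats absence as an error:

using System.Collections.Generic;

public class ImageCache
{
    private readonly Dictionary<string, byte[]> m_images = new Dictionary<string, byte[]>();

    // Find: failure is expected, so it is reported to the caller to handle responsibly.
    public byte[] FindImage(string url)
    {
        byte[] image;
        return m_images.TryGetValue(url, out image) ? image : null;
    }

    // Get: the caller believes the item must exist, so absence is an error.
    public byte[] GetImage(string url)
    {
        byte[] image = FindImage(url);

        if (image == null)
            throw new KeyNotFoundException("No image cached for " + url);

        return image;
    }
}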

Request, Retrieve & Locate

The word Find is so quick to read and say, and with that I feel that it often implies a quick scan rather than a long drawn-out search. For remote calls I like to try and use a verb that has far more complex connotations. Retrieve and Locate both sound like more serious work is involved in the hunt. Likewise Request always feels like something a remote service will be called upon to handle.

Book book = library.RetrieveBook(title);

Unfortunately the ubiquitous nature of HTTP means that Request is probably more commonly used as a noun, such as in the inseparable pairing Request/Response.

Read & Write/Send & Receive

When it comes to I/O, a very common naming convention appears to be the use of Read and Write - at least as far as file I/O goes. However, when we’re talking about network I/O the old Berkeley Sockets API kicks in and we find the use of Send & Receive favoured instead. If we’re restricted to discussing the transfer of small buffers of data this is fine, but once we begin to widen the picture to the overall topic of OO data persistence a few new terms come into play. The posh-sounding pair is Serialize and Deserialize, which I find incredibly difficult to type (like all American APIs), but I tend to think of it as transport agnostic. The lower-class sounding variant is Load & Save, which evoke memories of floppy disks for me, and I usually expect them to be applied to large objects such as whole documents.

Buffer buffer = textfile.ReadLine();

uint count = socket.SendMessage(message);

Image image = document.LoadImage();

 

I readily accept that these can be seen as very literal interpretations - I’m a very literal kind of chap - but I think most frameworks I’ve dealt with tend to follow along similar lines and I rarely get surprised. In application code the story is often different though, especially when working in a multi-cultural team. What would be nice is a global programmers’ dictionary and thesaurus. The dictionary could ensure that you’ve not picked something contradictory, while the thesaurus would grant you some latitude without picking esoteric words that only Shakespeare would stand a chance of understanding.

*Cereal shopping is actually a major event in our house and, as my wife (who will no doubt cast her eye over this post) will point out, does not allow for any deviation to shop’s own brands or other lesser substitutes…

Saturday 7 November 2009

My New Blogging Machine

Much as I love my dear Dell XPS 1200 - it’s a lovely laptop to do development on – it’s still a little too heavy to just slip into my rucksack and carry around on the commute, even at under 2 kg. And, given that I’m trying to make a concerted effort to do a little more blogging to improve those English skills, I thought I’d check out the current range of netbooks.

As someone who stands over 6 feet tall, with hands the size of spades and fingers fatter than some people’s forearms, I’ve never felt totally comfortable on a laptop keyboard, so I never for a moment thought a netbook would be practical (I still use an IBM Model M keyboard c.1985 at home - the best keyboard ever). However, the other weekend I managed to convince my niece to let me play with her netbook just to confirm my suspicions. It seems I was a little too hasty in my assumptions. Putting aside the somewhat sticky keys, which I wish to point out were the result of her typing with chocolate covered fingers and not a design feature, I found it far more comfortable than I’d ever imagined.

A week or so later and here I am typing this on my new Asus Eee PC 1500HA. Leaving the plug at home, it weighs in just over the kilogram mark, but given that I often carry weighty tomes like the C++ IO Streams book or Enterprise Patterns with me into work it hardly registers. The keyboard feels slightly cramped, which is to be expected, but I’ve quickly got used to it. The Home and End keys are at the top-right on my Dell and I often hit Insert by mistake, whereas they are bottom-right on the Eee PC and need the Fn key as well. I didn’t think I was going to like that, but I’ve already forgotten about it; in fact mapping Home, End, Page Up and Page Down onto the cursor cluster seems so damn logical.

Performance is also surprisingly good. My wife bought a Toshiba laptop some years ago which was very lightweight, but it only had a 1.1 GHz processor and felt very sluggish, at least when you’re used to multi-core beasts in both workstation and laptop form. I was expecting a similar experience here, but frankly I’m gobsmacked at not only how quickly it boots up from cold, but also how responsive it is when starting applications. I may have to revisit the notion that I’m only going to be using it for writing on.

My biggest concern is my wife though. Her Dell XPS 1310, which is her bread-and-butter, has developed another fault after only a short life – no doubt something to do with the large drop it suffered a while back :-) If she gets her hands on this little beauty I fear for its safety. I’m hoping it’s going to be small enough to hide at the bottom of my bag under the packets of tissues and fluff and will escape her notice. So, if a strange woman comes up to you and asks if you’ve seen this netbook of mine, deny everything…

Thursday 5 November 2009

Happy 21st/12th Anniversary

This week marks a palindromic moment in my relationship with my wife. On the one hand we have been married for 12 years, as of Monday. But we've been together for much longer, 21 years as of today. As happens each year, my wife says Happy Anniversary on the 1st, and yet I'm still waiting for the 5th because to me that is our spiritual anniversary. This of course has nothing to do with my complete inability to remember names or dates; obscure numbers like hex addresses and error codes - no problem, but birthdays and appointments - no chance.

And yet she still puts up with me... Even during my six month sabbatical I never really pulled my weight as I could have. She looks after the kids, runs the house, manages the finances and holds down a job! On top of that she still finds time to massage my ego and deal with my many insecurities. Quite frankly I really don't deserve her.

So tonight at the local annual fireworks display, even though I'll have the kids with me, and be surrounded by friends, St Judith's Field will still feel empty and the fireworks will lack a certain sparkle because you're not there with us. Happy Anniversary my love.

Monday 2 November 2009

A Chip Off the Old Block

My eldest has a presentation to do for his English class at school which I managed to catch him practising. It's about the history of video game consoles. Naturally he has most to say about Nintendo's recent output given that even the N64 was out before he was born! Still, he starts out with the Atari 2600 which is good to see as it's my spiritual starting point in this industry, and we have one of those in the house (a "Heavy Sixer" no less). Of course I waited until he had finished his research before pointing out the copy of High Score!: The Illustrated History of Electronic Games on the bookshelf upstairs.

So, does he display any other tendencies towards following in his old man's footsteps? Obviously he's aware of what I do for a living (to a certain degree) and has shown some interest in programming in the past, starting with the time they were playing around with Logo at school during KS2. Consequently I nudged him towards my Java JLogo applet, which was inspired by my somewhat older nephew when he was learning Logo at that age. The school also promotes applications like Scratch, which I think is pretty neat as it's visual programming with a real sense of fun. These days text I/O is pretty dull even to diehards like me so why expect a child to find it interesting?

I've also downloaded Microsoft SmallBasic, which is nothing like the BASIC I grew up with, but I thought that it might be much easier to do something interesting with. He found a Pong example program which he could play around with. I'm also aware of a book aimed at children that uses Python as a first language which seems to have favourable reviews. I might invest in that myself as I've been meaning to learn Python. I guess once he studies Computer Science proper at school, it'll either rekindle his interest in programming or just reinforce a view that videogames are the only aspect of computing worth investing time in :-)

I remember one day, when he was much younger, standing behind me watching me relentlessly debugging some tricky code in Visual C++. All he could see happening on screen though was the yellow bar that marks the 'current source line' jumping around the screen as I entered into and out of functions. He remarked to his mum that all I did all day was "make a yellow line move up and down the screen". I suppose that's not that different from playing the Atari 2600 really…

Friday 30 October 2009

Stack Overflow Dev Days – London

On Wednesday I took a day off to go to the London venue of the Stack Overflow Dev Days tour. I wasn't entirely sure what to expect, even up to the moment I arrived, as I signed up without the faintest idea of who was speaking about what, and more importantly whether any of it would be relevant to my day-to-day work as a self-confessed C++/C# server-side kind of guy. Frankly I didn't care; for me it was going to be a day of meeting new friends, old friends (from the ACCU), and generally soaking up what's new and cool in the world of Software Development - C++ and faceless servers are presumably not cool.

Opening Keynote by Joel Spolsky

The day opened with a humorous little skit on software development practices by Jeff, Joel and the Fogbugz team, and was followed by Joel in person talking about the eternal struggle between Simplicity and Power. His argument was that although simple-to-use software is good for the user, fundamentally what users think they want is features, and when it matters (voting with their wallets and getting the bills paid) power matters – a lot.

Python Introduction by Michael Sparks

Joel was followed by Michael Sparks from the BBC, there to give a short introduction to Python. His example was a simple piece of Python code to aid spell checking - the same code that was referenced by Joel earlier in his keynote as being elegant. Python is one of those languages I've always felt I should learn (I dabbled some years ago) and this presentation has whetted my appetite again, though probably only for when I need to do some heavy string processing.

As an aside, I agree that the code has an elegance, but where I felt it was let down was through a few classic examples of poor variable and function naming – 's', 'edit1' and 'edit2' would have been far clearer named something like 'permutations', 'generateTypos' and 'generateTyposOfTypos'. Or did I miss something obvious here?

Fogbugz by Joel Spolsky

After a cup of java during the morning break, Joel Spolsky got to do a little sales pitch for his Fogbugz products. The pitch was pretty good, as the product certainly looked far simpler than the bug tracking products I've been forced to use in the past. His company's latest addition is Kiln, which adds Version Control to the mix and allows tighter integration for handling code reviews etc. It's his event, so I suppose a certain amount of chest-beating was to be expected.

Android by Reto Meir

The battle for the hearts and minds of the mobile platform developers was started by Reto Meir with an introduction to Android. After a brief background on supported devices and the usual market growth spiel, we got to see him create a simple Hello World app and run it on an emulator. Unfortunately the choice of font size didn't help, and the constant scrolling around the virtual desktop made you feel quite sea-sick. However I saw enough to get a taste for what Android's about.

jQuery by Remi Sharp

To me jQuery epitomises what Joel talked about in his keynote regarding Simplicity and Power, and also elegant code. I have to stand back and marvel at what I feel is a truly fine piece of technology. Although I have only done a smattering of JavaScript in the last 8 or so years, jQuery clearly illustrates how far machines and, more specifically, JavaScript engines have come. Perhaps if this is your bread-and-butter it's nothing special.

Needless to say Remi's presentation was slick, providing plenty of examples of how to harness the power of the query side with the Fluent Interface (which he termed Chaining) that drives the DOM mutation. A fair amount of the audience had used jQuery already, so perhaps he was preaching to the converted.

Lunch

A whole paragraph devoted to lunch? I'm afraid the only thing that marred the entire day was having to wait nearly an hour to get lunch because they ran out of food! The event had sold out months ago, so someone knew exactly how many people were attending; how can you run out of sandwiches? I missed the start of Jeff's talk due to this mishap. I'm not sure everyone got coffee earlier either.

Jeff Atwood

Unlike Joel who clearly had something to sell, Jeff seemed to be more interested in showing what makes him tick, and how that passion has driven the Stack Overflow venture. Hearing him talk about getting excited about everything right down to configuring the hardware makes him an instant soul mate. Also his plug for the book Coders at Work had the desired effect on me as it went straight on my Amazon wish list.

Qt by Pekka Kosonen

Nokia are probably the Microsoft of the mobile phone world. Whereas the new kids on the block, Android and iPhone, are cool and sexy, Nokia has the established customer base but no pizzazz. Pekka did his best to get the audience on his side by the tried and trusted means of self-deprecation. There were some interesting demos of how you can develop without needing a locally installed SDK, and the thought of remote testing sure seems sensible given the diversity of hardware out there. Ultimately though, Qt just isn't quite as sexy…

iPhone Development by Phil Nash

Phil is a fellow member of the ACCU and I was fortunate enough to go to his presentation at this year's ACCU Conference about iPhone development. However, this talk was a far more polished affair, as he dropped some of the more grungy details of Objective-C development in favour of putting together a demo app live on stage. Like the Android presentation earlier it was only a simple app, but it had just a little more panache than the competition. The main stumbling block I fear though is still Objective-C; the manual memory management is a real turn-off.

Humanity: Epic Fail by Jon Skeet

Being new to the world of C# development, Jon Skeet is a name that I only have a passing familiarity with at present by virtue of me having only just read his book, C# In Depth. Oh, and a few Stack Overflow replies as well. The man seems to have a somewhat legendary status and doing a presentation with a hand-made sock puppet doesn't appear to have put a dent in that.

The premise of his talk was that there is a huge disconnect between the users' view of the real world and how they expect the software we develop to model it. They are simply unaware of the limitations of the digital world, such as in the representation of real numbers, or the ambiguities of time-zone names, or the subtleties involved in processing text from the many different languages we speak. Quite frankly, if people like him can't get it right, what chance do those of us further down the evolutionary chain stand?

How Not to Design a Scripting language by Paul Biggar

The afternoon coffee break was succeeded by Paul Biggar, a PhD student studying scripting languages. He had some interesting opinions on where their limitations lie and how they could be improved to reduce the performance gap with the traditional compiled languages. I've never read The Dragon Book about compiler design, but I'm aware of its reputation, and someone who publicly dismisses it in favour of Engineering a Compiler by Cooper & Torczon had better know their onions. He was definitely one of the more entertaining presenters.

Yahoo! Developer Tools by Christian Heilmann

Christian Heilmann finished off the day promoting the YUI toolkit from Yahoo! I got the impression from this talk that there is a fair bit of competition in the web toolkit arena, and that their USP is that they use it themselves and have 330 million customers to support. That customer base figure was trotted out a number of times… The demo of YQL, a SQL-like query language for extracting data from web services, was certainly very interesting. I would have thought that YQL plus jQuery would be a very powerful combination.

Final Thoughts

Joel returned to the stage to close the day and asked who would come again. Most, including myself, raised their hands. At £85 it was certainly a bargain compared to what you normally pay for training – more so if the subjects were relevant to you. Me, I just enjoyed the day surrounded by geeks who are passionate about their profession. Now if we could just convince a few more to join the ACCU…

Monday 19 October 2009

DDE Is Still Alive & Kicking

Dynamic Data Exchange (DDE) is an ancient inter-process communication (IPC) mechanism carried over from the 16-bit Windows days. Post-millennium Windows programmers are probably more used to a diet of COM, but there still seems to be life in the old dog yet. Maybe there are fewer people around to answer questions on DDE, or perhaps the other old timers are ignoring them in the hope they'll go away, because I seem to have had more emails on the subject this year than ever.

I presume the reason the questions are still coming my way is because I have a number of freeware tools on my website, all still actively supported, that are aimed at working with DDE Servers:-

  • DDE Query is my oldest tool and is a GUI based utility for sending requests and creating advise loops on items. It was written as the main test harness for my DDE library.
  • In contrast DDE Command is my most recent addition and is the console based counterpart to DDE Query. It also provides the ability to invoke XTYP_EXECUTE transactions.
  • In between the two is my DDE COM Client - an automation compatible inproc COM component that can be used as a DDE Client for scripting scenarios, such as VBScript.
  • The final, and second oldest utility I provide, is a Network Bridge to allow DDE Servers to be accessed remotely. It is entirely transparent to the client and server, unlike the built-in Windows NetDDE service.

These all use my own C++ DDE library which in turn is based on the DDEML C-API library that Microsoft provides. Under the covers DDE is just a message based protocol that uses Windows messages to encapsulate a connection between two applications – known as a Conversation. Data is passed by allocating it with the old GlobalAlloc() API, and the format is determined by either using a standard clipboard format, such as CF_TEXT or a custom one via RegisterClipboardFormat(). The fact that it is message based is also its biggest limitation as it means you cannot share data between machines – you can't even share data across desktops on the same machine. The only book I came across on the subject was Ole 2.0 and Dde Distilled by Al Williams, but it covers the topic pretty thoroughly.

Before working in finance, the only time I had come across DDE was when writing an installer back in the mid 90's. Under Windows 3.x you used DDE to communicate with Program Manager (the shell that today is known as Explorer) so that you could create a "Program Group" and the icons for your application. You can still see this legacy today if you fire up DDE Query and use the "Server | Connect…" option, where you will spot a server called PROGMAN with a single topic also called PROGMAN. If you open this conversation and request the item "Accessories" you will be shown a CSV formatted text block listing the programs from the Accessories Start Menu folder.

Once I started working in finance I discovered that Excel was the traders' tool of choice. Excel can pull data from a number of sources, but the legacy option for real-time data was DDE. The big providers like Reuters, Telerate & Bloomberg all provided tools to allow you to feed their financial data into a spreadsheet, and the in-house software we developed followed the same architecture. Although COM was probably Microsoft's promoted technology, DDE felt far simpler to implement.

These days I don't work with that kind of real-time feed, but the questions I do get on DDE always seem to have RIC's in the examples, so I guess that it's the financial industry that is keeping this prehistoric mechanism alive.

Saturday 17 October 2009

We Don’t Use IE6 Out of Choice

Whilst on site the other day I found myself wanting to check out the NHibernate Profiler website. When I arrived at the homepage I was greeted with a modal iframe that pointed out that I was using that old legacy browser we all love to hate – Internet Explorer 6.

This distraction is nothing new; many popular sites such as Twitter also display a little message that points out the error of your ways, but what really raised my ire was the approach NHProf has taken. Instead of just warning you that your browsing experience is likely to be sub-optimal due to your choice of browser, they hide the site behind a semi-transparent modal iframe that just includes links to download more modern browsers such as Firefox or Chrome. Nowhere was there an option to allow me to accept the consequences and soldier on, knowing full well that content may be all over the place or unreadable due to poor image handling.

When implementing this behaviour did it ever occur to the developers that I might not be using IE6 out of personal choice? Or that upgrading my browser at that moment in time might be incredibly inconvenient?

There is a very good reason why Internet Explorer 6 is still popular among the development community and it has absolutely nothing to do with laziness or some twisted superiority complex – it's Corporate Policy. Large corporations are incredibly conservative when it comes to upgrading major software components like the OS, browser or office suite. This is likely supplemented by a very strict policy controlling what additional software can be installed to reduce the possibility of conflicts. They often have a significant number of line-of-business (LOB) applications that have been developed in house that could potentially break if one of these elements is changed – a move that could cause them significant financial loss. I know of one major financial institution that didn't start rolling out Windows XP on the desktop until 2007 at which point XP was close to entering what MS calls Extended Support. The remediation process was lengthy, tiresome and provided no added value to the business.

I completely understand that website development is hard enough given that you have to test with IE, Firefox, Chrome, Safari, Opera etc., and that adding support for a badly broken browser like IE6 into the mix adds significant cost – especially for a non-commercial venture. I'm not suggesting you have to. I just want you to recognise that we're not all in the privileged position of being able to treat our machines as we please. Some of us have to put up with our desktops, CD drives and USB ports being locked down tighter than Fort Knox.

Trust me, no one would still be using IE6, if they (or even Microsoft) had a say in the matter…

Wednesday 23 September 2009

The Perils of Intellisense

Work on my WMI library has been a little erratic during my recent time off, but I came across a bug recently which I lay squarely at the door of Intellisense. Of course, it's really a case of user error, but Intellisense lulls you into a false sense of security...

I had just started the library and added the initial unit test for creating a connection. I decided that I could get away with talking to the WMI service on the local machine, as it's pretty much a standard piece of Windows technology and the unit tests would still run quickly. I started the implementation in the Connection class by adding some private typedefs and members:-

typedef WCL::ComPtr<IWbemLocator> IWbemLocatorPtr;
typedef WCL::ComPtr<IWbemServices> IWbemServicesPtr;

IWbemLocatorPtr m_locator;
IWbemServicesPtr m_services;

As one would expect, Intellisense did its thing during the typing of the typedefs by showing me a list of types, and after typing just "IWbem" I could see the ones I wanted listed and so went with them.

In the implementation file I proceeded to write the Connection::open() method by declaring local variables for the underlying Locator and Services objects using the typedefs declared earlier. The WMI Locator is the root object and is a singleton, so requires no arguments during or after its construction. Once again Intellisense shows its productivity enhancing ability by showing me a list of relevant CLSIDs after I type the initial portion, which I know to be "CLSID_Wbem". Seeing a CLSID that has 'Wbem' and 'Locator' in its name I dutifully click it and get on with the hard part of actually writing the method logic - after all, I "Lean on the Compiler" as it's much better at spotting errors than me...

IWbemServicesPtr services;
IWbemLocatorPtr locator(CLSID_WbemAdministrativeLocator);
WCL::ComStr path(host + nmspace);

HRESULT result = locator->ConnectServer(path.Get(), nullptr, nullptr, nullptr, 0, nullptr, nullptr, AttachTo(services));

if (FAILED(result))
throw Exception(result, locator, TXT("Failed to connect to the local WMI provider"));

m_locator = locator;
m_services = services;

This works a treat - my unit test passes. I can open and close a connection, so without delay I get on with writing the next set of tests and code to perform a simple WMI query.

After a few distractions, such as going on holiday, I decide to start implementing my WMICmd tool, which is a simple command line tool for executing WMI queries. It will also do very nicely as a vehicle for thoroughly testing the WMI library. I get the shell of the application up and running and add support for running a query. The first query is the same as the unit test, and it works. I do some work on the output format and try a few other simple queries for good measure. It's working a treat, and so I move on to the more useful features like being able to query a remote computer. I add the command line support for providing a remote host and give it a whirl...

It fails. The error is "Invalid Parameter". I read the MSDN help and surmise that maybe you can't use 'localhost', as it suggests you use '.' for local connections. I need to allow a separate login and password to be provided anyway, so I skip straight on to the full solution by refactoring the Connection::open() code to allow a username and password to be provided. I don't have any unit tests for this (for obvious reasons) but the other tests pass, so I know I haven't broken anything. It still fails. Huh? "Invalid Parameter" again...

I guess that I've forgotten something COM security related; perhaps I need to call CoInitializeSecurity() - but I'm sure I don't. I know it can't be the call to CoSetProxyBlanket() as that comes after. I pull Keith Brown's "Programming Windows Security" off the bookshelf in search of enlightenment. Nothing obvious. I try a few 'random tweaks' in the hope of getting a different error. Still nothing. I go over it in my head again and again - "Invalid Parameter" means I must have got one of the arguments to the ConnectServer() method wrong. So I carefully read the documentation a number of times, and this raises a few questions about the assumptions in my implementation. But none of them are the cause.

I stare at the code for ages trying to work out what to do next. I compare it line for line with the example code in the WMI SDK documentation. Only I don't. I've been skipping the 'trivial' initialisation code before the call to ConnectServer(). Finally I decide to double-check the CLSID for the Locator COM object and I spot a difference... It's not CLSID_WbemAdministrativeLocator in the example, it's CLSID_WbemLocator! I make the relevant code change...

IWbemServicesPtr services;
IWbemLocatorPtr locator(CLSID_WbemLocator);

I run the unit tests. Good, they still pass. I run my WMICmd tool, and bingo, it now works. I go back and try 'localhost' to prove to myself that I was obviously wrong with my assumption about having to use "." for a local query, and of course it also works.

So, by accident, I've been instantiating the wrong COM object, and that object just happens to implement the interface I need - IWbemLocator! As Harry Hill would say, "What are the chances of that happening?". I'm blaming Intellisense for the hair that I've torn out trying to fix this issue, but that's not exactly fair. I could have cut-and-pasted the code from the WMI documentation, and we all know how fallible documentation is. I'm sure the fact that I've implemented something like this before, for a client some years ago, led me to become complacent. And the unit tests, which constantly passed, deflected me away from the existing code and instead led me to believe I was missing something else. Useful though unit tests are, I need to remind myself that they are not a panacea.

I've not forgiven the Intellisense window yet, but at least we're on speaking terms again...

Wednesday 16 September 2009

Going Back to Work

Back in March I made the decision not to renew my contract for a 4th year and instead to take some time off. After discussing my financial situation with my Accountant (aka wife) I found I could easily take 4 months off, but probably longer if I sent her out to work more :-) In the end it's been 6 months and I feel thoroughly refreshed and ready to dive back into working life. So did I manage to achieve all that I had planned?

Well, not exactly. I had grandiose plans of trying to keep to a working-style day where the hours between 10am and 3pm would be devoted to 'career enhancing activities'. I would use this time to read various blogs & books and continue to work on my personal codebase to keep my eye in. I also intended to get into writing by starting a blog and publishing some articles through the ACCU. The good thing about your work also being your hobby is that career development is far from being a chore! The bad thing is that it's sometimes hard to differentiate the two and ensure your family and social life gets the time it deserves.

More family time was the main reason for me taking a sabbatical. Commuting takes its toll, and with only occasional working from home, I missed the little things like taking my kids to school and picking them up again. Those short periods in the playground at the start and end of the school day allow you to track your child's ever-changing circle of friends. Although I never worked long hours in the office (preferring to get home and work remotely after the kids have gone to bed instead) I still missed the family evening meal, which is a nice time to find out what's happened during the day at school.

However, much as I love my family, they aren't very good at the Geek talk :-) My eldest son is on his way to becoming a hardcore techie (not through any fault of mine, you understand) but it's not the same as spending all day in a team of like-minded individuals. The biggest thing I miss is the social aspect of work. Even when I broke my ankle a year or so ago and spent 3 weeks working from home, I found that keeping in touch with people by phone, email and chat just doesn't have the same buzz as being in the office.

Clearly it's been a case of 'Mission Accomplished' as far as family life goes, but did I manage to achieve anything else? Well, I got the writing started by having two reviews published in C Vu, and this blog, although not updated daily or weekly, is getting some erratic TLC. I've read a considerable number of books - Imperfect C++, Extended STL, Extreme Programming Explained, Implementation Patterns, and C++ Gems 1 & 2 to name a few. Plus I've bought and read selected chapters from numerous others, mostly as a result of following up on ideas I've read in other blogs and on the accu-general mailing list. The rabbit-hole soon gets pretty deep doing this :-)

Naturally I have managed to get some coding done, but not quite what I had intended. I was going to write a Wrapper Facade for the COM-based Windows WMI infrastructure along with a couple of WMI based tools. I've only just started on that, as I was distracted by some articles on Good Unit Tests by Kevlin Henney that caused me to do some refactoring of my unit testing framework in preparation for some serious TDD on my impending WMI library. The other major interruption was caused by finding out how easy it was to get GCC to compile my main C++ libraries, which snowballed into an investigation of using it for additional Static Code Analysis.

And what next? Well, my new position is going to be something of a departure. After 15 years of C++ I am going to be entering the world of C#. At first it's supposed to be business as usual on the C++ front, but then it's probably going to be C# all the way. I'm hoping there will be some C++/C# Interop to ensure I still get to apply some of that rekindled and newly acquired knowledge from Imperfect C++ & Extended STL :-). I'm also looking forward to working again with some colleagues from previous jobs.

I've only written the obligatory 'Hello World' program in C# so far, but I'm sure the transition from C++ to C# will provide some blog fodder in the future...

Thursday 16 July 2009

My First Published Articles

Today the postman delivered the July 2009 issue of C Vu - the magazine of the ACCU. It features two pieces by yours truly :-) Okay, so these aren't a couple of earth-shattering discoveries about modern software development, but a couple of reviews. The longer review is of the 2009 ACCU Conference back in April, and the shorter is of the ACCU London meeting in May, which featured Jeff Sutherland.

I guess the reason I'm more than a little pleased with myself is that I find writing hard. This is exemplified by the fact that I got U's for both my English Language and Literature O'Levels. For those of you unaware of the English school system, O'Levels were the exams you took at 16 - now called GCSEs. You were graded from A to E, with U covering everything below an E. The standing joke from my fellow classmates was that you got an E just for writing your name on the paper! An English Language O'Level of grade C or above is pretty much required for any form of higher education in England, so I had to retake until I got one... which, thankfully, I did the following year.

Once I left University I was glad to leave all that writing malarkey behind; I thought that Computer Programming was a great career where I would never have to write essays again. Somehow I managed to avoid any sort of formal writing for over a decade, with class descriptions in UML models being the largest paragraphs I wrote.

But this all changed when I switched jobs a few years ago. The company I joined required me to write some formal documentation for the Build System and other processes I had developed. The documents were painful to write, but I actually found myself enjoying the process, probably because I was writing about something that I actually liked and knew about. The distributed team structure also meant that email was the main form of communication, and so constant writing started to become the norm.

I still find writing very hard and so I started this blog in the hope that it would help me improve my skills further. So far I'm still enjoying the process (probably because blogging allows me to talk about, well, me :-). Perhaps I would have got that English O'Level a little quicker if blogging had been around when I was at school...

Tuesday 14 July 2009

Using Visual C++ Together With GCC - Some Alternatives

My last post described how I have started using GCC 4.4 to compile my codebase so that I gain a little more portability, but more importantly some extra static code analysis, without spanking my wallet. My Heath Robinson-esque solution involves using the free Code::Blocks IDE as it can import Visual C++ projects and solutions with minimal fuss; the imported projects are easy to maintain, and it's only an extra .cbp and .workspace file checked into the SCM repository - so it's not messy.

As I mentioned before this was a suggestion from fellow ACCU member Steve Love, but since then I have had some discussions on the accu-general mailing list to see what alternatives there are...

I guess the most obvious alternative would be Eclipse + CDT (C/C++ Development Tooling) as this mirrors my current setup of an IDE + GCC. Given the volume of development in Eclipse it probably also makes the most sense, but I have not had a good experience with Eclipse or CDT in the past - probably because I've been spoilt by the simplicity of installing and using Visual C++. I recently tried the CDT 5.0.x release but once again got lost in the Eclipse menus trying to configure it for my MinGW installation. This was in stark contrast to Code::Blocks, where I had it configured in minutes. I guess I shouldn't be surprised that the learning curve of a tool as powerful as Eclipse is going to be much steeper, but it's just too steep for my VC++-biased plan at the moment. I should also point out that I haven't even opened the manual, so cries of RTFM are entirely valid and I only have myself to blame.

Both the Code::Blocks and Eclipse solutions rely on a separate IDE to manage the build process which is annoying because of the additional maintenance required to keep them in sync with the master Visual C++ projects. So, is there a way to drive GCC directly from the Visual Studio IDE?

A little bit of Googling during my earlier research led to a tool called gnu2msdev that translates the output of GCC into the same format that CL produces, so that the Visual Studio IDE can navigate to lines with compilation errors. From this I discovered a number of posts suggesting the use of a makefile project, but they were more about using Visual Studio as an expensive text editor rather than compiling the same source with two compilers. It feels like there might be something possible along these lines and I'm going to continue investigating in the background.
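
As an aside, the translation itself is mechanical: GCC reports locations as "file.cpp:42: error: ..." whereas CL reports "file.cpp(42) : error ...", and it's the latter shape the output window knows how to navigate. The following is my own sketch of that idea (not the actual gnu2msdev code), intended to sit at the end of a pipeline such as "g++ ... 2>&1 | filter":

// A quick GCC-to-CL message reshaper (sketch only).
#include <iostream>
#include <string>
#include <cctype>

int main()
{
    std::string line;

    while (std::getline(std::cin, line))
    {
        // Skip a drive letter (e.g. "C:") so Windows style paths work too.
        std::string::size_type pos = (line.size() > 2 && line[1] == ':') ? 2 : 0;

        // Find the first ":<digits>:" sequence and rewrite it as "(<digits>) :".
        while ((pos = line.find(':', pos)) != std::string::npos)
        {
            std::string::size_type end = pos + 1;

            while (end < line.size() && std::isdigit(static_cast<unsigned char>(line[end])))
                ++end;

            if (end > pos + 1 && end < line.size() && line[end] == ':')
            {
                line = line.substr(0, pos) + "(" + line.substr(pos + 1, end - pos - 1)
                     + ") :" + line.substr(end + 1);
                break;
            }

            ++pos; // not a line number, keep scanning
        }

        std::cout << line << '\n';
    }

    return 0;
}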

I've never used the Intel C/C++ compiler, but another fellow ACCU member, Anna-Jayne Metcalfe of Riverblade, who knows a thing or two about integrating with Visual Studio, pointed out that Intel have managed the tight integration, so perhaps it would be possible to do the same with GCC? I've looked at the MSDN documentation and it looks somewhat complicated for now :-) However, the signs are that it will be somewhat easier in the future as Microsoft have been slowly separating the IDE and build system so that it can support other languages and toolchains. The VS Project Team Blog has a number of posts on the changes to Visual Studio extensibility in VS2010 that show what they are up to.

For the moment I'm sticking with the Code::Blocks approach, not least because the .cbp file format is simple and I reckon I could solve the sync'ing problem with a bit of script. However I'm still keen to get into Eclipse because there is so much development clout behind it. Hopefully by the time VS2010 becomes my baseline someone else will have already written the GCC integration and I'll get another free ride :-)

Tuesday 7 July 2009

GCC for the Visually C++ Impaired

At this year's ACCU conference Steve Love gave a talk titled Why Portable Code?. It covered far more than just toolchains and platforms, but it reminded me of a previous chat I had with Steve about portability after one of the recent ACCU London gatherings. I've always liked the idea of writing truly portable code, but quite frankly in the corporate waters where I swim Visual C++ is The Big Fish, and any ideas about using alternate toolchains to satisfy personal desires around "writing quality code" would be seen as frivolous - irrespective of whether it has long term benefits or not.

In my discussion with Steve I explained that I hadn't even looked at a Makefile in a long time (according to my personal SourceSafe repository it was 1997) and I didn't feel that I could introduce another build system just to enable portability checks. What I wanted initially was a quick way of taking a C++ codebase and just running it through A.N. Other compiler. It wouldn't have to link or run, just allow me to check compilation. Steve's suggestion was Code::Blocks, an Open Source, cross-platform IDE that is, of course, also free. It turned out to be much better than I had hoped for and has also given me a partial answer to my previous blog 'Where Are the "Lite" Editions of Static Code Analysis Tools?'.

As you can see from my earlier post 'Building Visual C++ Projects From the Command Line', I am not averse to command line tools, but the thought of trying to use another toolchain, especially one with a Unix heritage like GCC, didn't exactly fill me with joy. The last time I had used G++ was on a twin-floppy Atari STE circa 1990, and it needed a Unix-like OS to run it. Googling didn't fill me with any more confidence as I started to read about Unix emulation layers like Cygwin, and I didn't want to have to learn another 'Shell' - I just wanted to be able to run the compiler from a Windows command prompt. The answer again turned out to be pretty simple - MinGW - a Win32 port of GCC for building Windows applications. Feeling cocky, I also downloaded the Digital Mars C/C++ compiler after noticing support for it in the Code::Blocks UI.

Now, I only wanted the IDE for the build system, not the text editor. So I created a simple Hello World C++ console application in Visual C++ and then switched to Code::Blocks to see if I could get it to build. After hunting through the menus I discovered how to configure a toolchain (it had detected VS2005 Express automatically, but not my VS2003 Professional) and noticed that it had an option to import a Visual Studio Solution or Visual C++ Project. I picked the former and played around in the UI to see what it had done. Amazingly I hit build and it worked... and ran too! I discovered how you switch compilers on a project and pointed it at my MinGW 3.4.5 installation and lo-and-behold it also worked (except for some warnings about unknown compiler switches, which is an artifact of switching tools). I tried the Digital Mars compiler, after also downloading STLport, and it compiled, but didn't link (I discovered later that I had to specify the STLport .lib manually). I then got hold of the MinGW 4.4.0 release, but now it all went horribly wrong. However that was entirely my fault due to a lack of RTFM'ing the release notes. It was actually a blessing as I discovered the TDM GCC builds as an alternative to MinGW; its "On-Demand Installer" worked out-of-the-box and better suits a heathen like me.

With my new found confidence I decided to take my Core library, which is the more platform-agnostic part of my framework, and give it a whirl with GCC 3.4.5 (I didn't have 4.4.0 working at that point). Once again I used the import feature of Code::Blocks to configure the build and it complained quite loudly at my code, which I kind of expected :-) But all the problems were quite simple and related to my use of Visual C++ specific #pragmas in the common build configuration header files. Once these were fixed the rest of the code was pretty sound - with GCC 3.4.5 that is. GCC 4.4.0, being much newer and stricter, had a LOT more to say! I am going to blog about the changes I made because I was pleasantly surprised at how little I needed to change on the structural front to enable GCC support, but also because the kinds of issues 4.4.0 has raised are squarely in the Static Analysis arena and not reported by any version of Visual C++ up to 9.0 (i.e. VS2008).
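
The fix was simply to make sure GCC never sees the Visual C++ only pragmas - something along these lines, sketched rather than lifted from the actual header:

// Common build configuration header (sketch).

#ifdef _MSC_VER // Visual C++ only - GCC complains about pragmas it doesn't recognise.

// Disable a couple of noisy VC++ warnings for the whole build.
#pragma warning(disable : 4355)     // 'this' used in base member initialiser list

// Auto-link a dependent library - a VC++ only feature in any case.
#pragma comment(lib, "version.lib")

#endif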

I feel somewhat annoyed that it has taken me so long to truly see the light. In Steve Love's talk he identified three stereotypes - The Ignorant, The Skeptic & The Zealot. I don't believe I'm any single one of those, as I've always agreed with the principle, but I probably share traits from all of them and add in a dose of procrastination for good measure. As I write this I'm just finishing up porting the oldest part of my framework, my Windows Class Library, and as soon as that is done I'll write some posts about the discoveries I've made.

Tuesday 30 June 2009

Where Are the "Lite" Editions of Static Code Analysis Tools?

When I started out, the compiler I was using was set to build on warning level 2 (it was MS C600), which pretty much just told you if your code was well formed or not, and that was all I cared about then. Fortunately, whilst working there the company discovered Writing Solid Code by Steve Maguire along with Code Complete by Steve McConnell. One of the practices Steve Maguire suggests is cranking the diagnostic level up on your tools to maximum and leaving it there - pretty much a best practice these days. The net effect of this is to enable the static code analysis within the compiler to highlight your 'valid', but potentially dubious, code, giving you a chance to fix it before it becomes a bona fide bug. Yes, sometimes it gets it wrong, but there is nearly always a way to rewrite the code to silence the compiler (preferable) or a sledgehammer in the form of a pragma to disable the warning for a small section of code.
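
In Visual C++ terms the sledgehammer looks something like this (just a sketch - the warning number is only an illustrative example):

int main()
{
    int count = 0;

    // The preferable fix is to rewrite the code (e.g. use "for (;;)"), but the
    // sledgehammer silences the warning for just this small section and restores it after.
#pragma warning(push)
#pragma warning(disable : 4127)     // conditional expression is constant
    while (1)
    {
        if (++count == 10)
            break;
    }
#pragma warning(pop)

    return count;
}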

But compilers are first-and-foremost seen as a build tool - the static code analysis abilities are a nice-to-have feature that has probably dropped out as a by-product. For full-on code analysis you apparently need an industrial-strength tool, which quite probably comes at an industrial-strength price. Now I'm sure these tools have taken a considerable investment and therefore demand a high price, which is fine for a big corporate client who can also afford the training required, but for smaller outfits and freelance developers the cost is prohibitive. Nowadays I can get started in C++ development using good quality free tools like Code::Blocks for the IDE and GCC for the compiler suite, or even the all-in-one Visual Studio Express. But what about when I need to take my skills further so that I can improve the quality aspects of my coding?

At various clients in the past I have used tools like BoundsChecker (now DevPartner Studio) and Rational PurifyPlus to check for memory leaks, buffer overruns, uninitialised variables etc., and they have been useful, but I don't feel it was the most efficient use of my time. They are a great way of tracking down a specific issue, but not at all the right tool for continuously sanity checking the codebase. On the other hand, one of the oldest and potentially most useful tools for doing this with C++ is probably PC-Lint (something I am keen to get my hands on - more so since going to Anna-Jayne Metcalfe's ACCU talk Taming the Lint Monster). Even Microsoft have got in on the act and added an /analyze switch to the Visual C++ compiler - but only in the Enterprise edition.

Code Analysis (Static or Dynamic) is a noisy business, meaning that you need to be proficient with the tool to make the best use of it. But that requires some serious time with it, and many businesses just won't be able to justify the cost of both the tool and the time required for training. And this is normally where those "Lite" editions would come into play. Freelancers (and hobbyist coders) like myself, who often have the ability to affect buying decisions, would carry considerably more weight if we could demonstrate the real benefits because the tools were already a natural part of our toolset.

One client I worked at lost many man-days to uninitialised member variables, something I know at least one static code analysis tool points out. Personally I get annoyed when I see a bug that I know a tool could have prevented. What I really wanted to introduce into the build process was a step to run a short static code analysis job before the build and then later, as we got more proficient with the tool, a weekly step to run a much deeper analysis. The stumbling block was that I could not even get a trial version of the product I wanted to use, so that I could play with it in my own time and build a list of current defects to justify the cost.

Having knowledge of certain products on my CV naturally helps make me more marketable, but more importantly, as a professional I feel I have another weapon in my arsenal to help ensure the quality of my code.

Monday 22 June 2009

The Default 'Size' & 'Index' Type: size_t vs int

I started out writing applications in C on 16-bit Windows. The company I worked for favoured using the Windows SDK functions instead of the C Runtime, so for example I would use lstrlen() instead of strlen(). Unfortunately, the Windows SDK is biased towards using int instead of size_t for parameters and return values that represent counts and indices. As I moved into working on MFC based applications the habit continued, as MFC has its own container types, such as CArray, which also favour int, as do the popular common controls, like CListCtrl. The final bias towards int was the use of -1 as an error code when using controls such as combo boxes and listboxes, through constants such as CB_ERR and LB_ERR.

Unfortunately this habit got carried over into my own framework, because I also had my own string and container classes and my wrapper facade is very thin in many places. So it wasn't much of an issue until I finally junked my clunky containers and started using the STL in preference...

All of a sudden the compiler was constantly complaining about signed/unsigned comparisons, which rippled through my code as loops were changed from,

for (int i = 0; i < v.Size(); ++i)

initially to,

for (int i = 0; i < v.size(); ++i)

and finally to,

for (size_t i = 0; i != v.size(); ++i)

(Changing the code structure to use iterators instead - the correct solution - was going to be an exercise for another day.)
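
For completeness, the iterator-based version would look something like this, sketched with a std::vector rather than my own container:

#include <vector>

int sum(const std::vector<int>& values)
{
    int total = 0;

    // The loop is expressed in terms of the container's own iterator type,
    // so there is no index type (int vs size_t) to argue about at all.
    for (std::vector<int>::const_iterator it = values.begin(); it != values.end(); ++it)
        total += *it;

    return total;
}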

I realised that a large part of my codebase was int-based instead of size_t-based, and the number of uses of static_cast was growing and making the code even uglier, so I decided to bite the bullet and go size_t across the board. The entire codebase isn't massive (160,000 LOC or 45,000 SLOC according to the excellent Source Monitor) and it took a few train journeys, but it felt good. However one subsequent annoyance was the comparisons with CB_ERR etc, as they are just #define's for -1, so I followed the lead of the STL string type (whose find() methods return npos, i.e. size_type(-1), when nothing matches) and declared a global constant,

namespace Core
{
static const size_t npos = static_cast<size_t>(-1);
}

In retrospect I realise I should have declared an 'npos' in each class, such as CComboBox and CListBox, instead of globally, but there was quite a bit of code that just had "= -1" as an argument default and I got lazy trying to resolve all the issues quickly.
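
What I had in mind was something more like the sketch below - each control wrapper exposing its own constant so that CB_ERR, LB_ERR and the bare "= -1" defaults never leak into client code. The names are illustrative, not the real framework classes:

#include <cstddef>
#include <string>

class ComboBox
{
public:
    //! The 'no selection / not found' value, rather than exposing CB_ERR (-1).
    static const size_t npos = static_cast<size_t>(-1);

    //! Find an item by its text, returning npos when there is no match.
    size_t findExact(const std::string& text) const;

    //! Select an item, defaulting to clearing the selection.
    void select(size_t index = npos);
};

// Client code then compares against the class's own constant:
//   if (combo.findExact(value) == ComboBox::npos)
//       ...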

Now my codebase felt more wholesome because of the large scale refactoring of int to size_t and I felt more aligned to the C++ world. So I decided to see what happened when I ran the 64-bit cross compiler from the Platform SDK over it....

Yup, lots of errors about truncation from a 64-bit value to a 32-bit value (I always compile with /W4 /WX), i.e. conversions from a size_t to an int or DWORD! It appears that functions like GetWindowText() still traffic in ints, and many of the I/O functions that take buffer sizes still use DWORDs. So, more static_casts went in.
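
GetWindowText() is a typical case - both the buffer length parameter and the return value are ints, so the size_t based code needs explicit narrowing to keep /W4 /WX happy under the 64-bit compiler. A simplified sketch of the sort of wrapper involved (not the actual framework code):

#include <windows.h>
#include <string>
#include <vector>

// Return a window's text as a string (simplified sketch of a framework-style wrapper).
std::basic_string<TCHAR> windowText(HWND window)
{
    // The API traffics in int, the framework in size_t, hence the casts.
    size_t length = static_cast<size_t>(::GetWindowTextLength(window));
    std::vector<TCHAR> buffer(length + 1);

    int copied = ::GetWindowText(window, &buffer[0], static_cast<int>(buffer.size()));

    return std::basic_string<TCHAR>(buffer.begin(), buffer.begin() + copied);
}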

And now the feeling of wholesomeness is gone again as I contemplate the impedance mismatch between the Windows API and the C/C++ standard libraries. My framework is largely a Wrapper Facade and is therefore thin, but I don't believe it should expose the use of int or DWORD to its clients. I also use my own 'uint' type in various places where I knew an int was undesirable, but that now also requires a cast when the source is a size_t. I even started a thread on one of the ACCU channels to see if other people felt that size_t was overused and whether it would be a good idea to at least use an index_t typedef for [] operators and index-style parameters.

For now the possibly excessive use of size_t stays whilst I chew the fat. I also want to deal with the nastiness that WPARAM and LPARAM have created in the framework, due to the documentation using these types without describing what a suitable richer type would be (e.g. the character code for WM_CHAR), as I want to use richer typedefs instead of the limited set the Windows API uses (e.g. WCL::TimerID instead of UINT_PTR).
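
By richer typedefs I mean nothing more sophisticated than the following - WCL::TimerID is the one mentioned above, while the WM_CHAR typedef is purely illustrative:

#include <windows.h>

namespace WCL
{
    //! A timer identifier, as used with SetTimer() and WM_TIMER, rather than a bare UINT_PTR.
    typedef UINT_PTR TimerID;

    //! The character code carried in WM_CHAR's WPARAM (illustrative name only).
    typedef TCHAR CharCode;
}

// Message handlers can then say what they mean, e.g.
//   void onTimer(WCL::TimerID timerId);
//   void onChar(WCL::CharCode code);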