Monday, 10 September 2012

Maintaining VCS History Across Refactorings

Refactoring is seen as an essential tool in the ever-increasing battle against software entropy, and the most common variant[*] I do day-to-day is Extract Method. The other two refactorings that make more than an occasional guest appearance are Extract Interface and Extract Class. The former usually comes about when trying to mock legacy code whilst the latter is the result of applying Extract Method too many times and missing the proverbial wood because so many trees have sprung up to hide it.

The other thing I tend to spend more than my fair share of time on is Software Archaeology - digging through the Version Control System to look back at the evolution of a class or file. Just the other day I was trawling the archives trying to work out why a database table had a particular primary key in the hope that a check-in comment might help explain the decision; it didn’t this time, but so often it does.

One side-effect of the relentless use of refactoring to improve the codebase is that it creates a lot more revisions in the VCS. I’ve written in the past about the effects of fine-grained check-ins (on an integration branch) and for the most part it’s not too much of an issue. The most common refactorings tend to cause changes that remain within the same source file and so from the VCS’s perspective it’s easy to manage. Where the VCS does struggle is with those refactorings that involve extracting a definition or behaviour into a new entity, i.e. a new interface or class. This is because we tend to store our code as one interface/class per source file. Yes, nested classes can be found in the wild occasionally, but in my experience they are a rare breed.

The most common way to perform an Extract Class refactoring is to create a new source file and then cut-and-paste the code from one to the other. If you’re a modern developer your IDE will probably do the donkey work for you. The Extract Interface refactoring is very similar, but this time you copy-and-paste the method signatures to a new file. In both cases though you have created a new source file for the VCS to track and from its perspective you might as well be creating a brand spanking new entity because it can’t see that logically you have moved or copied code from one source file to another. What’s a poor developer to do?

The way I handle these two refactorings (and any other change where a new source file is logically seeded from an existing one) is to clone[#] the original file in the VCS, then paste the correct contents in so that the resulting file is the same as if starting from scratch. The difference though is that at commit time the VCS has the extra metadata it needs to allow you to walk directly back in time to the code before the refactoring. In the case of Extract Class this means the evolution of the methods prior to the refactoring is readily available without having to use check-in comments (or worse still “just knowing” where to look) to find their past lives.
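As a rough illustration of that clone-then-overwrite sequence, here is a minimal Python sketch that shells out to Subversion. The function names, the injectable `run` parameter and the file names are all hypothetical and invented for this example; the only real thing it leans on is the `svn copy` subcommand, and it assumes both paths live in the same working copy.

```python
import subprocess
from pathlib import Path


def svn_copy_command(original, new):
    """Build the `svn copy` command that clones a working-copy file along with its history."""
    return ["svn", "copy", str(original), str(new)]


def seed_from_existing(original, new, new_contents, run=subprocess.run):
    """Clone `original` to `new` in the VCS, then overwrite `new` with the extracted code.

    `run` is injectable (an assumption of this sketch) so the steps can be
    exercised without Subversion installed.
    """
    # Step 1: let the VCS record that `new` descends from `original`.
    run(svn_copy_command(original, new), check=True)
    # Step 2: replace the cloned text with what the new file should really contain,
    # so the result is identical to a file written from scratch.
    Path(new).write_text(new_contents, encoding="utf-8")
```

Calling something like `seed_from_existing("Order.java", "OrderValidator.java", extracted_source)` and then committing both files together should leave the new file with an ancestry link back to the original, so a plain `svn log` on it carries on past the copy into the old file's revisions.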

I have no idea if any IDEs or refactoring tools have support for this way of working. From the little I’ve seen (I tend to refactor manually because they never put things quite where I want them) they seem to just drive the IDE rather like a macro. Please feel free to set me straight if this isn’t the case.


[*] The name comes from Martin Fowler’s excellent book on refactoring.

[#] In Subversion this is just a COPY, which is essentially what so many operations in SVN boil down to - branching, labelling and even renaming (a copy followed by a delete).
