Tuesday, 20 September 2011

Copy & Rename (Like Copy & Swap But For File-Systems)

In C++ one of the most popular idioms for writing exception safe code is the “copy & swap” idiom. The idea is that you perform all “dangerous” work off to the side and then “commit” the changes using an exception safe mechanism[*] such as std::swap[#]. The idiom is not exactly intuitive because what you want to do most of the time is commit by assignment, but that will likely cause a copy and therefore open you up to the possibility of an exception (e.g. out of memory) which is exactly what you’re trying to avoid. In contrast the implementation of std::swap (assuming it’s been implemented efficiently for the type in question) is usually very simple and just swaps some internals members. The swapped out value is then cleaned up later.

In the world of distributed systems it’s very common to receive data in “flat files”, published by some external system that you then consume as inputs, perhaps during some batch process, before publishing your results to someone else. These files can often be quite large and take a non-trivial amount of time to write by the publishing system and so it’s important that you don’t start reading them before writing has finished[^]. If you do happen to start processing a file before it’s completely written you can end up with some real head scratching moments as you try and diagnose it later (after writing has then finished and the file is intact!).

The solution is to something akin to the copy & swap idiom, but using a file rename in place of std::swap to commit the change. The rename is just a metadata change and so torn files are avoided. It will also be safe in the face of a potential out-of-disk-space scenario as no “visible” data will be committed if the write terminates abruptly - the subscriber always gets all or nothing.

File renaming can also be used in a manner similar to the atomic CompareExchange function. When writing a service that caches data (in the file-system) and you get two simultaneous requests for the same data you have the potential for a race condition as both could try and update the cache together. What you can do is to let both continue as normal but write their data to a temporary file or folder. Then you “commit” by renaming this temporary file/folder to its final name at the end. The winner gets to be oblivious whereas the loser gets an error on renaming to signal it lost and has to cleanup before returning results; possibly out of the winners cache. If at any point an error occurs the temporary cache can be cleaned without affecting other requests.


[*] Yes it’s possible that std::swap could generate an Access Violation (or the equivalent thereof) and still cause corruption but that is no doubt considered to be covered under the realms of Undefined Behaviour.

[#] The new C++ standard introduces the concept of r-value references and std::move() which should make for a more intuitive solution in future.

[^] Assuming the file share mode allows it and you have a format that supports it you can do this but you’ll be adding a lot of complexity that you probably don’t need or can avoid through other means.

No comments:

Post a Comment