Legacy Code: Treat it with Silk Gloves

February 12th, 2012

Legacy code is tough to deal with. It may be the toughest thing we have to deal with in our everyday jobs.

Before we start, let’s align our definition of what legacy code is. Legacy code is untested code. Legacy code may be old, but not necessarily. Unfortunately some projects are actively producing legacy code.

Most programmers have to work with legacy code frequently. Even worse, they often must make changes in the legacy code. They must fix bugs in the legacy code, or change it to implement new features. At the end of last October I started working on a project that at the time, fully consisted of legacy code. I won’t share any specific details about that client and their projects, but I like to share how we work with the legacy code.

First of all, as a general rule, we try hard not to change the legacy code. If a feature can be implemented without changing the legacy code, implement it with completely new code. And of course, write it using TDD.

Only if this first rule does not apply - i.e. when we must change the legacy code - we continue to think about making some changes. And before making any changes we bring the legacy code under test - or at least the parts where the changes will be. This is the hardest part. I won’t go into any details about how to bring it under test here, but I strongly advise reading Michael Feathers’ excellent book Working Effectively with Legacy Code. In his book Michael presents a large set of strategies that will help you to get your legacy system under test.

Testing the legacy code mostly is time consuming. Still it must be done, there’s no alternative. We can’t fiddle in some changes and only make things worse, we’re professionals. And besides, it wouldn’t help at all. I’ve seen lots of changes fiddled into untested code, almost all with the same result: the introduction of some or even many defects. But like I said, we’re professionals, we don’t fiddle around in untested code.

When the legacy code is under test and we have a sufficient amount of coverage, we can refactor. Because the code is under test we are confident that when we break something, the tests tell us so we can undo our last change. We are no longer afraid of changing the code.

After the code is refactored, the changes can be implemented using TDD. Focusing on not changing the legacy code, doesn’t mean that it’s better to duplicate code or features. It means that legacy code should not be changed if you can avoid it. If you must reuse some feature already implemented in legacy code, you bring it under test and refactor to reusable components.

Never ever try to fix unreported issues in legacy code, even if the change seems so trivial. Sometimes code can be disguised like a bug and even smell like one, but in the end it may turn out to be a feature of the legacy code.

A good example of such a feature in disguise is one I heard in a chat with a colleague last week. He told me that on his project they face a lot of nasty legacy code, and how one of his coworkers changed a string comparison by reference (==) to compare the strings using equals. The change made it to production and, as you may expect by this point, ended up to cause a production defect. The comparison by reference, no matter how much of a code smell, was a feature in disguise and shows how dangerous it is to change any untested code based on assumptions.

To fix the production defect, they reverted the change and put a comment in the code to tell others not to change the comparison. By doing this, they left the code in worse shape then they found it. They added a comment to express their fear of changing it.

Instead, by the time the defect was reported, they should have started writing tests. Clearly they were changing code they didn’t understand. Whenever you must change code you don’t understand, you must refactor first. No matter how trivial the change! And to refactor legacy code, you must get it under test first.

This is how I would like everybody to treat legacy code: with extreme care. I hope this post will be of help to you, and remember: never be the one that creates the legacy code! There is no reason not to test your code, ever.