The Silent Killer in Your Code
There is a silent killer lurking in every code base. A killer that destroys teams and companies alike. You can stop this killer by applying technical practices and refactoring, but only if you know what to look for.
This killer has no name. I will call it “legacy code.”
What does “legacy code” mean? Loosely, it’s any code that is not the latest thing you created. More specifically, legacy code is software where changing one thing has unexpected side effects. The tests pass, yet a change to one part of the system breaks another part that was working just fine before. This is legacy code.
Legacy code means that you are afraid to change your code because you don’t know what will happen when you do. This fear has a negative impact on your ability to deliver new features, fix bugs and maintain your software product over time.
I’ve been thinking about legacy code a lot lately. I read an article by Michael Feathers that said legacy code is code without tests. I’ve also been doing a lot of refactoring lately, working to improve the design of existing systems. A lot of the systems I work on have no tests at all, and don’t even have the concept of a unit test.
So when I hear people talk about legacy code, I think they are talking about me and what I do every day.
And this is scary. Because if legacy code is just code without tests, then I’m working in a minefield every day. If we take “legacy” to mean “code without tests”, then unless you’re writing everything from scratch, you’re working with legacy code.
I’ve worked on a few codebases that were so fragile that one wrong change could cause hours or days of work fixing the inevitable bugs. It was frustrating and demoralizing.
I especially hated it when I’d make a change, the tests would pass, and then something else would break later. I dreaded fixing bugs because I never knew how many other bugs my fix would create.
That’s what legacy code is like. You never know when you’re done.
The best way to get acquainted with legacy code is to take over an old project from another team. The code will be a mess and you’ll probably hate it at first, but you’ll learn a lot from the experience. If you ever get to start your own project from scratch, you’ll know how important it is to keep it clean from day one.
Before we begin, here’s a story.
A few days ago, I was doing some work on an internal project. I needed to add some functionality and make sure it worked with the existing code base. After working on it for a while, I came to the conclusion that what I was working on was impossible.
I was a little surprised. The idea was simple enough and in principle should have been fairly straightforward to implement in software. So why wasn’t it working?
After some head scratching, I realized that there were two problems:
1) The original code base was terribly designed, and it was very hard to figure out how its parts interacted with each other.
2) There wasn’t any documentation available on the project, so I had to infer what the system did from reading the code itself (which wasn’t easy).
This is what is known as legacy code. Legacy code is code written by someone else that you don’t understand or don’t want to change. Legacy code can be found everywhere. If you’ve ever inherited a project or tried to make changes on an old system without breaking anything else, then you’ve probably dealt with legacy code at some point in your career.
Legacy code is one of those things that most developers hate but often have no choice but to work with.
Legacy code is a killer. It slows you down, limits your agility and prevents you from taking advantage of new technologies.
If only you had the time to rewrite it all. But you don’t. Like many programmers, you have deadlines to meet, new features to add, and time is short. You need a quick and effective way to handle legacy code; something that will get the job done with minimal effort. And that’s where this book comes in.
Michael Feathers defines legacy code as “code without tests.” Without tests, he says, it’s hard to change the behavior of a program reliably. This book shows you how writing tests can help you understand legacy code structures and how to change them safely.
Written for programmers with a knowledge of object-oriented development, this book includes practical advice for dealing with legacy codebases using techniques such as breaking dependencies, introducing layers and writing isolation tests. You’ll learn how to:
Apply a series of small behavior-preserving transformations
Safely change legacy code without breaking any tests
Write tests around legacy code bases without changing them
Apply these techniques in Java or C
Legacy code is a problem because it’s usually so tied up with business rules that changing it is hard. But there’s a solution: don’t change it. Instead, write new code that calls the old. For example, if your business logic is in procedures stored in the database, use a repository pattern to create wrappers around those procedures. That way, you can add new code that uses the wrappers instead of the legacy code itself.
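Here is a minimal sketch of that wrapper idea, written in Python. Everything named in it is an assumption made for illustration: the procedure name sp_get_invoices, the Invoice fields, and the injected call_procedure function that stands in for whatever your database driver uses to invoke a stored procedure. The point is only the shape: new code talks to the repository, and the repository is the one place that knows the procedure exists.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical signature: takes a procedure name and its arguments and
# returns the result rows. It is injected so the wrapper can be exercised
# without a real database.
ProcedureCaller = Callable[[str, Sequence[object]], list[tuple]]


@dataclass(frozen=True)
class Invoice:
    invoice_id: int
    customer_id: int
    total_cents: int


class InvoiceRepository:
    """Wrapper around legacy stored procedures (all names are illustrative)."""

    def __init__(self, call_procedure: ProcedureCaller) -> None:
        self._call = call_procedure

    def invoices_for_customer(self, customer_id: int) -> list[Invoice]:
        # The only place that knows the legacy procedure's name and row layout.
        rows = self._call("sp_get_invoices", [customer_id])
        return [Invoice(invoice_id=r[0], customer_id=r[1], total_cents=r[2]) for r in rows]

    def outstanding_total(self, customer_id: int) -> int:
        # New behavior is built on top of the wrapper; the procedure is untouched.
        return sum(inv.total_cents for inv in self.invoices_for_customer(customer_id))


def fake_call_procedure(name: str, args: Sequence[object]) -> list[tuple]:
    """Stand-in for the database, used only for this demonstration."""
    if name == "sp_get_invoices":
        data = [(1, 42, 1500), (2, 42, 2500), (3, 7, 900)]
        return [row for row in data if row[1] == args[0]]
    raise ValueError(f"unknown procedure: {name}")


repo = InvoiceRepository(fake_call_procedure)
print(repo.outstanding_total(42))  # prints 4000
```

Because the caller is injected, the same repository can be pointed at a fake in tests and at the real database in production, which keeps the legacy procedure itself out of the new code's way.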
If you’re looking at some legacy code right now and thinking “I can’t do that,” don’t worry. Wrapping legacy code isn’t as difficult as it sounds. Here are some guidelines for dealing with legacy code:
1. Don’t change the legacy code itself. Ever. Treat it as a black box. Instead, create an interface that acts as a bridge from your new system to the old one.
2. Start by writing tests against the legacy system so that you have something to guide your refactoring efforts. Make sure they’re fast and reliable; if they’re not, most developers won’t run them every time they make a change to their own code.
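To make the second guideline concrete, here is a small sketch of tests written against existing behavior. Tests of this kind are often called characterization tests. The function legacy_shipping_cost is a hypothetical stand-in for a legacy routine, invented for this example; in a real project you would import the existing function and leave it untouched. Each test records what the code does today, including behavior nobody can explain.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    """Hypothetical stand-in for an untested legacy routine; treat it as a black box."""
    cost = 5 + weight_kg * 2
    if express:
        cost = cost * 1.5
    if weight_kg > 20:
        cost = cost - 3  # undocumented discount; nobody remembers why
    return cost


class ShippingCostCharacterization(unittest.TestCase):
    """Pins down what the code does today, not what it ought to do."""

    def test_standard_parcel(self):
        self.assertEqual(legacy_shipping_cost(10, False), 25)

    def test_express_parcel(self):
        self.assertEqual(legacy_shipping_cost(10, True), 37.5)

    def test_heavy_parcel_gets_undocumented_discount(self):
        # Surprising, but it is the current behavior, so the test records it.
        self.assertEqual(legacy_shipping_cost(30, False), 62)
```

Saved to a file, these can be run with python -m unittest. They touch no database or network, so they finish in milliseconds, which is what makes it realistic to run them on every change.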
Let’s talk about legacy code.
Consider this quote from a book on software testing:
“The term legacy code usually refers to source code that predates the current project.”
I read this quote and I think, “What a ridiculous definition.” What is the alternative? That you can only call it legacy if it’s older than you? If you’re working in a shop that has not had a major rewrite of its code base for thirty years, then your whole system is legacy. By that definition, the majority of software in the world is legacy.
And the problem is not just semantic nitpicking. The idea that “legacy” means “old” leads to some really bad ideas about how to deal with old code. Consider these quotes:
“You might be surprised at the problems that can be solved simply by upgrading your libraries or language.” (emphasis mine)
Or the following quote:
“A good rule of thumb is to avoid using any functionality that was introduced more than two versions ago.”