Wed 11 Jul 2007
Unit Testing by the Status Quo
Posted by Dennis B under Productivity, Programming
Have you ever been faced with code that has no tests, and every time you fix one thing, you somehow manage to break something else?
If not, then dang. You’re done reading. Go home, have a cold one.
Otherwise, you are in good company, so don’t feel so bad. Come down off the ledge.
What you really want is to be able to fix bugs in one part of the code and keep the rest of the code functioning in the same working or non-working state it was in before. But the code is woven like spaghetti, and you swear under your breath every day that the guy who wrote it should have his fingers chopped off. This mantra makes you feel better, but unfortunately it doesn’t fix any bugs.
Unit Testing
It’s very academic. If the code had tests, then you could fix things, run the tests, and know if things break. So, biting the bullet and knowing you are volunteering to spend your foreseeable future writing incredibly boring tests, you go to your manager and say, “Hey, Bob, I’m gonna take six months and write thorough unit tests for this library, then we won’t have these horrid problems anymore.” Of course, Bob is not excited about losing his star developer for six months and tells you no, you have more important things to spend your time on, like fixing the bugs you create by fixing bugs.
Next you ask your boss, maybe we could hire an intern to write unit tests? Your boss considers the amount of damage that could do to a young, idealistic programmer’s mind, then tells you no, can’t do that. In fact, it should be illegal; it’s cruel, like chopping off people’s fingers. You start to wonder whose side he is on anyway…
Why is it so hard?
Why are unit tests so hard to write and also so boring to write? That’s simple. The traditional approach to unit testing tries way too hard, because the traditional approach is focused on a proof of correctness. You start writing tests for APIs and then putting in a bunch of assertions with respect to the results, and the assertions are intended to capture 100% correctness.
There are two major problems with this approach.
- It is very tedious.
- You know that the code isn’t 100% correct to start with, otherwise, you wouldn’t be in there fixing bugs.
The result is that unit testing indirectly becomes a bug-hunting exercise, because you write tests, they fail, and you are left to figure out if your test is wrong or if the code being tested is wrong. You don’t want to go home until you have unit tests that cover a large part of the library, and pass. In the process, you wind up fixing several bugs in the code you are testing, which is really not what you wanted to do until you had complete tests.
If you have good will-power and a unit testing harness that supports it, you can avoid the bug fixing trap by simply ignoring test failures until the entire test suite is complete. Most people don’t have this kind of patience, because by the time you debug your unit test, and know it’s really the code being tested that has a bug, you probably also know how to fix it, and it’s hard to resist.
Status Quo
From Webster’s dictionary: Latin, state in which. The existing state of affairs. Also: status in quo.
With the exception of the bug you are fixing, the status quo is precisely what you want for the rest of the code’s behavior.
So, without tests, how can you know that unravelling one piece of spaghetti code will not cause a metaphorical meatball to roll off the table? Bad news time. You can’t. You need tests and you need to be able to put them together fast.
Unit Testing by Status Quo
A solution to this problem is to design a way to generate unit tests that feed input to the existing code and capture the results. The key here is that you are capturing results, whether they are right or wrong. What matters is the status quo. Forget about proof of correctness. We’re looking for proof of same-ness.
Put the inputs and expected results into a harness so that you can re-run the tests and compare the new results with the expected results and know when something changes.
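To make that concrete, here is a minimal sketch of such a harness in C. Everything in it is made up for illustration: legacy_describe stands in for whatever old code you are stuck with, and the status_quo_case table holds outputs that were recorded from that code earlier, not outputs anyone has declared correct.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical legacy function under test. Its output for negative
   counts ("-5 items") may well be a bug; the harness records it anyway. */
void legacy_describe(int count, char *out, size_t out_size)
{
    if (count == 1)
        snprintf(out, out_size, "1 item");
    else
        snprintf(out, out_size, "%d items", count);
}

/* One recorded case: an input plus the output the code produced
   when the case was captured, right or wrong. */
struct status_quo_case {
    int input;
    const char *expected;
};

/* Re-run every case and report the ones whose output changed.
   Returns the number of changed cases; 0 means the status quo held. */
int run_status_quo_cases(const struct status_quo_case *cases, size_t n)
{
    char actual[64];
    int changed = 0;

    assert(cases != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        legacy_describe(cases[i].input, actual, sizeof actual);
        if (strcmp(actual, cases[i].expected) != 0) {
            printf("CHANGED input=%d expected=\"%s\" actual=\"%s\"\n",
                   cases[i].input, cases[i].expected, actual);
            changed++;
        }
    }
    return changed;
}
```

Notice that the table would happily contain an entry like {-5, "-5 items"}. Nobody is claiming that is the right answer. It is simply what the code does today, and the harness will tell you the moment that stops being true.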
Once you have this harness in place you will become a scavenger for raw input, more test cases, to feed the harness. The more the better. The harness needs food. That will become your new mantra. That other guy can keep his fingers.
Using this technique, you can achieve a high percentage of code coverage with a minimal amount of effort. Also, since the code now has effective unit tests, you can now pass maintenance of the problematic library to another developer and know that they have some guard rails when they go in to fix bugs.
Managing change
With the status quo testing harness in place, when you make changes to the code, one of three things will happen:
- All the tests will continue to pass. This is good if the change was a refactoring or a performance improvement. It is bad if the change was meant to fix a bug, because it means your fix didn’t change anything the tests can see.
- There are tests that fail, and you expected them to fail because your fix was going to change the code’s behavior. In this case, you review the changes and regenerate the expected results.
- There are tests that fail, and they should not have failed. Then, you know that you broke something, and you need to work more on your fix and try again.
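The three outcomes above boil down to a small decision table. Here is one way it might look in C, assuming the harness can tell you, per test case, whether the output changed, and assuming you mark ahead of time which cases you expect your fix to affect. The names (case_result, triage, the VERDICT_ values) are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

enum verdict {
    VERDICT_NO_CHANGE,        /* every case still matches the recorded output */
    VERDICT_REVIEW_AND_REGEN, /* only anticipated cases changed */
    VERDICT_REGRESSION        /* at least one unanticipated case changed */
};

struct case_result {
    int changed;            /* 1 if actual output differs from the recorded output */
    int change_anticipated; /* 1 if you expected your fix to affect this case */
};

/* Classify a whole test run into one of the three outcomes. */
enum verdict triage(const struct case_result *results, size_t n)
{
    int any_changed = 0;

    assert(results != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!results[i].changed)
            continue;
        if (!results[i].change_anticipated)
            return VERDICT_REGRESSION; /* you broke something */
        any_changed = 1;
    }
    return any_changed ? VERDICT_REVIEW_AND_REGEN : VERDICT_NO_CHANGE;
}
```

A VERDICT_NO_CHANGE result still needs a human to decide whether it is good news (refactoring) or bad news (a fix that fixed nothing). The code can only tell you that nothing moved.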
Case Study
In the span of two weeks, I was able to assemble a huge battery of tests for one SlickEdit library that had a total of around 50,000 lines of code.
In this instance, the code being tested just took text as input and called functions in another library. In order to isolate the test code, and capture the results, I stubbed out the functions being called and replaced them with functions that just printed out their name and arguments. For example:
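    #include <stdio.h>

    /* Stub: prints the call instead of doing the real work. */
    void doSomething(int x, int y)
    {
        printf("doSomething(%d, %d)\n", x, y);
    }

    /* Stub: prints the call instead of doing the real work. */
    void saySomething(const char *s)
    {
        printf("saySomething(\"%s\")\n", s);
    }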
The results looked like a big function trace:
    doSomething(1, 2)
    saySomething("Hello")
    doSomething(3, 4)
    doSomething(5, 6)
    saySomething("Goodbye")
Comparing the expected results to the actual results using a simple line-by-line differencing algorithm allowed me to zero in on test failures and quickly review whether or not they improved the status quo. Building a regenerate feature into the test harness allowed me to quickly update the test suite after reviewing everything.
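For the curious, a line-by-line comparison of two traces can be very small. The sketch below is not the code from the SlickEdit harness. It is a naive positional comparison written for this post (line 1 against line 1, line 2 against line 2), and diff_traces and line_length are hypothetical names. A real diff would resynchronize after an inserted or deleted line; this one will not.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Length of the line starting at s, not counting the newline. */
static size_t line_length(const char *s)
{
    return strcspn(s, "\n");
}

/* Compare two traces line by line, by position. Prints every
   mismatching pair and returns the number of differing lines.
   A missing line on one side counts as a difference. */
int diff_traces(const char *expected, const char *actual)
{
    int line = 0;
    int differences = 0;

    assert(expected != NULL && actual != NULL);
    while (*expected != '\0' || *actual != '\0') {
        size_t elen = line_length(expected);
        size_t alen = line_length(actual);

        line++;
        if (elen != alen || strncmp(expected, actual, elen) != 0) {
            printf("line %d:\n  expected: %.*s\n  actual:   %.*s\n",
                   line, (int)elen, expected, (int)alen, actual);
            differences++;
        }

        /* Step past the line and its newline, if there is one. */
        expected += elen;
        if (*expected == '\n')
            expected++;
        actual += alen;
        if (*actual == '\n')
            actual++;
    }
    return differences;
}
```

Feed it the recorded trace and the trace from the current build. A return value of zero means the status quo held, and anything else gives you a short list of lines to review before you decide whether to regenerate.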
Before these unit tests, due to the code’s complexity, nearly every fix was a roller-coaster ride. Since the unit tests, we have had zero incidents of fixes introducing new bugs. It has also given us much more freedom to undertake refactoring and cleaning up the code, since we have confidence that even sweeping changes will be unlikely to cause major breakage.
Conclusion
Unit testing by status quo is a simple method for ensuring continued code quality with a minimal investment in test development. It does not focus on 100% proof of correctness, but instead focuses on allowing you to make continuous improvements to a code base to push it uphill towards 100% correctness without allowing for sudden backslides and unexpected breakage.