Test-Resistant Code #3

*This is the third part in a mini-series on code that is hard to test (and therefore hard to develop using TDD). *Part 1: Test-Resistant Code and the battle for TDD, Part 2: Test-Resistant Code – Markup is Code Too

You may have seen Uncle Bob solve the prime factors kata using TDD. It is a great demonstration of the power of TDD to allow you to think through an algorithm in small steps. In a recent talk at NDC2011 entitled, “The Transformation Priority Premise” he gave some guidelines to help prevent creating bad algorithms. In other words, following the right rules you can use TDD to make quick sort, instead of bubble sort (I tried this briefly and failed miserably). It seems likely that all sorts of algorithms can be teased out with TDD (although I did read of someone trying to create a Sudoku solver this way and finding it got them nowhere).

However suitable or not TDD is for tackling those three problems (prime factors, list sorting, sudoku), they share one common attribute: they are easily verifiable algorithms. I know the expected outputs for all my test cases.

Not all algorithms are like that – we don’t always know what the correct output is. And as clever as the transformation priority premise might be for iterating towards optimal algorithms, I am not convinced that it is an appropriate attack vector for many of the algorithms we typically require in real-world projects. Let me explain with a few simple examples of algorithms I have needed for various projects.

Example 1 – Shuffle

My first example is a really basic one. I needed to shuffle a list. Can this be done with TDD? The first two cases are easy enough – we know what the result should be if we shuffle an empty list, or shuffle a list containing one item. But when I shuffle a list with two items, what should happen? Maybe I could have a unit test that kept going until it got both possibilities. I could even shuffle my list of two items 100 times and check that the number of times I got each combination was within a couple of standard deviations of the expected answer of 50. Then when I pass my list of three items in, I can do the same again, checking that all 6 possible results can occur, followed by more tests checking that the distribution of shuffles is evenly spread.

So it is TDDable then. But really, is that a sensible way to approach a task like this? The statistical analysis required to prove the algorithm requires more development effort than writing the algorithm in the first place.

But there is a deeper problem. Most sensible programmers know that the best approach to this type of problem is to use someone else’s solution. There’s a great answer on stackoverflow that I used. But that violates the principles of TDD: instead of writing small bits of code to pass simple tests, I jumped straight to the solution. Now we have a chunk of code that doesn’t have unit tests. Should we retro-fit them? We’ll come back to that.

Example 2 – Low Pass Filter

Here’s another real-world example. I needed a low pass filter that could cut all audio frequencies above 4kHz out of an audio file sampled at 16kHz. The ideal low pass filter has unity gain for all frequencies below the cut-off frequency, and completely removes all frequencies above. The only trouble is, such a filter is impossible to create – you have to compromise.

A bit of research revealed that I probably wanted a Chebyshev filter with a low pass-band ripple and a fairly steep roll-off of 18dB per octave or more. Now, can TDD help me out making this thing? Not really. First of all, it’s even harder than the shuffle algorithm to verify in code. I’d need to make various signal generators such as sine wave and white noise generators, and then create Fast Fourier Transforms to measure the frequencies of audio that had been passed through the filter. Again, far more work than actually creating the filter.

But again, what is the sane approach to a problem like this? It is to port someone else’s solution (in my case I found some C code I could use) and then perform manual testing to prove the correctness of the code.

Example 3 – Fixture Scheduler

My third example is of an algorithm I have attempted a few times. It is to generate fixture lists. In many sports leagues, each team must play each other team twice a season – once home, once away. Ideally each team should normally alternate between playing at home and then away. Two home or away games in a row is OK from time to time, three is bad, and four in a row is not acceptable. In addition it is best if the two games between the same teams are separated by at least a month or two. Finally, there may be some additional constraints. Arsenal and Tottenham cannot both play at home in the same week. Same for Liverpool and Everton, and for Man Utd and Man City.

What you end up with is a problem that has multiple valid solutions, but not all solutions are equally good. For example, it would not go down too well if one team’s fixtures went HAHAHAHA while another’s went HHAAHHAAHHAA.

So we could (and should) certainly write unit tests that check that the generated fixtures meet the requirements without breaking any of the constraints. But fine tuning this algorithm is likely to at least be a partially manual process, as we tweak various aspects of it and then compare results to see if we have improved things or not. In fact, I can imagine the final solution to this containing some randomness, and you might run the fixture generator a few times, assigning a “score” to each solution and pick the best.

Should you use TDD for algorithms?

When you can borrow a ready-made and reliable algorithm, then retro-fitting a comprehensive unit test suite to it may actually be a waste of time better spent elsewhere.

Having said that, if you can write unit tests for an algorithm you need to develop yourself, then do so. One of TDD’s key benefits is simply that of helping you come up with a good public API, so even if you can’t work out how to thoroughly validate the output, it can still help you design the interface better. Of course, a test without an assert isn’t usually much of a test at all, it’s more like a code sample. But at the very least running it assures that it doesn’t crash, and there is usually something you can check about its output. For example, the filtered audio should be the same duration as the input, and not be entirely silent.

So we could end up with tests for our algorithm that don’t fully prove that our code is correct. It means that the refactoring safety net that a thorough unit test suite offers isn’t there for us. But certain classes of algorithms (such as my shuffle and low pass filter examples) are inherently unlikely to change. Once they work, we are done. Business rules and customer specific requirements on the other hand, can change repeatedly through the life of a system. Focus on testing those thoroughly and don’t worry too much if some of your algorithms cannot be verified automatically.