I have recently started a major stream of work centered on a
particular application in the LMAX stack. This application has had
plenty of features added to it over the last few years, but nothing
has really required an overhaul.
Our work, however, is somewhat more involved; even finishing the
simplest of our requirements has been taking a week or so – that’s a
long time, for us.
Hitting the buffers
Our method, to begin with, looked something like the following:
- Write acceptance tests for feature (we tend to batch these up – it helps us explore the story)
- Write integration tests for our application, supporting the feature (these usually resemble the ATs)
- Spike implementation within the application
- Use knowledge gained from spike to drive refactoring
- Repeat the last two steps until the ATs and ITs pass
We’re very much in the Kent Beck school of development here:
First refactor the program to make it easy to add the feature, then add the feature
Our problem was that refactoring the program was hard! We discovered
that while making the ITs and ATs pass was easy, getting the unit
tests to compile and pass was much harder.
This was frustrating; not least because the unit tests were of the
overspecified mock-interaction sort. If we moved even the smallest
piece of validation, anywhere from one to a hundred unit tests would fail.
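For a flavour of that style, here is a hypothetical sketch (invented names, not our actual code, and in Python's unittest.mock rather than our Java stack). The test pins down every interaction the code makes, so relocating the validation call breaks the test even though behaviour is unchanged:

```python
from unittest.mock import Mock

def place_order(validator, repository, publisher, order):
    # The exact call sequence below is what the test locks in.
    validator.validate(order)
    repository.save(order)
    publisher.publish("ORDER_PLACED", order)

def test_place_order_interactions():
    validator, repository, publisher = Mock(), Mock(), Mock()
    order = {"id": 42}
    place_order(validator, repository, publisher, order)
    # Overspecified: this asserts *how* the work was done, not *what*
    # the observable outcome was. Move validate() into a collaborator
    # and this test fails despite the output being identical.
    validator.validate.assert_called_once_with(order)
    repository.save.assert_called_once_with(order)
    publisher.publish.assert_called_once_with("ORDER_PLACED", order)

test_place_order_interactions()
```

Every internal rearrangement shows up as a red test, which is exactly the trap we were in.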
Symptom, not cause
We blamed the tests – they were stupid tests, we said; why had anyone
bothered to write them? So, we tried to rewrite a couple of unit tests
in a leaner style – creating just what we needed to test the new behaviour.
This felt a lot better right up until we finished, when we looked
from our new tests to the old tests, and from the old tests to the new
tests; but already it was impossible to say which was which.
These tests were a symptom that the code underneath was
jumbled. Someone had attempted to break up large, core domain objects
into separate responsibilities by pulling behaviour up into
‘processor’ objects, which had made things smaller but also broke
encapsulation. More on this another day.
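To illustrate the shape of the problem (a minimal sketch with invented names, nothing like our real domain): pulling behaviour out of a domain object into a 'processor' only works if the object leaks the state the behaviour needs, so the invariant is enforced at a distance from the data it protects:

```python
class Account:
    def __init__(self, balance):
        self._balance = balance

    # Encapsulated: the rule lives with the data it guards.
    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount

    # The 'processor' refactoring instead demands accessors like
    # this, exposing internals so an outside object can own the rule.
    def balance(self):
        return self._balance

class WithdrawalProcessor:
    # Smaller class, but the overdraft check is now far from the
    # state it protects, and nothing stops callers bypassing it.
    def process(self, account, amount):
        if amount > account.balance():
            raise ValueError("insufficient funds")
        account._balance -= amount  # reaching into internals
```

Each processor is individually small, which is why the refactoring looked like progress; the cost is that the domain object no longer defends its own invariants.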
This was novel – here was a case where the wrong refactoring had
painted us into a corner. The problematic tests this ill-judged
refactoring wrought stymied all attempts to escape to a better design.
Declaring unit test bankruptcy
We decided to remove these unit tests. They were creating a catch-22
situation: we couldn’t refactor the code without breaking the tests,
and we couldn’t make the tests better without fixing the code.
We ended up working like this:
- Write acceptance tests for feature
- Write one integration test for the application (a deliberately smaller step)
- Spike implementation within the application
- Run unit tests with spike code to detect pain
- Rewrite those unit tests as integration tests
- Delete the painful unit tests
- Revert the spike, and use knowledge gained from spike to drive refactoring
- Make new integration tests pass with well factored code
- Continue until all the ATs pass
This allows us to make swingeing refactorings safely, speeding our
journey towards a place where we may one day be able to TDD all the way down.
We were lucky to have:
- A mature integration test framework.
This made it easy to write new tests that assert only on the I/O events of our particular application.
- A single threaded application (so integration tests are almost as quick as unit tests, and they don’t suffer from races)
- Extensive AT coverage over the system as a whole.
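A sketch of that I/O-event style (all names invented; our real framework drives a Java application): feed input events in, assert only on the events that come out, and say nothing about the classes inside:

```python
class App:
    # Stand-in for the real application: a single-threaded box that
    # consumes input events and emits output events.
    def __init__(self):
        self.out = []

    def on_event(self, event):
        # Internals are free to be refactored at will.
        if event["type"] == "PLACE_ORDER":
            self.out.append({"type": "ORDER_ACCEPTED", "id": event["id"]})

def test_order_is_accepted():
    app = App()
    app.on_event({"type": "PLACE_ORDER", "id": 7})
    # Only the emitted events are pinned down, so any internal
    # restructuring that preserves behaviour keeps this green.
    assert app.out == [{"type": "ORDER_ACCEPTED", "id": 7}]

test_order_is_accepted()
```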
Beware though, for these are double-edged swords. Perhaps it is
because our framework makes ATs and ITs so easy to create that we
neglected the factoring of the code within.
It seems we have been guilty of declaring stories done when the
acceptance tests all pass. If only life were that simple!
In TDD, at the unit level, the method is as follows:
- (write a new test) Red
- (make the test pass) Green
- (clean up the design) Refactor
Here, ‘refactor’ is usually removal of duplication, and separation of
responsibilities into separate classes.
We need to execute the refactor step ‘all the way up’.
The refactor step for ITDD and ATDD
I wrote a sort of checklist of the things I think about here, but the items were:
- too specific
- probably wrong
Instead, I suggest that all one needs to do at this point is to
stop and think. More specific advice is left as an exercise for the
reader. (Hint: Think of the principles you apply at the unit level –
can you scale them up to the level of systems and applications?)
- Listen to your tests!
We could have avoided this whole affair if we had listened to the tests at the time of writing.
- Make sure your definition of done includes the ‘refactor’ step.