There is a point during the implementation of acceptance testing where it already delivers some value, but there is still a lot of friction with the tools. At that point, many teams start questioning whether the whole thing is paying off. Here is a simple way to prove that to yourself.

It's hard to see things that are not there

Questioning the whole process happens because agile acceptance testing helps teams remove several causes of waste – the waste of translation, the waste of writing big documents that nobody reads, the waste of bugfixing and the waste of rework caused by misunderstanding – and things that are no longer there are hard to see. A visible problem bothers people, so it will be discussed during a retrospective and the team will work on removing it. Appreciating things that are not there, on the other hand, is a bit harder. Teams won't necessarily see that bugs are no longer coming back into the iteration, but they will certainly notice the time spent writing and maintaining the integration layer for the automation.

I was also surprised that most teams don't use any kind of metrics while they are trying to improve. The topic of metrics in software development is much larger than one blog post can cover, and it usually ends with a discussion of the danger of metrics becoming the deliverable and of local optimisations that favour efficiency over effectiveness. But some metrics are incredibly useful when trying to change a process. They show whether the change has succeeded or not. They can also show you things that are no longer there. So here's a suggestion for anyone out there starting to implement agile acceptance testing: track your boomerangs.

A boomerang is anything that comes back into the process: a story or a product backlog item that the team thought was done, but that actually needed more work. Boomerangs don't include things caused by genuine marketing changes after a feature has been deployed, but as a rule of thumb I consider anything that causes a story to come back less than a month after it was released a boomerang. This might be because testing spilled over into the next iteration, development cleanup was required, there was a genuine misunderstanding of what the client wanted, or bugs were raised once the feature went live.
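What you record for each boomerang is up to you; the point is only that it is cheap to capture. As a minimal sketch, assuming you note each returning story with a few (hypothetical) fields, the one-month rule of thumb looks something like this in Python:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Boomerang:
    """A story that came back after the team thought it was done."""
    story_id: str
    released_on: date    # when the team considered the story done
    came_back_on: date   # when it came back for more work
    reason: str          # e.g. "testing spill-over", "bug raised in production"
    source: str          # who raised it, e.g. a department or another team


def counts_as_boomerang(released_on: date, came_back_on: date,
                        cutoff_days: int = 30) -> bool:
    """Rule of thumb: work that comes back within about a month counts.

    Genuine requirement changes after deployment are excluded by judgement,
    not by this check.
    """
    return (came_back_on - released_on).days <= cutoff_days
```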

Track what you want to change

Boomerangs are a simple but incredibly useful measure. As you implement agile acceptance testing, the number of boomerangs should go down significantly, to the point where it is fairly rare for them to occur at all. (At that point, just stop tracking them.) Looking at your boomerang trend over the last several months will show you how much you have actually improved. If the rate does not drop, something is wrong with the way you implemented the process – so this measurement will tell you whether your team is following a cargo cult or not. If the number of boomerangs goes down, then you have an objective argument: "if we stopped using acceptance testing, this is what we would lose".

It is also useful to understand where the boomerangs are coming from. One of my clients had boomerangs coming mostly from one department in the company. This pointed to a communication problem with that particular department and prompted the team to look for better ways to engage those people. (A sketch of both kinds of counting follows below.)
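Both the trend per iteration and the breakdown by source fall out of a very small amount of code. A sketch, with an entirely made-up log in the shape of (iteration, source, reason) tuples:

```python
from collections import Counter

# Hypothetical boomerang log gathered over a few iterations.
boomerangs = [
    ("2024-03", "marketing", "misunderstood requirement"),
    ("2024-03", "operations", "bug raised after release"),
    ("2024-04", "marketing", "testing spilled over"),
    ("2024-05", "marketing", "bug raised after release"),
]

# How the rate changes over time – this is the trend to watch.
per_iteration = Counter(iteration for iteration, _, _ in boomerangs)

# Where they come from – this is what pointed my client at one department.
per_source = Counter(source for _, source, _ in boomerangs)

print("Boomerangs per iteration:", dict(per_iteration))
print("Boomerangs per source:  ", dict(per_source))
```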

Tracking boomerangs doesn't take a lot of time – usually a few minutes every iteration – and it can help a lot when the time comes to question the process or prove that it is working. In larger companies, it can also provide compelling evidence that the process is worth adopting with other teams. For more detailed statistics, you can also track the time spent on boomerangs, because this figure translates directly into wasted development/testing time and money. If you have the data to show that more time was spent on bugfixes and boomerangs before than is now spent on maintaining fixtures, there is a clear business case for continuing with acceptance testing.
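Turning those hours into a business case is simple arithmetic. A sketch with entirely made-up monthly figures and a hypothetical blended hourly rate:

```python
# Made-up figures: hours per month, before and after adopting acceptance testing.
hours_on_boomerangs_before = 60   # bugfixes and rework, historically
hours_on_fixtures_now = 25        # maintaining the automation layer now
blended_hourly_rate = 70          # hypothetical cost of a development/testing hour

saved_hours = hours_on_boomerangs_before - hours_on_fixtures_now
print(f"Roughly {saved_hours} hours, or about "
      f"{saved_hours * blended_hourly_rate} per month, no longer wasted.")
```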

In general, although metrics can be dangerous as an external measure, I find them useful for tracking the thing I want to change. Once people start noticing the tool friction, I'd start tracking the time spent on fixtures, so that a month or so down the line there is real data to compare. Then I would work on reducing the friction – understanding why tests are hard to maintain and automate, and improving that instead of throwing the baby out with the bathwater. Once that problem goes away, I'd stop tracking time spent on writing fixtures. These metrics never have to leave the team, so there is no danger of them being used as an external measure of success. But they can tell you whether the change has succeeded, and they help later, when people start challenging what was done.

It might also help to understand that the automation effort is often front-loaded. With most of the teams I interviewed, once they started looking for ways to reduce the friction, the integration layer evolved into almost a domain-specific language for specifications and test automation, and most of it got reused for new tests and specifications later. A disproportionate amount of effort goes into writing a good integration layer in the first place and evolving it so that it becomes reusable and maintainable. On average, this took the teams in my research about three to six months. So if you are at the point where things look grim – hang in there; it just takes a bit of effort now to save a lot more time later.
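To make the "integration layer as a domain-specific language" idea concrete: rather than every test scripting raw UI or HTTP calls, the fixtures grow a small vocabulary of domain-level steps that new specifications simply reuse. A purely illustrative sketch, with invented names and no particular tool in mind:

```python
# Low-level plumbing written once, early on – the front-loaded effort.
class ShopDriver:
    """Talks to the system under test; the messy details live here only."""

    def post(self, path: str, payload: dict) -> dict:
        ...  # HTTP call, authentication, test data set-up

    def get(self, path: str) -> dict:
        ...


# Domain-level steps that read like the specification language itself.
class ShopDsl:
    def __init__(self, driver: ShopDriver):
        self.driver = driver

    def register_customer(self, name: str) -> str:
        return self.driver.post("/customers", {"name": name})["id"]

    def place_order(self, customer_id: str, product: str) -> str:
        return self.driver.post(
            "/orders", {"customer": customer_id, "product": product})["id"]

    def order_status(self, order_id: str) -> str:
        return self.driver.get(f"/orders/{order_id}")["status"]


# A new specification reuses the same vocabulary instead of new plumbing:
#   customer = dsl.register_customer("Ada")
#   order = dsl.place_order(customer, "book")
#   assert dsl.order_status(order) == "accepted"
```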