Attending Belgium Testing Days last month got me thinking about a potentially radical approach to regression testing, mostly inspired by Julian Harty and Nathalie Roseboom de Vries van Delft. This might be an unpolished idea, so bear with me.
Julian spoke about alternative testing – or better put alternatives to testing – arguing that time to market is a key competitive advantage for Internet businesses and that exhaustive testing damages that. He cited examples of Facebook, Flickr and Google as companies that do not perform exhaustive regression testing before releases but deal with problems effectively when they appear. “Looking at production data is a much more effective way of discovering problems”, said Harty, adding that “We need to find ways to detect undesirable effects. Do root cause analysis and implement fast, robust mitigation”. He suggested using software equivalents of canaries in a coal mine, such as partial deployments, monitoring server logs and implementing automated checks that detect problems quickly.
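A canary check of this kind can be sketched in a few lines. The following is a hypothetical illustration, not anything Harty prescribed: it scans recent server log lines and flags a partial deployment when the error rate crosses a threshold. The log format, threshold and function names are my own assumptions.

```python
import re

# Hypothetical "canary" sketch: scan recent server log lines and flag
# the partial deployment if the error rate crosses a threshold.
# Log format and threshold are illustrative assumptions.

ERROR_PATTERN = re.compile(r'\b(ERROR|CRITICAL)\b')

def error_rate(log_lines):
    """Fraction of log lines containing an error-level entry."""
    if not log_lines:
        return 0.0
    errors = sum(1 for line in log_lines if ERROR_PATTERN.search(line))
    return errors / len(log_lines)

def canary_check(log_lines, threshold=0.05):
    """Return True if the partial deployment looks healthy."""
    return error_rate(log_lines) <= threshold
```

In practice a failed check would trigger an automatic rollback or an alert, which is exactly the kind of fast, robust mitigation Harty describes.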
Exhaustive regression testing mitigates the risk of functional problems, but it isn’t the only way to do so. Several effective techniques for building quality into products from the start, combined with production monitoring and the ability to solve issues quickly, might de-risk most expected problems at a much lower cost than the time and effort required for regression testing, even if you write automated tests for other purposes.
This reminded me of one of the most puzzling findings from the survey of effective teams I conducted for Specification by Example. The team at uSwitch disable most tests after they pass initially. They use tests to guide development but (most of them) not for regression testing. It sounds counter-intuitive, but it works for them, and they still maintain a great level of quality. When I spoke to them in May this year, they couldn’t remember when the last serious issue was found in production. Instead of regression checks, they deliver small increments of functionality to parts of the production environment and monitor the user experience with automated tools. This significantly de-risks horrible problems.
In Estimating Software Costs: Bringing Realism to Estimating, Capers Jones estimated that regression testing catches only 23% of the problems. Brian Marick wrote long ago about a similar finding: 30% of problems caught while running regression tests repeatedly. If that is true, slow exhaustive testing might actually cost some businesses more in lost opportunity and slower time to market than the problems it would prevent. I don’t think this holds in general – for example, in extremely high-risk or heavily regulated environments – but for many Internet businesses this might actually be viable.
This of course leaves the problem of Black Swans, completely unexpected and unpredictable issues. (There is a valid argument that exhaustive regression testing can’t prevent such things anyway.) How about de-risking these things as well?
We don’t know what a potential Black Swan could be (or otherwise it wouldn’t be a Black Swan) but we can train the organisation to deal with such problems effectively. Nathalie Roseboom de Vries van Delft presented at Belgium Testing Days about her experiences in (quite bizarre) simulations of disasters. She participates in simulations of floods and re-enactments of airplane and maritime disasters to check how rescue and emergency services deal with such events. Government agencies organise those simulations to maintain readiness and ensure that such events will be dealt with effectively if they occur. These events help to identify issues such as communication problems. For example, one conclusion from a flood simulation was to avoid using acronyms as rescuers from different countries understand them differently. Simulations also give people a chance to experiment with various approaches – for example mixing international rescue teams or using teams of people from the same country to solve particular tasks.
Government agencies aren’t the only ones who need to handle disasters effectively; businesses could benefit from this too. In the January-February issue of the Harvard Business Review, Farzad Rastegar wrote about his experience with a Black Swan – the news of an upcoming recall of his company’s baby pushchairs was leaked to the New York Daily News and published one day before the planned press release. His company, Maclaren USA, had been preparing for the recall for months with resellers and regulators, but the news leak caught them by surprise. Their mail servers went down, communication lines broke and the damage was hard to control. It took a lot of time, effort and money to clean up the mess, even though they had spent months preparing. Rastegar concludes that they learned quite a lot about the company as a result and decided to change the reselling structure to be able to deal with such problems better in the future. After hearing Nathalie Roseboom de Vries van Delft, I can’t help but think that periodic disaster simulations could help much more.
Maybe all these ideas can be combined. Organisations can deal with smaller issues through production monitoring and “canaries”, while at the same time ensuring that they can handle Black Swans effectively through training and simulations. An organisation could inject a simulated problem and see how it copes, both from a technical and a business perspective, identify mistakes and make its processes more effective. The cost of such simulations would be high, but they would significantly de-risk unpredictable issues and lower the cost of handling them when they occur. Since, by definition, such problems happen very rarely, this could be a good strategy to justify dropping regression testing completely and getting much faster and cheaper deliveries.
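As a rough illustration of the technical half of such a simulation, here is a hypothetical fault-injection sketch: a simulated outage is forced into one dependency, and an automated check verifies that the mitigation path keeps the business operation usable. The service, failure mode and fallback are entirely invented for illustration.

```python
# Hypothetical fault-injection sketch: force a simulated outage into one
# dependency and verify that the fallback path keeps the service usable.
# The service, failure mode and fallback are illustrative assumptions.

class PaymentGatewayDown(Exception):
    pass

def charge(amount, inject_failure=False):
    """Primary dependency; the simulation can force it to fail."""
    if inject_failure:
        raise PaymentGatewayDown("simulated outage")
    return {"status": "charged", "amount": amount}

def place_order(amount, inject_failure=False):
    """Business operation with a mitigation path for the injected fault."""
    try:
        return charge(amount, inject_failure=inject_failure)
    except PaymentGatewayDown:
        # Mitigation: queue the charge for retry instead of losing the order
        return {"status": "queued-for-retry", "amount": amount}

def run_simulation():
    """Inject the fault and check that the mitigation path is taken."""
    result = place_order(100, inject_failure=True)
    return result["status"] == "queued-for-retry"
```

The automated part only checks the technical response; the more interesting lessons, as in the flood simulations, would come from watching how the people and communication channels around the system cope with the injected problem.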