From the time of ancient Greeks who marvelled at Hercules, to today’s Hollywood fans who marvel at, I guess, Marvel’s superheroes, everyone is inspired and captivated by tales of laborious tasks against difficult odds. Simply put, great effort makes great stories. And in many of those stories, the greatness of the effort is actually more important than the outcome. Thirty years after Rocky Balboa first ran up the flight of stairs below the Philadelphia Museum of Art, millions of people still play that scene in their heads for motivation. It doesn’t matter that Rocky actually lost the fight at the end of the first film. The eagle that brought Gandalf to Rivendell could have easily air-lifted Frodo to Mordor, but then three glorious books would never see the light of day, and instead we’d get a short uninteresting pamphlet. Perhaps someone from the New Zealand tourist office bribed the birds to conveniently disappear at the start of the Fellowship, or they just worked for Lufthansa and that was their usual time of year to take industrial action.
With such a cultural obsession about effort, it’s no wonder that the software industry is also enchanted by it. Late-night heroism earns respect, and nobody asks if the heroes were the ones that designed the server code to crash so often in the first place. Products delivered with crazy working hours earn promotions for managers, without anyone even asking if those same managers committed to a complete fantasy. This obsession with great effort is most dangerous when it affects groups of teams. It can easily cause months of people’s lives to go down the drain with nothing to show for it. And for any organisation thinking about scaling agile, avoiding this mass-hypnosis is crucial for actually doing something better, rather than just doing it differently.
Let me digress for a moment, because I don’t want to build my case on magical characters such as Saruman, Sauron or Stallone. I want to tell you a story that you probably already know, but bear with me for the moment, so we can examine it from a slightly unusual perspective.
Seventy-one years ago, more than 600 courageous airmen risked life and limb to dig not one, but three tunnels under Stalag Luft III, the infamous Nazi prison camp. The Nazis specifically designed the camp to prevent tunnelling. They built it on sandy ground, which was likely to cave in. They placed seismographs to detect digging. They raised sleeping barracks about half a meter above the ground, and placed them far from the fences. To avoid detection, the escapees had to dig human-wide pipes nine meters below ground, through a hundred meters of sand. Tunnels so long and so deep require a lot of supporting infrastructure, or an escape route can quickly turn into a mass grave. Unfortunately, Amazon Prime didn’t exist back then, so the clandestine Bob-the-Builder crew had to steal wires and building materials from the prison camp infrastructure. That kind of inventive supply-chain management is risky even during peaceful times, but back then discovery must have meant certain death. Three times a hundred meters of human-wide pipes is a lot of dirt to hide, so two hundred people took almost 25,000 trips up and down, carrying sand in their trousers and scattering it inconspicuously. Finally, remember that this wasn’t a movie, so none of the prisoners had the lung capacity of Tim Robbins from the Shawshank Redemption. Bob Nelson, the Leader of 37 Squadron, earned a place in engineering history for inventing an air pump system built out of bed pieces, hockey sticks and knapsacks. On March 24th 1944, a group of 76 prisoners escaped through one of the tunnels, nicknamed Harry (yes, of course, the other two tunnels were called Dick and Tom). This is the stuff legends are made from, so no wonder the whole story was immortalised by Hollywood as The Great Escape, in the only way that was reasonable in the sixties: in DeLuxe colour, starring the indestructible Steve McQueen.
The movie brought in millions at the box office (which was a huge thing in 1963), it still has a 93% fresh score on Rotten Tomatoes (which is a huge thing in 2015), and it left a lasting impression on modern culture. This, of course, includes the animated children’s version Chicken Run, which itself earned 250 million dollars.
The sheer scale of this effort completely overshadowed another concurrent action in the same prison camp. Faced with the same constraints, and the same external parameters, a smaller group of prisoners came up with an alternative plan. They built a gymnastics vaulting horse out of plywood and repurposed Red-Cross parcels. Under the guise of exercising, they placed the wooden horse close to the perimeter fence every day and dug a tunnel below. They worked in shifts of one or two at a time, digging using only food bowls. At the end of each day, the prisoners placed a board on top of the tunnel entrance and masked it with surface dirt, using the hollow wooden horse to take away the dirt. Because the tunnel started close to the fence, it didn’t have to be very long. The fake exercising also masked the digging from the seismographs in the area, so the tunnel did not have to be very deep. This allowed them to just poke holes through the surface for fresh air. In the end, the tunnel was roughly 30 meters long. Though no small feat to achieve with bowls, it was just one third of the length of a single tunnel for The Great Escape. The plan did not require anyone to steal materials from the prison. There was no need for inventive ventilation engineering. It took only three months to build, compared to almost a year for Tom, Dick and Harry. To the best of my knowledge, the tunnel didn’t even have a nickname. Only three people escaped through it, so no wonder the whole enterprise isn’t that well known. Someone made a film about it, of course, because WWII scripts were a license to print money back in the sixties, similar to superhero movies today. With a feeble plot and such a small effort, Steve McQueen wasn’t even on the cards, and the movie didn’t come close to the cult status of The Great Escape. This is why, I guess, Tolkien didn’t consider my avian solution.
The Wooden Horse plot was a tiny undertaking, allowing only three people to escape, but all three of them reached freedom. The Great Escape engaged hundreds of workers for a year and allowed seventy-six of them to escape. With so many variables in play, something just had to go wrong. The tunnel exit was too close to the fence, so the escapees were spotted by the guards on their way out. Most were caught the next day and returned to the camp. The majority of them, around fifty, were executed. In the end, only three out of the seventy-six reached safety. Counting people who actually escaped, the outcome of both these attempts was the same. Counting the cost, both in effort and human life, the Wooden Horse is a clear winner. Yet our society celebrates and glorifies the second one, which is even today known as The Great. Somehow, this feels like completely the wrong way to measure greatness.
Unfortunately, it’s scaling effort, not outcome, that makes good stories. And that’s a reality that can’t be ignored. ‘Managed a 10MM project involving 300 people on two continents’ makes a resume stand out, regardless of whether the client got squat for all that money. That kind of scaling scares me quite a lot, and if you’re working in an organisation that is thinking about rolling out some variant of large-scale agile at the moment, it should scare you too. In most of the discussions on scaling agile at conferences and in books, the word ‘scaling’ just means doing things with more effort: distributing work to more locations, engaging larger groups, involving more teams. There’s a naive assumption that more effort brings better outcomes, but that is rarely the case. And the reason why people make that assumption turns out to be quite important for why it’s so often wrongly taken for granted.
On a small scale, effort does boost outcome. If a single person puts in two hours of work, they are likely to get two times more work done than in just one hour. If a team puts in hard work over a few weekends, they might hit a critical deadline. On a larger scale, effort no longer directly relates to results. With dozens or hundreds of people, and months of available time, Parkinson’s law kicks in. The small-scale visible impact of effort, and the illusion that it brings, makes people delusional. The poster-child for this in software delivery is the FBI Sentinel project. For two years, the Sentinel programme managers had bi-monthly meetings with senior FBI stakeholders, showing status charts and a project thermometer, which unsurprisingly always showed yellow trending towards green. In October 2009, the main contractor missed its deadline for delivery of Phase 2, and all the critical problems ‘came crashing down on the project’. An independent audit concluded that there were more than ‘10,000 inefficiencies’. The project management approach, as it turns out, focused solely on tracking activity, leading to more than 450 million USD being spent before anyone noticed that something wasn’t quite right.
Because more effort works on a small scale, it allows people to feel a progression of small wins, and that blinds them to the overall mess. Organisations become like Batman, who defeats bad guys in each of the hundreds of graphic novels and dozens of films, but doesn’t realise that his strategy never even comes close to solving the overall problem. Investing some of Bruce Wayne’s apparently unlimited cash in social services and city infrastructure would surely have a much larger effect on Gotham than all that crime fighting. Then again, great effort makes good stories, and philanthropy is a lot less interesting than going around and punching individual villains in the face.
So here we are in 2015, in an industry that mostly equates effort to progress. Novice Scrum teams seem to be particularly obsessed by velocity and story-points, and sell them to gullible stakeholders as indicators of value. Unfortunately, both of those metrics just show the effort spent. To put it plainly, they show money flushed down the toilet. And the easiest way to scale that is to buy more toilet seats. Doug Hubbard noted a long time ago that organisations prefer metrics that are easy to measure, without even considering if they are important or not. Outcomes are viciously problematic to nail down, even outside software. They come on a delayed feedback cycle and depend on too many external factors. Plus it’s very difficult to agree on what actually to measure. That’s why large companies have highly creative earning/profit accounting schemes. On the other hand, effort can be defined and measured cheaply. It can easily be added across teams and time. Even more importantly, it can be easily multiplied with money.
Assuming that the easiest thing people will try to do is to increase effort, how do we make that effort relate to outcomes in a way that doesn’t taper off? There are plenty of theoretical biology quotes out there attributed to Fred Brooks and Warren Buffett on the combination of one child, nine months and multiple mothers. They all come down to the critical factor of inter-team dependencies. On the far end of that scale, if teams are perfectly independent, each one can run their own show in parallel. People working on different products just don’t have to wait for each other. Though better outcomes are not guaranteed, the organisation as a whole at least gets a fighting chance to achieve more. But there are plenty of ambitious software products that can’t be written by a single team in the timelines of technology half-life today, so completely getting rid of dependencies often isn’t possible. But we should at least try to minimise them. You don’t need a smart Fred Brooks quote to tell you that; anyone could come up with that in their sleep. The problem is that there are multiple ways to minimise dependencies, and our cultural obsession with effort often makes organisations choose the wrong one.
To over-simplify the situation, let’s agree for a moment that there are at least two types of inter-team dependencies. I’m sure you can think of more, but these two will be enough for now. One limits how much work can start independently of other teams. The other type limits how much work can be finished independently. And they are often not even closely related. The first group of dependencies is mostly solved by management action: reorganising where people sit, who they report to, and what they work on. Hiring a few more people is often an easy way to unlock those dependencies, and as an added bonus it increases the amount of effort people can put in. The second group of dependencies mostly requires technical solutions, and it’s a lot more difficult to act on or communicate. At some point, it becomes incredibly difficult to explain why the entire organisation has just one production-like testing environment, but everyone takes it for granted that installing a second one would cause the universe to end in a singularity event. It’s just easier to hire more people than to deal with that risk.
Because idle time is worse than heresy in most organisations, the first type of dependencies typically gets all the attention. A while ago most organisations were dividing work based on technical areas of expertise. All the database developers could sit together, and have a single line manager who can then adequately evaluate and reject their holiday requests. That made management easy, but it made starting to work difficult — analysts were too busy dealing with other projects, and most days testers could easily turn off the QA environment and go out for a long lunch. Cross-functional teams solved that problem, so people can be busy all the time. Five guys can safely be imprisoned in the basement to develop an obscure C++ API and forgotten about for a few months. Everyone knows they wouldn’t use a shower even if it was available. Timmy from the third floor can be the only person maintaining the all-important interest rate calculator, and guarantee his mortgage payments for at least a few more years. They don’t compete for the same management resources, client time, or anything else that would prevent them from starting to work. Teams can and will move at different speeds, even when working on the same customer deliverable. Problem solved!
Well, perhaps not… Remember the second type of dependencies, those that prevent work being actually finished? They get overlooked. People can start working in parallel, but they often can’t deliver independently. Timmy can change his calculator, but without the server API, the clients cannot use it. When teams depend on each other to actually ship stuff, the slowest team determines the speed of the entire pipeline. That pretty much guarantees that the majority of people will either sit idle for most of the time (heresy again) or they will be pushed to start some new work. Product managers then have to run several parallel streams of work, so it’s easy to keep them busy as well. Parallel work makes the organisation accumulate a ton of almost-finished stuff, that isn’t exactly ready to go live, so there are even more dependencies to manage. Slow teams suffer from ever changing dependencies, and have to rework things many times before releasing. Faster teams often have to go back and redo things that were supposed to be finished, but it turns out were not 100% complete. Effort breeds more effort. This is the software equivalent of building deep fragile tunnels through sand. But it’s all incredibly efficient, and allows organisations to keep many people busy at the same time, spending a great deal of effort. And we’ve already established that great effort makes great stories. Managing The Last War of the Ring is a lot better for career progression than booking an air-eagle ticket.
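To make the slowest-team effect concrete, here is a minimal sketch in Python. It rests on loud assumptions that are mine, not taken from any real organisation: each customer feature needs one finished part from every team, teams work at a fixed weekly rate, and the rates themselves are made-up numbers. The `simulate` function is purely illustrative.

```python
def simulate(rates, weeks):
    """Toy model of teams that can start independently but must ship together.

    rates: parts each team finishes per week (hypothetical numbers).
    A feature ships only when every team has finished its part of it.
    Returns (features shipped, parts finished but still waiting on others).
    """
    finished = [rate * weeks for rate in rates]
    # The slowest team caps what the whole group can release.
    shipped = min(finished)
    # Everything above that cap is almost-finished inventory.
    waiting = sum(done - shipped for done in finished)
    return shipped, waiting


if __name__ == "__main__":
    shipped, waiting = simulate([5, 3, 1], weeks=10)
    print(f"shipped: {shipped}, almost finished: {waiting}")
```

In this toy model, making the two faster teams even faster changes nothing about what ships; it only grows the pile of almost-finished work. Only speeding up the slowest team, or removing the need to ship together, moves the first number.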
I strongly believe that starting from the other side of dependencies is a lot more effective long-term. Imagine for a moment seven teams, all interdependent and with all the possible excuses why they couldn’t release software on their own. Their QA platform is built from unobtainium. They have legacy software components that are difficult to package and install automatically. The last person who knew how to configure the database died in Stalag Luft III. The architecture is based on Stanley Kubrick’s black brick from 2001. But let’s for a moment consider that these are problem statements, not conclusions. Instead of scaling the process so that all seven teams can start working on production-bound software, they choose people for only one team that can ship end-to-end. That team gets exclusive access to the priceless QA environment. They get the complete product management attention. And they start running. Yes, I know, they can’t slay the whole big bad wolf on their own fast enough, but they can at least release stuff without waiting for anyone else, so the customer feedback loop gets closed. Instead of working on customer features, the other teams form so they can start addressing some of those reasons that keep independent shipping impossible. They investigate configuration management and automate it. They set up another QA environment, and the universe does not end. They automate critical tests. Some components get pulled out of the monolith architecture, and get ready for independent deployment. As the entanglement around shipping parts separately recedes, more teams can be reformed to join and work directly on client software. In a few months, all those insurmountable problems will be gone. And during that time, the single team that was actually shipping will probably achieve more than the entire previous group anyway, just because they didn’t have to wait for anyone to deploy.
If deployment dependencies can get reduced, the excuses for large bureaucratic processes, release trains, QA cycles and all those fantastic effort-generators just disappear. Oddly enough, when those dependencies are addressed, the work-starting dependencies seem to magically vaporise as well. Companies can easily find ways to keep people busy. So the one thing to remember from all this, if you’re thinking of restructuring the process in a multi-team environment, is to deal with the far end of the cycle first. It’s not intuitive, but it’s highly effective. At the end of the day, the effort the whole group can put in is always the same: it’s determined by the number of people and the available time. We should really try to improve our processes on the scale of outcomes.