Potentially shippable software is the holy grail of agile delivery, according to anyone out there with enough patience to sit through two days of Scrum conditioning. Ten years ago, most of the industry probably wasn’t capable of living up even to that benchmark. But today, potentially shippable software by the end of each iteration should be taken for granted, the same way you expect your next hamburger to be asbestos-free.  That’s the bare minimum, but far from being good enough.

In fact, the current thinking around potentially shippable software severely limits what teams could achieve. The move to frequent releases is causing a fundamental change for consumers. Companies that can spot this in their market segments, and adapt quickly, will start running circles around the competition. Those that don’t will be left trying to play bowling on a basketball court.

For an example, just look at transportation. The entire car industry seems to shake and tumble with problems. The data for 2015 isn’t out yet, but just for comparison, Toyota recalled 6.5 million cars last year to deal with switch malfunctions. In 2014 alone, the recalls ordered by the US National Highway Traffic Safety Administration (NHTSA) involved 63.9 million vehicles. General Motors had to fix 5.8 million vehicles in 2014 to deal with faulty ignition switches that could cause fires. In a similar situation, NHTSA ordered Tesla to deal with problems in almost 30,000 vehicles. The NEMA 14-50 Universal Mobile Connectors could overheat, and potentially cause a fire. Pretty serious stuff, with faulty hardware. Judging by the rest of the industry, this should have been a crisis that would cost at least a few million. Instead, it turned into a ton of free press.

The Tesla UMC recall didn’t require car owners to waste time driving their vehicles to a service shop. It didn’t require the manufacturer to cash out for new parts, or to pay for mechanics’ time. Instead, someone pushed a button. An over-the-air software update remedied the problem until the next time a car gets brought in for a regular check-up.

When one player in a market can respond to a major problem with an automatic software patch, while the others have to pay for parts and labour, they aren’t playing the same game any more. The costs of servicing, of course, plummet. But the impact goes far beyond that. 

In similar situations, the owners of the other vehicles had to make a difficult choice of trading their short term plans against security and safety risks. Tesla’s customers were not affected at all. For Tesla, continuous software delivery isn’t just a technical practice; it’s a way to change consumer expectations and open up marketing opportunities. You can dismiss this as an isolated incident, but ten years from now, car buyers’ expectations will have changed completely. People will expect to have that level of service, and anyone unable to provide it will be out of business. And it will happen to many other industries as well.

Disrupting business models

When user expectations and perceptions change, business models have to change as well. Just consider how the typical software sales models changed over the last twenty years. Before web services became ubiquitous, it was quite normal for consumers to buy a particular version of software. When a new box of their favourite software came out, complete with a stack of 3.5” floppy disks and a printed user guide, consumers would pay for upgrades. This model was sensible when new versions came out every year. But as the Internet took off, and people requested higher bandwidth to enjoy ever increasing quality of funny kitten videos, it became possible to distribute software updates more frequently.

Consumer expectations significantly changed. Technically, it makes a lot of sense to release software several times a month, especially to fix security risks. So people got used to upgrading frequently. Consumers might even like the new features enough to want to install a new version, but nobody wants to pay for software that often. The whole concept of selling versions didn’t make much sense at higher frequency, so companies started to offer free upgrades, and users started to become more entitled. The average internet consumer today expects to get web services for free. Free e-mail, free photo storage, free news. On mobile platforms, apps still sell, but users who pay $0.99 expect to get all future new features free, forever. That requires literally a pyramid scheme where early backers benefit from latecomers, and requires an ever increasing user-base. When the growth stops, commercial models like that fail, much like in a Ponzi scheme. 

Just look at operating systems. After OS X went free, the game changed. Microsoft had to make Windows 10 free as well, and move away from versions. Now all Windows will be 10, and instead of a major version every three years, consumers can expect a continuous stream of updates. At the same time, because it can’t be sold any more, Windows collects private information and phones back home with advertising identifiers, so it can make up for the lost revenue.

Changing the expected delivery frequency pretty much killed the old software business model. Instead of charging for new features with paid upgrades, software companies had to come up with completely different ways of financing development. The wholesale sleaze-ball privacy invasion of ad networks is just a way to pass the need for payments from consumers to third parties. Some companies decided to constantly harass their users with micro-payments to unlock individual features. Zynga was one of the first to spot the game change, and it was one of the rising stars at the turn of the decade. But consumers can only suffer constant harassment for so long, and it took only a few years for the whole pyramid to collapse. On the other hand, companies that could charge for rent, such as Dropbox or GitHub, flourished in the new game. The expectations in the market changed. People want software for free, but they seem happy to pay for a service.

This user entitlement caused by more frequent delivery expectations isn’t a problem just for software. As continuous delivery crosses more into product strategy, it starts to affect customers of all types of products. In October 2014, Tesla announced that new cars coming out of the factory will have a forward radar, ultrasonic sensors and cameras, all wired up to a lane-changing autopilot and high precision digital braking system. Although the news was amazing, not everyone was happy. Richard Wolpert from Los Angeles, for example, bought an older model just a few months before, and to him the world just seemed unfair. Normally, he got the new car features for free, magically. But this one wasn’t coming. So he started a petition to force Tesla to retro-fit radars and sonars into older cars. Dag Rinden of Oslo, Norway, pleaded that lane switching and automatic braking are important for driver security, and that they should be provided for free to all existing owners.

Now let's just take a moment to consider this. Someone bought a car, and later complained that new hardware did not magically appear overnight when it was announced in the news. Continuous delivery doesn't solve this problem, Star Trek replicators do. You and I can laugh about it, because we can distinguish hardware and software, but Richard and Dag don’t care about that. They only see a car, and they got used to getting the new stuff for free. Plus, the new features are potentially life-saving, so surely they are entitled to those. Disappointing users is never good, even when they are clearly wrong. But giving away free radars also isn’t good for business. And the whole mess is a consequence of the fundamental game change. 

I’ve never heard of anyone with similar complaints about any other car manufacturers. When you buy a car, it’s pretty much clear that it won’t one day just get a radar and a sonar. But Tesla trained their users to expect more. They aren’t playing the same game. For all other manufacturers, a car model is something with a fixed design, produced in a particular year. For Tesla, that concept of car models just doesn’t work. And once there are no more models and yearly versions, people just feel a lot more entitled and expect to get things for free.

Disrupting marketing

Another major side-effect of frequent delivery is that it removes the drama. The more often software ships, the less risky each update becomes. Small changes mean quick testing, and small potential problems. Continuous delivery pipelines help companies deploy with more confidence, prevent surprises, and generally make releases uneventful. But making the releases uneventful also causes problems for marketing. I’ve learnt this the most stupid way possible, on my own skin.

MindMup is a bootstrapped product, and we don’t have a lot of cash to spend on advertising. Apart from slow and steady word of mouth, the typical way for such products to get new users is press coverage. Indeed, the three biggest spikes of user traffic for MindMup over the last three years came from news sites: spending a day on the front page of Hacker News after we open sourced it, and getting reviewed on Lifehacker and PCWorld. However, after those early successes, it took over a year and a half until we could get another big spike. Meanwhile, we shipped a ton of useful stuff, but nobody took notice. By having a continuous stream of small changes instead of big versions, we scored a marketing own-goal. Sure, technically there was no drama in any of the hundreds of releases. But there was also no excitement. No single change was ever big, important and newsworthy enough to be covered by a major channel.

Potentially Shippable in a changed game

Changed consumer expectations, across industries, will put more pressure on companies to roll out software with increasing frequency. That’s a given. Yet the more frequently software ships, the more it has an impact on marketing, business models and consumer expectations. People who design continuous delivery pipelines, and people who break down features into iterative deliveries, now have a magic wand that can disrupt sales or marketing and disorient users. 

A nice example of that is how PayPal changed their business dashboard last year. One day, trying to pay for something using PayPal, I panicked after several thousand pounds disappeared from our company account. My first thoughts were that our PayPal account got hacked, or that the funds were frozen for some bizarre reason. PayPal is famous for being hostile to digital goods merchants, and frozen funds were an even scarier scenario than getting hacked. I looked through the recent transactions, and I couldn’t see any transfers or withdrawals. In fact, there was nothing suspicious in the list. While I was trying to call the customer service, I spotted a link saying something similar to ‘how do you like our new business dashboard?’ Anyone who has ever done serious software testing would start guessing what happened there. And in fact, it took only three link clicks to find the money. My company has a multi-currency account with PayPal. The old dashboard converted all the money into an approximated value in the primary currency, but the new dashboard only showed the money actually in the primary currency. Someone did an incremental development change, and they either intentionally or mistakenly disregarded multiple currency accounts. I can only assume that most people with multiple currency accounts didn’t think like a software tester that morning, and that the PayPal customer service didn’t exactly have a pleasant day. At the same time, to get software potentially shippable, someone had to cut a huge piece of work into smaller batches. And they made the wrong choice.

As an industry, we need to move the discussion away from ensuring things are potentially shippable towards how exactly that’s achieved.  The choice can have a ton of unexpected negative effects on sales and marketing, or it can open up new business opportunities and help companies run much faster than the competition. That’s why software planning and releases have to be driven more by market cycles and marketing opportunities than arbitrary iterations.  

And that’s where the problem with the concept of 'Potentially Shippable' starts. Does that mean potentially could be deployed to production? Or does it mean potentially could be released to users? Who determines if potentially should be turned into actually? Or when that should happen? 

Deployments are not the same as Releases

When we started fixing this problem for MindMup, one thing became painfully clear. We thought about deployment and releasing as the same thing, but it's much more useful to look at them separately. Deployment is a technical event, bits and bytes of software being moved to production servers or users’ devices. Release is a marketing event, where a new version becomes available to a group of end-users. Think about ‘Deployment’ as the part when an Amazon courier brings a box of cardboard-packed toys, you wrap them up nicely, and hide them in a cupboard. But  ‘Release’ is when your children find the toys under a Christmas tree, at exactly the right moment to believe in Santa Claus.

For MindMup, potentially shippable stuff turned into actually shipped almost all the time, in order to reduce technical deployment risks. We mentally coupled deployments and releases, and by doing that, we forced a technical event to have a marketing impact. Going back to the example with presents, it’s as if the children intercepted the couriers and took the presents themselves, along with the delivery slips and the receipts. Sure, at the end everyone got a toy, but the magic of Christmas is gone.  And they’ll start arguing about who got a more expensive present and who got shorted. Our software releases were driven by technical cycles, not marketing cycles. No wonder nobody wanted to pick up on any important news.

Once I could spot this in our software, it became easy to see it with many of my consulting clients as well. I don’t have any statistically relevant data to claim an industry-wide pattern, but it looks as if this is quite a common self-inflicted handicap. Deployments and releases are tightly coupled in our minds; it’s just the way we were conditioned to think. I assume that nobody reading this article primarily distributes software on floppy disks in boxes, physically shipped to consumers. Yet that’s still how most people think about releases and deployments.

The solution is quite simple: Decouple deployments and releases. This effectively means being able to put software on production systems that is not necessarily generally available, running alongside software that is visible. It’s the nicely wrapped present, without the receipts or any other controversial crap, waiting for the right moment to make a big impact. That way, the marketing stakeholders can decide on their own when they are going to release it and how. Software releases can be organised around important marketing opportunities, while software deployments can still happen frequently to reduce technical risk. Jez Humble wrote about that in 2012.  
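To make the split concrete, here is a minimal sketch in Python of what keeping the two events apart might look like. Everything in it is hypothetical and only for illustration: the ReleaseCalendar and DeployedFeature names, the feature names and the dates are assumptions, not how MindMup or anyone else actually does it. The only point is that deploy() is called by the technical pipeline, while schedule_release() is called whenever the marketing stakeholders decide.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class DeployedFeature:
    """A feature whose code is already running in production (hypothetical model)."""
    name: str
    deployed_at: datetime
    # Owned by marketing stakeholders; None means "deployed, but not released".
    release_at: Optional[datetime] = None

    def is_released(self, now: datetime) -> bool:
        return self.release_at is not None and now >= self.release_at


class ReleaseCalendar:
    """Tracks the technical event (deploy) separately from the marketing event (release)."""

    def __init__(self) -> None:
        self._features: Dict[str, DeployedFeature] = {}

    def deploy(self, name: str, when: datetime) -> None:
        # Called by the delivery pipeline, as often as technical risk requires.
        self._features[name] = DeployedFeature(name, when)

    def schedule_release(self, name: str, when: datetime) -> None:
        # Called by marketing; only something already deployed can be released.
        self._features[name].release_at = when

    def visible_features(self, now: datetime) -> List[str]:
        # What end-users can actually see at a given moment.
        return sorted(f.name for f in self._features.values() if f.is_released(now))
```

With something like this in place, a feature deployed in early January stays invisible until the date marketing picks, and a feature with no release date stays wrapped in the cupboard indefinitely.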

The key is in multi-versioning

The problem, of course, is that simple is not the same as easy. Although I can suggest the solution in one sentence, it is quite difficult to pull off in practice. Feature toggles, ever more present in software, lead to unmaintainable spaghetti of code, configuration, and magic. To truly get the benefits of continuous delivery, most companies will likely need a completely different approach to technical architecture and design. Instead of simple toggles and flags, software will need to be designed from the ground up for a multi-tenant, multi-versioned, multi-interface world. This means that every layer of the stack will need to accept calls from potentially different versions of things above it and know how to reply accordingly. It also means that almost every piece of data in transport will need to be tagged with the appropriate version. This will significantly increase the complexity of testing and operating software. But companies that don’t do that will end up playing bowling on a basketball court and wonder why they are not scoring.
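As a rough illustration of a layer that accepts calls from different versions and replies accordingly, here is a small Python sketch. It is a sketch under stated assumptions, not a recommended architecture: the VersionedEndpoint class, the integer version tags and the request shape (a dictionary with "version" and "payload" keys) are all made up for the example.

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Dict[str, Any]]


class VersionedEndpoint:
    """One layer of the stack, able to serve several caller versions at once."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Handler] = {}

    def register(self, version: int, handler: Handler) -> None:
        self._handlers[version] = handler

    def handle(self, request: Dict[str, Any]) -> Dict[str, Any]:
        # Every message in transport carries the version of whoever sent it.
        caller_version = request["version"]
        eligible = [v for v in self._handlers if v <= caller_version]
        if not eligible:
            raise ValueError(f"no handler supports version {caller_version}")
        # Reply with the newest behaviour the caller is known to understand.
        chosen = max(eligible)
        payload = self._handlers[chosen](request["payload"])
        # The reply is tagged as well, so the next layer up knows what it got.
        return {"version": chosen, "payload": payload}


def profile_v1(data: Dict[str, Any]) -> Dict[str, Any]:
    # Older callers expect a single display name.
    return {"name": f"{data['first']} {data['last']}"}


def profile_v2(data: Dict[str, Any]) -> Dict[str, Any]:
    # Newer callers expect the name split into parts.
    return {"first_name": data["first"], "last_name": data["last"]}


endpoint = VersionedEndpoint()
endpoint.register(1, profile_v1)
endpoint.register(2, profile_v2)
```

Even this toy version hints at the cost: every handler has to be tested for every version it may serve, which is where the extra complexity of testing and operations comes from.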

Once the capability for running multiple concurrent versions is in place, it becomes quite easy to make some versions of software available only to certain subsets of users. And so, it becomes easy to minimise the potential negative effects of small incremental changes. Imagine if the new PayPal business dashboard was only shown to customers with a single currency account. Instead of giving all the users a small increment of the improvement, this would give a small group of users 100% of what they need. There would be no user confusion, and the “new” business dashboard would actually be better for whoever could see it. Over time, as the features build up, more users could be brought over to the new system, and then finally, the old version completely retired. Ironically, I’m pretty sure that PayPal has the capability to deploy and release gradually to subsets of users, but they didn’t coordinate it well with the rest of the business.
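The single-currency idea can be sketched in a few lines of Python. This is purely hypothetical: the Account shape, the balances stored in minor units and the dashboard_version function are assumptions made for the example, and have nothing to do with how PayPal actually models accounts or chooses dashboards.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Account:
    """Hypothetical merchant account; balances are in minor units (pence, cents)."""
    account_id: str
    primary_currency: str
    balances: Dict[str, int] = field(default_factory=dict)


def dashboard_version(account: Account) -> str:
    """Pick which of the two concurrently deployed dashboards to show."""
    currencies_in_use = {
        currency for currency, amount in account.balances.items() if amount != 0
    }
    # The new dashboard only understands the primary currency, so show it
    # only to accounts for which that is the complete picture.
    if currencies_in_use <= {account.primary_currency}:
        return "new"
    # Everyone else keeps the old dashboard until the new one catches up.
    return "old"
```

An account holding only pounds would get the new dashboard, while one holding pounds and dollars would keep seeing the old one, with its converted total, until multi-currency support is built.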

Once the capability for running multiple concurrent versions is in place, it becomes much easier to decide what and how to sell, and what, how and when to open up. Continuous delivery pipelines don’t need to have a negative impact on sales or marketing, and the decisions around those aspects can go back to the people who should be making them. Even more importantly, with proper multi-versioning in place, it becomes a lot easier to make better informed decisions. Focus groups, prototype experiments and customer research can only suggest that people might potentially be able to do something, not that they will actually do it, or get the expected benefits. But with multi-versioned systems, companies don’t have to rely on potential usage data: they can look at actual, real user trends, and weed out bad ideas before they become cemented. At Google, one such test apparently led to an extra $200m a year in revenue.
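A minimal Python sketch of that kind of experiment might look like the following. The function names, the experiment label and the 50/50 split are assumptions for illustration, and this is not a description of how Google or anyone else runs their tests. Hashing the user identifier keeps each user on the same version between visits, without storing any assignment.

```python
import hashlib
from typing import Dict, Iterable, Tuple


def assign_variant(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically place a user into 'treatment' or 'control'."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Map the hash onto 10,000 buckets and compare as integers.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return "treatment" if bucket < int(treatment_share * 10_000) else "control"


def conversion_rates(events: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Summarise actual behaviour: share of users per variant who converted."""
    seen: Dict[str, int] = {}
    converted: Dict[str, int] = {}
    for variant, did_convert in events:
        seen[variant] = seen.get(variant, 0) + 1
        converted[variant] = converted.get(variant, 0) + (1 if did_convert else 0)
    return {variant: converted[variant] / seen[variant] for variant in seen}
```

A real decision would also need a check that the difference between the two rates is statistically significant, which this sketch leaves out.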

Ron Kohavi, Thomas Crook, and Roger Longbotham have some chilling statistics in their paper Online Experimentation at Microsoft, where they claim that only about one third of analysed ideas actually achieved what was expected once implemented in software. They also cite a source from Amazon, where the success rate is higher, but still less than 50%. 

If between a half and two thirds of implemented ideas don’t deliver, then for the average software company out there, getting multi-versioning right could reduce maintenance costs by fifty to seventy percent, just by helping them drop deadwood, and not waste time on implementing things that just won’t fly. The additional cost of operation and testing can then easily be recovered through a significant reduction in maintenance costs.

So, if you’re still late making your 2016 resolutions, or if all the ones you made already turned out to be unachievable, here’s an idea for the next year: push your organisation slightly more towards thinking that continuous delivery isn’t just a technical thing. It’s a game-changer that has massive side-effects on business models and customer expectations. And design your pipeline so that you can decouple deployments from releases. Run the former based on technical risk, and coordinate the latter with marketing cycles.