James Whittaker, test engineering director at Google, spoke yesterday at the Agile Cambridge conference about how Google does ‘Test Engineering’. He likened software testing to healthcare – in particular patient care in hospitals.

Whittaker started by saying that 20 years ago software development was like manufacturing, and that fixing a problem after a release cost much more than fixing it before the release. “We don’t do software like this any more”, said Whittaker, adding that “You could be using Google Docs and it will update under you and you won’t even know it”. Whittaker concluded that, for Google's software, the amount of time it takes to fix a bug is now exactly the same before and after shipping (note “amount of time”, not cost).

Whittaker suggested looking at a software system as a patient in hospital care. “Testing is like healthcare, an ongoing process”, he said, adding that effective hospital care requires physicians to quickly assess the state of a patient. For that, they use patient charts, which give them a history of disease and treatment, and monitors, which give them visibility over key vital signs such as heart rate.

“I don’t have a clipboard to tell me where are the problems and the life support monitor”, said Whittaker. As a test engineering director, he said, he had access to this information, but it sat in several SQL databases that he had to query manually, so he set his team the task of making him run fewer SQL queries.

As a result, the development teams at Google have introduced several tools that provide the functionality of the patient charts and health monitors. (But they have luckily avoided the “get the most expensive machine, in case the administrator comes” problem.)

Attributes-Capabilities-Components

The company policy is that anyone can look at any test cases regardless of whether they own the related product or not. They created a tool called Testify that makes statements about the health of their products based on test cases and execution results. It also assists them in providing a historical view of diagnostics.

To make a statement about a software product, they realized that the information needs to be structured and organized. “Doctors have a huge advantage – all their patients look the same”, said Whittaker. To make a meaningful statement about software health, they started listing desired attributes of their products. For example, two of the desired attributes of Chrome OS are that it is stable and secure. “Any time a sales person uses an adverb or an adjective we create an attribute”, said Whittaker. These attributes are then compiled into a tag cloud that shows the relative importance of the different attributes.
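
The talk did not show how the tag cloud is built, so the following is only a minimal sketch of one way to turn a list of attribute mentions into relative weights. The function name `attribute_weights`, the sample mentions and the "scale by the most frequent attribute" rule are all assumptions made for illustration, not Google's implementation.

```python
from collections import Counter


def attribute_weights(mentions):
    """Map each attribute to a weight in (0, 1]; the most-mentioned gets 1.0.

    Hypothetical helper: the weighting rule is an assumption, not taken
    from the talk.
    """
    # Normalise case and whitespace so "Secure" and "secure" count together.
    counts = Counter(m.strip().lower() for m in mentions if m.strip())
    if not counts:
        return {}
    top = max(counts.values())
    return {attr: n / top for attr, n in counts.items()}


# Hypothetical mentions collected from product descriptions.
mentions = ["stable", "secure", "fast", "Secure", "stable", "secure"]
weights = attribute_weights(mentions)

for attr, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{attr}: {weight:.2f}")
```

A tag cloud renderer could then map each weight to a font size, so the attributes mentioned most often stand out.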

For each attribute, they list the software components that can affect it. This took them about 20 minutes for Chrome OS. After that, they listed capabilities, starting from the verbs that people used to describe the functionality. For example, capabilities of Chrome OS include “connects to internet” and “renders a web page”. It took them about two hours to list 304 capabilities of Chrome OS. Surprised that there were not more, they ran the list through a review process and ended up with 314 capabilities in total.

Capabilities, components and attributes “describe our patient”, said Whittaker. According to him, Google is unusual in that it has a very low ratio of testers to developers. “Rich” teams have one tester for four or six developers, the typical ratio is one to fourteen, and some teams have one tester for twenty-eight developers. He said: “If a developer wants something tested, they have to ask nicely and be prepared for No. The test team needs to understand what to test and you have to make sure that everything you code has to have a set of capabilities attached to a component attached to an attribute”. They assign risk on a capability/attribute matrix, and the risk decides where test resources are spent. As a result, developers pay a lot of attention to the quality of their products.
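
To make the attribute, component and capability structure more concrete, here is a small sketch of how such a model and its risk matrix could be represented. Everything beyond the two capability names and the two attributes quoted above is an assumption: the component names, the `frequency` and `impact` scales, the rule that risk is their product, and the choice to sum risk per (component, attribute) cell are hypothetical and were not described in the talk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    name: str
    component: str   # hypothetical component names below
    attribute: str
    frequency: int   # assumed scale 1..4: how often a failure is expected
    impact: int      # assumed scale 1..4: how bad a failure would be


def risk_matrix(capabilities):
    """Sum an assumed risk score (frequency * impact) per matrix cell."""
    matrix = defaultdict(int)
    for cap in capabilities:
        matrix[(cap.component, cap.attribute)] += cap.frequency * cap.impact
    return dict(matrix)


def riskiest_cells(matrix, n):
    """Return the n cells with the highest summed risk, highest first."""
    return sorted(matrix.items(), key=lambda item: item[1], reverse=True)[:n]


capabilities = [
    Capability("connects to internet", "networking", "stable", 3, 4),
    Capability("renders a web page", "browser", "stable", 2, 3),
    Capability("renders a web page", "browser", "secure", 2, 2),
]

matrix = risk_matrix(capabilities)
for (component, attribute), risk in riskiest_cells(matrix, 2):
    print(f"{component} / {attribute}: risk {risk}")
```

With one tester for fourteen developers, a ranking like this is one plausible way to decide which cells get tester attention first and which requests get the "No".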

Using “Tours” as test strategy patterns

Similar to the way doctors use classes of treatments (e.g. vitamin therapy or chemotherapy) to discuss a whole set of actions with their colleagues and patients, Whittaker advised using “tours” to describe test strategy patterns. He gave several examples of “tours” they use at Google, such as the Back-alley tour (visit places where very few people have been before), the All-night tour (run the software continuously overnight in a lab, without shutting down) and the Scottish pub run tour (take over the system completely for testing). These tours are a high-level pattern language for both activities and diagnostics in a testing session, without going into the low-level details. They allow the team to discuss meta issues and communicate more efficiently.

Heads-up displays

A monitor in a patient's room tells visiting doctors everything they need to know about the patient's vital signs. They do not need to leave the room to find out the details. Whittaker said that video games are great examples of how heads-up displays show complex statuses on a screen in a very usable form. “They [players] are not writing SQL queries during a game”, said Whittaker, arguing that testers should not be writing SQL queries to get vital software signs either. He then showed several examples of displays that they have developed at Google to show the vital signs of their systems.

One visualisation screen shows the number of unit tests per module per developer, enabling Whittaker to see quickly who is writing unit tests and who is not. As a result of this visualisation, the testers started to reward good developers through a peer bonus system and to encourage them to get better at writing automated tests.

Another screen shows who fixes bugs and who doesn’t. They created a video-game-like screen with developers’ avatars in the top half and bug-like creatures in the bottom half, representing logged bugs. As soon as a defect is assigned to a particular developer, the related bug moves from the bottom half of the screen to the top half and starts attacking that developer's avatar. When the bug is fixed, the developer's avatar kills the bug. This visualisation allowed them to quickly spot the developers who have lots of assigned open bugs. Putting this “game” on a screen where developers can see it caused the average time a bug stays open in their system to go down significantly.
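
The two numbers this screen makes visible, open bugs per developer and how long bugs stay open, can be sketched as follows. The bug record fields (`assignee`, `opened_on`, `fixed_on`), the sample data and the choice to count still-open bugs up to "today" are assumptions made for illustration. The talk did not describe the underlying bug database.

```python
from collections import Counter
from datetime import date


def open_bugs_by_developer(bugs):
    """Count assigned, unfixed bugs per developer, most "attacked" first."""
    counts = Counter(
        bug["assignee"]
        for bug in bugs
        if bug["assignee"] is not None and bug["fixed_on"] is None
    )
    return counts.most_common()


def mean_days_open(bugs, today):
    """Average days open; unfixed bugs are counted up to `today`."""
    durations = [((bug["fixed_on"] or today) - bug["opened_on"]).days for bug in bugs]
    return sum(durations) / len(durations) if durations else 0.0


# Hypothetical bug records.
bugs = [
    {"assignee": "dev_a", "opened_on": date(2020, 9, 1), "fixed_on": None},
    {"assignee": "dev_a", "opened_on": date(2020, 9, 10), "fixed_on": None},
    {"assignee": "dev_b", "opened_on": date(2020, 9, 5), "fixed_on": date(2020, 9, 8)},
    {"assignee": None, "opened_on": date(2020, 9, 12), "fixed_on": None},
    {"assignee": "dev_b", "opened_on": date(2020, 9, 14), "fixed_on": None},
]
today = date(2020, 9, 15)

print(open_bugs_by_developer(bugs))
print(mean_days_open(bugs, today))
```

Unassigned bugs are left out of the per-developer count, which matches the description of bugs staying in the bottom half of the screen until someone is assigned.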

So is testing like healthcare?

At the end, Whittaker concluded that testing is a “continuous ongoing activity that mirrors the way physicians tour hospital wards”.

My main objection to this talk is that it relies on a metaphor for software development that can only be stretched so far. I don’t know why software development always has to be like something else and why people can’t just take it as a practice in its own right. The attitude that software is always sick and in patient care doesn’t really fit well with my experience. I’ve seen quite a few sick systems, but lots of very healthy ones as well – and they both required testing. Twenty years ago I was building software for fun, not for money, so I cannot really talk from experience about what it was like in the industry. However, my understanding of the situation then is that software development was never like manufacturing.

I do however see a lot of value in learning from other fields, including healthcare, and applying relevant lessons learned there to software. Exposing system characteristics and providing more visibility is definitely a good idea.

Read a related post on Improving test practices at Google.