My challenge to come up with ways to visualise quality has, thankfully, caused quite a lively discussion on several mailing lists and forums. Here is a bit of a summary of where the discussion is going.
First of all, thanks to everyone who replied with ideas, even if they were in the category of “it’s impossible” or “it’s intrinsic”. I received quite a few good ideas and I think we can learn a lot from them. But let’s deal first with the two nihilistic categories. I don’t believe that visualising quality is impossible - it is just difficult. If it were easy we’d all be doing it already, but it is hardly in the same category as flying through space faster than the speed of light. As for the other category, with comments like “it’s between the user and the system, we need to put something in front of the user”, I disagree. Quality isn’t only between a user and a system; it is between those two and the people who are charged with delivering the software. I also disagree that the only way to see quality is to put something in front of users. That is like saying let’s build a bridge and see if it falls down when people try to cross it. If we’re going to build a successful bridge, we’d better understand who is going to use it and how, what they expect of it, and align their expectations with what is realistic to deliver. I’m sure we can engage the clients and users of our systems in describing what they want, but it is up to us to engage them in the right way.
David and I recently started asking key stakeholders for the three most important attributes of their system when we do process reviews. At one company, out of fifteen or so people we got about thirty different answers. These people know what they want, but their expectations aren’t even aligned amongst themselves, let alone with the development team. No software team in the world can deliver something that satisfies such a diverse set of “key” expectations, so no wonder they had tons of quality issues. The project has had quite a lot of analysis over the years, so this knowledge exists, but it is hidden away in various specification documents and issue tracking systems. This is where I believe visualisation can help. A good visualisation technique should show business stakeholders this misalignment effectively and support them in sorting it out. Remember, a picture is worth more than a thousand word documents.
Many people rightfully suggested that visualising the beast isn’t as important as deciding what the beast actually is. This is similar to the classic “planning is more important than the plan” argument, and I partially agree. Engaging people to define what they mean by quality and what they expect is probably more important than visualising it, but visualisation is important for guiding that process. A good visualisation technique implies that we visualise the right aspects of our systems, but it should also spark a discussion. That is why UML diagrams were successful - they spark a nice discussion around a whiteboard. Lisa Crispin had similar experiences with theme mind-mapping techniques. She says: “Drawing on the whiteboard prompts us to discuss issues related to quality… So that is facilitating our thoughts about quality and helping us move towards our goal of delivering the best quality product.”
In The Design of Design, Fred Brooks argues that the success of Waterfall is largely due to the fact that it is visual: people could understand it, communicate it, and cling on to the image. If this can work for a model that is completely flawed, I’m sure it can also work for something far more important. Making quality visual will allow us to understand and communicate it, and if we get it right, people will cling on to that image while they develop their systems. Anyway, on to the ideas.
Visualising process effectiveness
Several suggestions were around visualising development progress, such as task boards etc. I think this area is explored in enough detail in many good books, so I won’t focus too much on it here. But the responses were interesting as they showed that many people clearly think that the quality of the development process is tightly coupled with the quality of the resulting product.
Visualising fragility
Erik Petersen suggested visualising a lack of quality with bug clusters and the most checked-out modules (see his talk on building software smarter). To many people, instability of the source code suggests that a piece of code is of low quality. This might be an interesting area to explore, but I think we need more proof. If a piece of code is changed often, that might also mean that the business is investing heavily in that area, or that they are experimenting, which doesn’t necessarily mean low quality.
Brandon Carlson used a heat map generated from version control history merged with defect information, showing the files that were associated with the most defects. He worked on this with Tim Ottinger, who wrote a blog post about it. A potential enhancement to the technique would be to associate source code files with system components, so that the heat map can present something more meaningful to business users.
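To make the idea more concrete, here is a rough sketch of how such a heat map could be derived, assuming defect IDs (something like BUG-123) appear in commit messages. The repository layout and the ID format are my assumptions, not part of Carlson and Ottinger’s actual implementation.

```python
# A rough sketch: count defect-linked commits per file by scanning
# "git log" output for defect IDs (assumed here to look like BUG-123).
import re
import subprocess
from collections import Counter

DEFECT_ID = re.compile(r"\bBUG-\d+\b")  # hypothetical defect-ID format in commit messages

def defect_heat(repo_path="."):
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--name-only", "--pretty=format:__commit__ %s"],
        capture_output=True, text=True, check=True).stdout
    heat = Counter()
    in_defect_commit = False
    for line in log.splitlines():
        if line.startswith("__commit__"):
            in_defect_commit = bool(DEFECT_ID.search(line))
        elif line.strip() and in_defect_commit:
            heat[line.strip()] += 1   # file touched by a defect-linked commit
    return heat

if __name__ == "__main__":
    for path, count in defect_heat().most_common(10):
        print(f"{count:4d}  {path}")
```

Mapping the resulting file paths onto named system components would be the extra step needed to make the picture meaningful to business users.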
Visualising test results/coverage
Markus Gärtner suggested using a “lightweight testing dashboard” technique by James Bach. This shows component coverage with smiley faces to illustrate confidence in various parts of the system. For more information, see Bach’s presentation, Marlena Compton’s experience report and Del Dewar’s blog post.
Adam Geras used a dashboard based on the one-page project manager format, and wrote a blog post about it.
Last year at Agile Cambridge I picked up a technique from James Whittaker of Google, who talked about the attribute-component-capability matrix (watch this video, especially after the 27th minute, and the slides from Whittaker’s talk on the topic). This matrix charts out what is important from a business perspective (attributes), which system capabilities support the individual attributes, and which components are involved in providing which capabilities. This gives the Google Chrome OS team a quick overview of where they need to focus their manual testing effort. I imagine that, by combining annotation ideas with automated test result processing, this could become a live dashboard of system health.
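As an illustration, here is a toy sketch of how such a matrix could be represented and queried. The attributes, capabilities and components below are invented examples, not the real Chrome OS entries from Whittaker’s talk.

```python
# A toy attribute-component-capability (ACC) matrix:
# capability -> (attributes it supports, components that provide it).
from collections import defaultdict

ACC = {
    "sync bookmarks quickly": ({"Fast", "Secure"}, {"Sync service", "Storage"}),
    "restore previous session": ({"Stable"}, {"Session manager", "Storage"}),
    "block malicious downloads": ({"Secure"}, {"Download manager"}),
}

def components_for_attribute(attribute):
    """Which components need testing attention to protect a business attribute?"""
    hits = defaultdict(list)
    for capability, (attributes, components) in ACC.items():
        if attribute in attributes:
            for component in components:
                hits[component].append(capability)
    return dict(hits)

# Which components the "Secure" attribute depends on, and through which capabilities.
print(components_for_attribute("Secure"))
```

Feeding automated test results into a structure like this, keyed by component and capability, is what could turn the static matrix into a live dashboard.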
Anders Dinsen wrote about a colleague who used a simple traffic light visualisation based on test execution results for a large scale integration project, showing whether the components actually work together. He wrote: “You will probably be surprised that developers on the individual systems didn’t ensure these tests would work, but my colleague decided to run the tests in her own team (quality control!) and visualize the results. It caused quite a bit of political stir in the project and they even had to define degrees of not-working ;-) But it did the trick.” Dinsen also suggests visualising the results of usability testing, with the warning that “you need to know what you’re looking for”. As examples, he suggested visualising how long it takes users to find a certain function, the percentage of times users pick the wrong function, and so on.
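A minimal sketch of that kind of traffic light roll-up might look like the code below. The thresholds and the example integration points are my assumptions; the real team, as Dinsen notes, had to define their own degrees of not-working.

```python
# A minimal traffic-light roll-up of integration test results between
# component pairs. Thresholds and example data are assumptions.

def traffic_light(passed, total):
    if total == 0:
        return "GREY"            # not tested at all
    ratio = passed / total
    if ratio == 1.0:
        return "GREEN"
    if ratio >= 0.8:             # arbitrary cut-off for "mostly working"
        return "AMBER"
    return "RED"

# integration points -> (tests passed, tests run); hypothetical numbers
results = {
    ("Billing", "CRM"): (42, 42),
    ("CRM", "Warehouse"): (17, 20),
    ("Warehouse", "Shipping"): (3, 15),
}

for (a, b), (passed, total) in results.items():
    print(f"{a} <-> {b}: {traffic_light(passed, total)} ({passed}/{total})")
```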
Clare McLennan uses coverage heat-maps. She explains: “This is a 2d map of the system areas where the size of the circles represent complexity of each area (if you like the number of tests we should have), and the colour of the circles the coverage we feel we have achieved (green==good to red == poor). Alternatively you can indicate the coverage by what portion of the circle is filled”.
Adrian Gan used a similar approach, pointing to Jeff Patton’s article about visualising quality. He says that “it was a simple and light weight manner for the team to assess the quality of testing”, also warning that “the challenge starts when trying to determine what is deemed to be enough and the trade offs the team is willing to make based on the stakeholders decisions as these discussions can be very subjective”.
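For illustration, here is a minimal matplotlib sketch of a circle map in McLennan’s style (which Gan’s approach resembles): circle size for complexity, colour for coverage confidence. The system areas, positions, complexity figures and coverage values are all invented.

```python
# Circle size ~ complexity of an area, colour ~ coverage confidence
# (green = good, red = poor). All numbers below are invented.
import matplotlib.pyplot as plt

areas = [
    # (name, x, y, complexity, coverage 0.0-1.0)
    ("Pricing",   1, 2, 120, 0.9),
    ("Booking",   2, 1, 300, 0.4),
    ("Reporting", 3, 2,  80, 0.7),
    ("Auditing",  2, 3,  60, 0.2),
]

xs = [a[1] for a in areas]
ys = [a[2] for a in areas]
sizes = [a[3] * 5 for a in areas]          # scale complexity to marker area
coverage = [a[4] for a in areas]

plt.scatter(xs, ys, s=sizes, c=coverage, cmap="RdYlGn", vmin=0, vmax=1, alpha=0.8)
for name, x, y, *_ in areas:
    plt.annotate(name, (x, y), ha="center", va="center")
plt.colorbar(label="coverage confidence")
plt.title("Coverage heat-map: size = complexity, colour = confidence")
plt.axis("off")
plt.show()
```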
McLennan also suggests that a more useful thing to visualise would be risk, not just the number of tests. I’m working with a client at the moment on a similar technique, aiming to show coverage, test status and risk at the same time. We are discussing annotating (FitNesse) tests with tags that relate to combinations of features (e.g. asset class and type of booking), then pulling results from FitNesse test runs to show what is actually tested and where tests are failing, and overlaying that on top of a simple calculation of risk based on the same attributes.
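Since that work is still in progress, here is only a rough sketch of the general idea: group test results by feature tags and overlay a crude risk weight per combination. The tags, weights and result data are hypothetical; in practice the results would be pulled from FitNesse test runs.

```python
# Sketch of the tag-overlay idea: test results grouped by feature tags
# (e.g. asset class + booking type) combined with a crude risk weight.
# All tags, weights and results below are hypothetical.

# risk weight per (asset class, booking type); higher = riskier
risk = {
    ("equity", "spot"): 1,
    ("equity", "forward"): 3,
    ("fx", "spot"): 2,
    ("fx", "forward"): 5,
}

# tagged test outcomes, as they might be extracted from a test run
test_results = [
    {"tags": ("equity", "spot"), "passed": True},
    {"tags": ("fx", "forward"), "passed": False},
    {"tags": ("fx", "forward"), "passed": True},
]

def exposure():
    """Rank combinations by risk, showing coverage and failures for each."""
    rows = []
    for combo, weight in risk.items():
        runs = [r for r in test_results if r["tags"] == combo]
        failed = sum(not r["passed"] for r in runs)
        rows.append((weight, combo, len(runs), failed))
    return sorted(rows, reverse=True)   # riskiest combinations first

for weight, combo, runs, failed in exposure():
    status = "UNTESTED" if runs == 0 else ("FAILING" if failed else "ok")
    print(f"risk {weight}  {'/'.join(combo):18} tests: {runs}  {status}")
```

The point of the overlay is that an untested or failing high-risk combination jumps to the top of the list, rather than being buried in raw test counts.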
Visualising test trends
Dinsen also suggested using bug trend graphs. He says: “They are useful showing some kind of progress to the developers who sometimes get preoccupied with focusing on the technical aspects of the system. Use with caution!”. At Agile Cambridge, Whittaker showed a screen of developers’ avatars and images of bugs attacking them, to display which developers have lots of bugs assigned to them.
Lisa Crispin’s team had a “build calendar” that showed each day as red, green or yellow depending on how the continuous builds were going. She says “the business people really paid attention to the calendar. If they saw two red days in a row, they came over to ask what was wrong”.
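A build calendar like that needs very little data behind it. The sketch below colours each day from its continuous-build results; the colouring rule and the dates are my assumptions, not necessarily what Crispin’s team actually did.

```python
# A minimal build calendar: one colour per day, derived from that day's
# continuous-build results. The rule (any red build -> red day, any
# yellow -> yellow day) is an assumption.
import datetime

# day -> list of build outcomes for that day; hypothetical data
builds = {
    datetime.date(2011, 5, 2): ["green", "green"],
    datetime.date(2011, 5, 3): ["green", "red", "green"],
    datetime.date(2011, 5, 4): ["yellow"],
}

def day_colour(outcomes):
    if not outcomes:
        return "grey"
    if "red" in outcomes:
        return "red"
    if "yellow" in outcomes:
        return "yellow"
    return "green"

for day in sorted(builds):
    print(day.isoformat(), day_colour(builds[day]).upper())
```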
Visualising technical debt
Lisa Crispin suggested visualising technical debt, another way of describing a lack of quality. She points to the work of Israel Gat & Jim Highsmith on this. Her team graph their legacy code vs. new code (“the new code all developed TDD, and has very few bugs in production”). Crispin said “For us, this reflects progress towards higher quality and less tech debt”.
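If the legacy and TDD-developed code live in separate source trees, a very rough version of the numbers behind such a graph could be produced like this. The paths, the file extension and the line-counting approach are my assumptions, not Crispin’s team’s actual method.

```python
# Count lines of code under a "legacy" tree and a "new" (TDD) tree.
# The directory layout and file extension are assumptions.
from pathlib import Path

def loc(root, suffix=".java"):
    return sum(len(p.read_text(errors="ignore").splitlines())
               for p in Path(root).rglob(f"*{suffix}"))

legacy = loc("src/legacy")   # hypothetical paths
fresh = loc("src/new")
total = legacy + fresh or 1
print(f"legacy: {legacy} lines ({100 * legacy // total}%), new: {fresh} lines")
```

Plotting those two numbers over time is what turns a dry metric into a visible trend towards less technical debt.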
Visualising production performance
Anders Dinsen also suggested visualising production performance: “On live systems, you can visualize response times, down-times etc. Live systems are tested 24x7 by real users and produce a lot of useful knowledge!”.
Erich J. Zimmerman worked on a web application that managed assets with complex relationships across hierarchies of markets, with data integrity as a very important aspect of quality. They created a heat map that visualised data integrity in production, with “asset types down and markets across”. Zimmerman says: “Each spot on the map indicated a simple red/yellow/green on whether there were issues in that market for that asset type. This allowed users to get a quick snapshot of the overall health of the system, and also to correlate issues across a market or for an asset type. Each spot then linked to more detail on that market/asset combination: which tests were failing, for which assets… You could see what was going on, then dig down if you had to.” One of my clients did something similar recently. Their back-office users often complained about problems with unreliable article publishing queues, and would publish the same article several times because they couldn’t tell whether an article wasn’t appearing because it was stuck in the queue or because there was a genuine problem. Once we understood that this aspect of the system was important, the team bought a monitor to show the queue status and the queue processing time for the most recently published article. This helped business users understand when they could expect to see something published, and also alerted developers to potential problems in that part of the system before they became too big.
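As a rough sketch of the general shape of such a queue monitor: the fetch_queue_status function below is a hypothetical stand-in for querying the real publishing queue, and the alert thresholds are arbitrary.

```python
# Poll the publishing queue and show its depth plus how long the most
# recently published article took to go through. fetch_queue_status is
# a hypothetical stand-in for querying the real queue or its database.
import time

def fetch_queue_status():
    """Stand-in for a real call to the publishing queue."""
    return {
        "depth": 4,                       # articles still waiting
        "last_published_seconds": 95,     # processing time of the latest article
    }

WARN_DEPTH = 20           # arbitrary thresholds for raising an alert
WARN_SECONDS = 300

while True:
    status = fetch_queue_status()
    alarm = (status["depth"] > WARN_DEPTH
             or status["last_published_seconds"] > WARN_SECONDS)
    print(f"queue depth: {status['depth']:3d}  "
          f"last article published in {status['last_published_seconds']}s  "
          f"{'!! CHECK QUEUE !!' if alarm else 'ok'}")
    time.sleep(30)        # refresh every 30 seconds on the wall monitor
```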
Mapping out scope of work
Lisa Crispin’s team uses mind-maps extensively for mapping out themes and facilitating a discussion around what to test and what to focus on when delivering. I’ve found impact maps very good for visualising scope and getting everyone to agree on the important aspects of an upcoming block of work.
Jesper Lindholt Ottosen sent a few interesting links on mind maps for testing: Shiva Mathivanan’s post on test reporting using mind maps and a discussion on mind-maps for testers on the Software Testing Club forum, which contains lots of interesting ideas and many more related links.