When TDD goes bad

Last week, at the London .NET User Group meeting, Ian Cooper talked about Test-driven development, focusing on both good and bad practices. I’m a big fan of learning from anti-patterns and other people’s mistakes, so the second part of his session was very interesting to me. Here is a short list of things that Ian identified as symptoms that TDD has gone bad in a project, along with my comments:

  • Disabling tests to make a build pass:

    If the build is failing because of a test, developers disable the test to make the build pass. This is a bad practice because tests become irrelevant or get lost — people don’t remember to fix and re-enable them later. If the test is deprecated, then delete it completely. If it is not deprecated, don’t disable it but make it pass.

  • Continuous integration not failing when unit tests fail:

    The point of continuous integration is to automate problem checking and prevent a big-bang integration before a release. Broken unit tests should raise an alarm and get fixed before problems pile up. If the CI server does not break the build when a unit test fails, then the CI configuration must be changed.

  • Not monitoring customer tests:

    Ian put integration and acceptance tests under the “customer tests” group. These tests don’t break the build because they will not pass for most of the development, but the pass rate might still drop from 30% to 20%, for example. That is a sure sign that something bad happened, yet if nobody is monitoring the reports, this will again lead to a big-bang integration at the end. In my eyes, integration tests and customer tests should be split into two parts: the first one (integration) should break the build, and the second one (acceptance) should not, but it should still be monitored.
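Monitoring this doesn’t need much machinery. A minimal sketch (the function names are mine, not from the talk): compute the pass rate of the customer-test run and flag any drop against the previous run, even though the build itself stays green.

```python
def pass_rate(outcomes):
    """Percentage of passing customer tests; outcomes is a list of booleans."""
    if not outcomes:
        return 0.0
    return 100.0 * sum(1 for ok in outcomes if ok) / len(outcomes)

def has_regressed(previous_rate, current_rate, tolerance=0.0):
    """Flag a drop in the pass rate (e.g. 30% down to 20%),
    which a green build would otherwise hide."""
    return current_rate < previous_rate - tolerance
```

Wiring `has_regressed` to a notification (email, dashboard, chat) is what turns the acceptance-test report into an alarm instead of a file nobody reads.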

  • UI change causes tests to fail (interface sensitivity):

    If tests depend on the UI heavily, then they will be brittle and hard to maintain. I wrote about this earlier in Effective User Interface Testing.
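One common way out is to pull the decision-making into a plain object that the UI merely calls, so tests never touch the screen. A minimal sketch of that presenter/humble-view split, with hypothetical names and in Python for brevity:

```python
class LoginPresenter:
    """All the decision-making lives here, testable without any UI toolkit."""

    def __init__(self, view):
        # view is anything with show_error(msg) and go_to_home()
        self.view = view

    def login_clicked(self, username, password):
        if not username or not password:
            self.view.show_error("username and password are required")
        else:
            self.view.go_to_home()

class FakeView:
    """Stand-in for the real screen, recording what the presenter asked for."""

    def __init__(self):
        self.errors = []
        self.navigated = False

    def show_error(self, msg):
        self.errors.append(msg)

    def go_to_home(self):
        self.navigated = True
```

The real screen implements the same two methods and stays so thin that it rarely breaks; UI redesigns then touch the view, not the tests.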

  • (3rd party) API changes cause tests to fail:

    If changes to 3rd party APIs propagate to tests, then tests are again hard and expensive to maintain. I guess that the bigger underlying problem here is that the business logic is not isolated properly from 3rd party libraries.
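The usual isolation technique is to define your own interface for what the business logic needs and keep the vendor SDK behind a single adapter. A minimal sketch, assuming a hypothetical payment vendor (all names are mine):

```python
class PaymentGateway:
    """Our own interface; the only place that talks to the vendor SDK."""

    def charge(self, amount_cents):
        raise NotImplementedError

def checkout(gateway, items, price_cents):
    """Business logic depends on our interface, not the vendor's API.

    If the vendor changes its API, only the PaymentGateway
    implementation changes; this function and its tests do not."""
    total = sum(price_cents[item] for item in items)
    gateway.charge(total)
    return total

class RecordingGateway(PaymentGateway):
    """Test double that records charges instead of calling the vendor."""

    def __init__(self):
        self.charged = []

    def charge(self, amount_cents):
        self.charged.append(amount_cents)
```

Only the adapter’s own (small) test suite needs to change when the 3rd party ships a breaking release.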

  • Many tests fail when a single behaviour changes:

    This applies to unit tests, and signals that tests are not properly granulated and focused on code units, but try to test too much. Again, the issue arising from this is high cost of test maintenance.

  • Data-sensitive tests:

    Tests that depend on some data pre-conditions (such as certain records existing in the database) are also brittle and will break when the data changes. The test harness should ideally set up all the pre-conditions for a test. A telling sign of this is tests that use hard-coded database IDs.
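A minimal sketch of the harness setting up its own pre-conditions, using an in-memory SQLite database (the schema and helper names are illustrative): each test creates the record it needs and uses the generated ID, so nothing depends on a row like `customer 42` already existing in a shared database.

```python
import sqlite3

def fresh_db():
    """Each test builds its own database, so no pre-existing rows are assumed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    return conn

def create_customer(conn, name):
    """Insert the record the test needs and return the generated id,
    instead of hard-coding an id that happens to exist today."""
    cur = conn.execute("INSERT INTO customers (name) VALUES (?)", (name,))
    return cur.lastrowid
```

The same idea scales up to fixture builders or object mothers; the point is that the test, not the environment, owns its data.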

  • (Shared) Context Sensitivity:

    If tests depend on other tests to set up the context, then the order of test execution becomes important and you can no longer run individual tests in isolation. This can lead to big problems, especially if the test runner does not guarantee the order of tests. Again, the test harness should ideally set up all the pre-conditions for a test and individual tests should be independent.
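A minimal sketch of that independence, using Python’s unittest (the class and test names are mine): the per-test `setUp` hook rebuilds the context from scratch, so the tests pass in any order and can be run one at a time.

```python
import unittest

class BasketTest(unittest.TestCase):
    def setUp(self):
        # Runs before EVERY test: each test gets a fresh basket,
        # so no test depends on another test having run first.
        self.basket = []

    def test_starts_empty(self):
        self.assertEqual(self.basket, [])

    def test_add_item(self):
        self.basket.append("book")
        self.assertEqual(self.basket, ["book"])
```

If `test_add_item` ran first and the basket were shared, `test_starts_empty` would fail; with per-test setup, execution order simply does not matter.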

  • Conditional test logic:

    The telling sign of this problem is that tests choose validations based on run-time context (if (…) test this… else test that…). This signals that the test is not clearly focused on a single thing, and that the author does not really understand what they are testing.
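The cure is to split the branching test into one focused test per expectation. A minimal sketch with a made-up shipping rule (names and numbers are illustrative):

```python
def shipping_fee(order_total):
    """Hypothetical rule: orders of 50 or more ship free, others pay 5."""
    return 0 if order_total >= 50 else 5

# Anti-pattern: one test that branches on run-time context:
#     if order_total >= 50: assert fee == 0
#     else:                 assert fee == 5
# Better: two tests, each pinning down exactly one expectation.

def test_large_orders_ship_free():
    assert shipping_fee(50) == 0

def test_small_orders_pay_flat_fee():
    assert shipping_fee(49) == 5
```

Each test now has a single reason to fail, and its name alone tells you which behaviour broke.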

A quick summary

Tests that break without anyone reacting do not prevent problems from piling up. This defeats the point of tests being a traffic light that keeps the problems small and prevents a big-bang integration at the end. When tests fail, alarm bells should ring.

A high maintenance cost of tests defeats the whole point of having them, as a way to guarantee that code changes are easy and cost-effective. Keep unit tests independent and focused on a single code unit. Then they will be easy to maintain and will support change rather than inhibit it.

Image credits: Lorenzo González

I'm Gojko Adzic, author of Impact Mapping and Specification by Example. My latest book is Fifty Quick Ideas to Improve Your Tests.

8 thoughts on “When TDD goes bad”

  1. Nice collection of testing caveats. However, it should be noted that none of them relate to TDD specifically.

    If you ask me (and you didn’t), the #1 way in which programmer tests go bad is when the programmers don’t use TDD.


  2. Use of a policy based configuration management system can go a long way to preventing symptoms 1 & 2 at the very least. An example of this is Aegis (http://aegis.sourceforge.net/).
    In the case of symptom 1, it has policies which can prevent a change-set from being integrated which does not pass the regression suite. It can go further and insist on a change-set being accompanied by a new test.
    In the case of symptom 2, committing a change-set to the repository has an additional integration step which performs a “clean-room” build and another pass of the tests. It replaces “continuous integration” with “at-commit” integration. Instead of finding out that the compile was broken overnight, the breakage is detected BEFORE changing/damaging the repository.

  3. Very good Post. You have identified the problems in the parts. But following a lean approach, I think the reason why TDD sometimes fails is exactly because it focuses on the parts, and not in the big picture.
    I think TDD has to evolve to embrace some lean management techniques to be (more) successful.


  4. Hi Scott,

    1 is a human problem, not a code problem. No integration system is going to prevent a developer from commenting out the offending verification line, which is what happens most often.

  5. Very good points – all of them. Even if none of them are related to TDD.

    Maybe you could add one last point:

    == Developers think that TDD is only about Testing ==
    There is much more to TDD than testing. First of all, in TDD you write the test *before* you write the code. If you’re not doing that, you’re not doing TDD. Second, the rhythm between writing a test and production code should be under a minute (never more than a couple of minutes). And third, you should never write any production code until you have a failing test.

    Most developers failing with TDD are ignorant of these guidelines.

  6. Diego’s comment (while apt) is exactly what I’m talking about. Too many people think TDD == unit tests.

    It would be great if folks wouldn’t confuse the two.

    Just in case it’s not clear, TDD refers to the practice of using tests to drive development. You don’t write a line of code without a failing unit test. This means you write your tests before you write the code. This has several benefits, including 1) being much more interesting than writing tests afterwards, which I find mind-numbing, 2) forcing you to define your interface before you start coding, 3) reducing feature-itis, where you might add stuff “just in case,” and, oh yes, 4) ensuring greater test coverage.

    Clearly, gojko’s post is about testing in general, not TDD.


  7. Nice list and I think it’s really a study in theory vs. responsibility. This may sound picky, but I think it’s important – this list applies to Unit Testing, not TDD.

  8. Hi gojko,

    Agreed, it is a social problem and while technology will never cure a social problem completely, the right tools can help by making it more difficult to cause the problem than to fix the original issue.
    I give the example of Aegis because it provides/imposes a development process designed to alleviate just this situation. In the case of a single developer disabling a test, this would be stopped at the next stage of the process which is a review. Aegis won’t allow the change-set to progress through to a repository commit until the review is passed.
    This means that at least two or more people (developer and reviewer) have to agree to knowingly sabotage the project by omitting the test.
    If a development team has reached a point where testing has become this expendable, then it has a far greater problem than a single bug being introduced.

