Feelings Erased: Perfect from the start?

Thanks go to Kuba Miara for inspiration

This is a follow-up to the post "TDD is a good teacher - both demanding and supportive" and discusses the second argument from my recent debate.

Everything or nothing.

This argument states that you have to be "perfect from the start" or you die. Either you take everything or nothing. It's impossible for someone to take just a part of the benefits TDD has to offer.

Here's my opinion on the topic: what executable specifications (AKA unit tests) bring you never comes for free. Unit tests are additional code and, like any code, they must be added, removed and changed to serve their purpose. Moreover, judging from a static analysis I once ran on a piece of code, unit test code often has the highest coupling, so it's likely to change often. It's just that doing "full" TDD gets you the most. Below is a summary of different levels of TDD adoption (how I describe "progress towards full TDD" is of course arbitrary, so bear with me :-)), together with some ratings based solely on my experience. By the way, when I say "isolation", I mean breaking dependencies between the tested object and its collaborators, not between tests.

1. No unit tests at all

Many great projects start as toys or experiments. We do them to learn something, to play around with an idea or to quickly check the possibility of revenue (Kent Beck once talked about doing just this).

It's quite fun while the design is pretty obvious and verifying the app requires just a few manual steps (like: run the app, click add and see a correct dialog open) - we can just "code and ship it".

Once we get into a situation where the workflow gets less obvious or the logic is hard to verify (e.g. the app sends something via a socket - to check it manually, we'd have to write a test client or plug in a packet sniffer), this gets kind of ugly. Then we usually launch a debugger, where we spend a lot of time tracking issues and correcting them (and tracking issues introduced by those corrections) - first using manual scenarios, later using the black-box level tests that get created at some point.

Also, there are times when we get stuck wondering "is it better to use inheritance here, or a tricky delegation? Should this method end up in class A or class B?". Sure, there are heuristics to evaluate design decisions, but it's still kind of arbitrary and ends with lengthy discussions and explaining the rationale all over again to each and every person questioning our design.

[Ratings for this level (values not preserved): Build speed, No need for debugging, Executable Specification, Measurable design quality, Protection from regression issues, Confidence of change, Ease of writing new test, Ease of maintaining existing tests, Speed of transition from change in code to successful build, Motivation to specify all code]

2. Some poorly-isolated "unit" tests (or rather integration tests, as they're sometimes called)

This level is usually attained when some of the rebellious developers get fed up with both the time needed to set up the environment for box-level tests and the time it takes to execute them. The idea is to write some of the scenarios directly against entities in the code, bypassing mechanisms such as the web layer or the database while still exercising the domain logic. This makes sense when the domain logic is complex on its own.

While it shortens the time needed to set up and run the tests, this approach makes the build slightly longer (additional code) and requires some isolation, at least from external dependencies (system clock, database, file system, network etc.), which, given the small number of tests written this way, can be relatively cumbersome.

On the bright side, having those tests lets us reason, in a limited way, about the design of the product by asking questions like "is it easy to unplug the real database and substitute a fake for it?".
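To make the "unplug the database" question concrete, here is a minimal sketch of what substituting a fake might look like. All names in it (CustomerStore, InMemoryCustomerStore, LoyaltyService) are hypothetical and invented for illustration; the only assumption is that the domain logic talks to persistence through a narrow interface.

```python
from typing import Dict, Optional, Protocol


class CustomerStore(Protocol):
    """Hypothetical persistence interface the domain logic depends on."""

    def load_balance(self, customer_id: str) -> Optional[int]: ...

    def save_balance(self, customer_id: str, balance: int) -> None: ...


class InMemoryCustomerStore:
    """Fake used in tests in place of a real database-backed store."""

    def __init__(self) -> None:
        self._balances: Dict[str, int] = {}

    def load_balance(self, customer_id: str) -> Optional[int]:
        return self._balances.get(customer_id)

    def save_balance(self, customer_id: str, balance: int) -> None:
        self._balances[customer_id] = balance


class LoyaltyService:
    """Domain logic under test; it only ever sees the store interface."""

    def __init__(self, store: CustomerStore) -> None:
        self._store = store

    def add_points(self, customer_id: str, points: int) -> int:
        current = self._store.load_balance(customer_id) or 0
        new_balance = current + points
        self._store.save_balance(customer_id, new_balance)
        return new_balance


def test_points_accumulate_without_a_real_database():
    service = LoyaltyService(InMemoryCustomerStore())
    service.add_points("alice", 10)
    assert service.add_points("alice", 5) == 15
```

If writing the fake takes a few lines, as it does here, the answer to the design question is probably "yes". If it requires patching half of the application, the tests have just told us something about the coupling.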

[Ratings for this level (values not preserved): Build speed, No need for debugging, Executable Specification, Measurable design quality, Protection from regression issues, Confidence of change, Ease of writing new test, Ease of maintaining existing tests, Speed of transition from change in code to successful build, Motivation to specify all code]

3. Many poorly-isolated (coarse-grained) "unit" tests

This is where many projects end up. We have many tests exercising most of the scenarios in the code. It is safe to refactor and it is easier to see the intent documented by the tests. Because there are so many tests, the relative cost of writing helper fixtures and mini-frameworks to help us is small.

This, however, is the level where the point of having the tests is most fiercely discussed. Because of the low isolation, many tests go through the same paths of the code. This leads to a situation where one change in the code breaks tens of tests for reasons that are not immediately obvious. Since the tests are usually long and use many helper classes and functions, we often have to debug each test to discover why it failed. I've been in a project where updating such a test suite used to take twice as long as updating the code.

One more thing - the build slows down. Why? Because coarse-grained tests gather many dependencies. Let's say that we have 40 classes in our production code and each of our 20 tests is coupled to 30 of them - any change to any of those 30 classes makes all 20 tests recompile. This can cause a major headache, especially in slow-building languages such as C++.

[Ratings for this level (values not preserved): Build speed, No need for debugging, Executable Specification, Measurable design quality, Protection from regression issues, Confidence of change, Ease of writing new test, Ease of maintaining existing tests, Speed of transition from change in code to successful build, Motivation to specify all code]

4. Many well isolated unit tests (first real unit tests)

Someone joins our team who really knows what unit tests are about. Usually this is a person who practices TDD but has not managed to convince the whole team to use it. Anyway, the person teaches us how to use mock objects and usually introduces some kind of mocking framework. Also, we learn about dependency injection and something about FIRST or FICC properties. With isolation rising, the test suite becomes increasingly maintainable - a situation where two tests fail for the same reason is quite rare, so we have fewer tests to correct and less debugging to do (because a fine-grained test failure brings us to the reason for the failure without any need to debug).
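A well-isolated test of this kind might look roughly like the sketch below. OrderProcessor and its notifier collaborator are hypothetical names made up for illustration; Mock comes from Python's standard unittest.mock module, standing in for whatever mocking framework the team picks.

```python
from unittest.mock import Mock


class OrderProcessor:
    """Hypothetical class under test; its collaborator is injected."""

    def __init__(self, notifier):
        self._notifier = notifier

    def complete(self, order_id: str) -> None:
        self._notifier.send(f"Order {order_id} completed")


def test_completing_an_order_notifies_once():
    notifier = Mock()  # stands in for the real notifier (mail, socket, ...)
    OrderProcessor(notifier).complete("42")
    notifier.send.assert_called_once_with("Order 42 completed")
```

Because the collaborator is passed in through the constructor, the test touches exactly one production class, and a failure points straight at OrderProcessor.complete.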

Everything starts to go more smoothly - the builds speed up, the documentation of intent no longer duplicates what the box tests already provide, new tests are easy to write (since they usually test a single class - no supah long setup and teardown etc.), and we can easily reason about the design quality with some heuristics and a simple rule: of two design choices, the better one is the one that requires fewer good unit tests.

On the darker side: the team still hasn't got the whole idea of the test suite being an executable specification and is still stuck on the "testing" level. Because of this, they can dismiss some of the implementation as "not worth testing" or "too simple" and skip the unit tests for it. Also, since the tests are written after the code, the code is usually not written with testability in mind, so each new part of the code requires extra rework before good, isolated unit tests can be written for it. The last downside: the team does not benefit from all the analysis and design techniques that TDD provides.

[Ratings for this level (values not preserved): Build speed, No need for debugging, Executable Specification, Measurable design quality, Protection from regression issues, Confidence of change, Ease of writing new test, Ease of maintaining existing tests, Speed of transition from change in code to successful build, Motivation to specify all code]

5. Test Driven Development (with all the "bundled" tools)

This is the sweetest spot. The "testing" part is treated as a side effect of performing analysis and design with all the powerful techniques TDD brings. The code quality is superb (my experience is that code written using good TDD often has the best quality analysis results of all the code in the product). Code is written with testability in mind, so we don't pointlessly waste time reworking code we have already written just to enable isolation (we DO refactor sometimes, especially when we're doing triangulations, but this is not a waste - this is learning).

In TDD, a failing test/spec is the reason to write any code, so even constants and enumerations have their own specs. There is always motivation to write a test/spec, because there's no "too easy implementation" - there is always "an implementation that does not exist yet".
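As a tiny illustration of a spec for a constant, consider the sketch below. MAX_LOGIN_ATTEMPTS and its value are invented for the example. In the TDD flow the spec is written first and fails because the name does not exist yet, and only then is the constant added.

```python
import unittest

# Added only after the spec below had been seen failing (the name did not exist).
MAX_LOGIN_ATTEMPTS = 3


class LoginPolicySpecification(unittest.TestCase):
    def test_allows_three_login_attempts(self):
        self.assertEqual(MAX_LOGIN_ATTEMPTS, 3)
```

The spec looks trivial, but it documents a decision: anyone changing the limit has to change the specification as well.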

People coming to this level (especially in languages with extensive IDE support, like Java or C#) quickly become super-productive killers who quickly notice gaps in requirements and write beautifully designed code, at the same time producing a test suite for regression, an executable documentation of responsibilities and design decisions, and a set of examples of how to use each class.

[Ratings for this level (values not preserved): Build speed, No need for debugging, Executable Specification, Measurable design quality, Protection from regression issues, Confidence of change, Ease of writing new test, Ease of maintaining existing tests, Speed of transition from change in code to successful build, Motivation to specify all code]

Wrapping it up.

TDD is the most effective of all the processes described above. It does not require perfection from the start; however, if you choose to forgo some of its practices and techniques, you obviously have to pay the price. What I try to teach people when I talk about TDD is that it is unnecessary to pay this price, and that by paying it we introduce waste. Which is what we'd like to avoid, right?

Good night everyone!


Sunday, 20 May 2012
