Monday 4 March 2013

The Two Main Techniques in Test-Driven Development, part 2

Today, I'd like to continue where I left off last time. As you might remember, the first post was about Need-Driven Development. This one's going to be about a technique that's actually older, called Triangulation.

Triangulation

History

The first occurrence of the term triangulation I know about is in Kent Beck's book Test Driven Development: By Example, where Kent describes it as the most conservative technique of test-driving the implementation. It's one of the three core techniques of classic TDD.

Description

The three approaches to test-driving the implementation and design described in Kent's book are:

  1. Write the obvious implementation
  2. Fake it ('til you make it)
  3. Triangulate

Kent describes triangulation as the most conservative technique, because following it involves the tiniest possible steps to arrive at the right solution. The technique is called triangulation by analogy to radar triangulation, where outputs from at least two radars must be used to determine the position of a unit. Also, in radar triangulation, the position is measured indirectly, by combining the following data: the range (not position!) measurement done by the radar and the radar's own position (which we know, because we put the radar there).

These two characteristics, indirect measurement and using at least two sources of information, are at the core of TDD triangulation. Basically, it says:

  1. Indirect measurement: derive the design from a few known examples of its desired external behavior by looking at what varies across these examples and turning that variability into something more general.
  2. Using at least two sources of information: start with the simplest possible implementation and make it more general only when you have two or more examples.

Triangulation is so characteristic of classic TDD that many novices mistakenly believe TDD is all about triangulation.

Example

Suppose we want to write a method that computes the aggregate sum of the elements of a list. Let's assume that we have no idea how to design the internals of our custom list class so that it fulfills its responsibility. Thus, we start with the simplest example:

[Test]
public void 
ShouldReturnTheSameElementAsASumOfSingleElement()
{
  //GIVEN
  var singleElement = Any.Integer();
  var listWithAggregateOperations 
    = new ListWithAggregateOperations(singleElement);

  //WHEN
  var result = listWithAggregateOperations.SumOfElements();

  //THEN
  Assert.AreEqual(singleElement, result);
}

The naive implementation can be as follows:

public class ListWithAggregateOperations
{
  int _element;

  public ListWithAggregateOperations(int element)
  {
    _element = element;
  }

  public int SumOfElements()
  {
    return _element;
  }
}

It's too early to generalize, as we don't have another example yet. Let's add one then. What would the next, slightly more complex example be? Two elements instead of one. As we can see, we're beginning to discover what's variable in our design. The second spec looks like this:

[Test]
public void 
ShouldReturnSumOfTwoElementsAsASumWhenTwoElementsAreSupplied()
{
  //GIVEN
  var firstElement = Any.Integer();
  var secondElement = Any.Integer();
  var listWithAggregateOperations 
    = new ListWithAggregateOperations(firstElement, secondElement);

  //WHEN
  var result = listWithAggregateOperations.SumOfElements();

  //THEN
  Assert.AreEqual(firstElement + secondElement, result);
}

And the naive implementation will look like this:

public class ListWithAggregateOperations
{
  int _element1 = 0;
  int _element2 = 0;

  public ListWithAggregateOperations(int element)
  {
    _element1 = element;
  }

  //added
  public ListWithAggregateOperations(int element1, int element2)
  {
    _element1 = element1;
    _element2 = element2;
  }

  public int SumOfElements()
  {
    return _element1 + _element2; //changed
  }
}

Now the variability in the design becomes obvious: it's the number of elements added! With two examples in hand, we can see that we have redundant constructors and a redundant field for each element in the list. If we added a third spec for three elements, we'd have to add another constructor, another field and another term in the sum computation. Time to generalize!
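To make that redundancy concrete, here is a sketch of what the class would look like if we kept going without generalizing. The three-argument constructor and the _element3 field are hypothetical; they are not a step we actually take in this walkthrough.

```csharp
// Hypothetical continuation: what a third spec would force on us
// if we refused to generalize. Not part of the actual walkthrough.
public class ListWithAggregateOperations
{
  int _element1 = 0;
  int _element2 = 0;
  int _element3 = 0; // yet another field

  public ListWithAggregateOperations(int element)
  {
    _element1 = element;
  }

  public ListWithAggregateOperations(int element1, int element2)
  {
    _element1 = element1;
    _element2 = element2;
  }

  // yet another constructor
  public ListWithAggregateOperations(
    int element1, int element2, int element3)
  {
    _element1 = element1;
    _element2 = element2;
    _element3 = element3;
  }

  public int SumOfElements()
  {
    // yet another term in the sum
    return _element1 + _element2 + _element3;
  }
}
```

Every additional element count would repeat the same three changes, which is exactly the pattern the two existing examples already point at.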

How do we encapsulate the variability of the element count so that we can get rid of this redundancy? A collection! How do we generalize the addition of multiple elements? A foreach loop through the collection!

First, let's start with writing a new, more general unit spec to showcase the new desired design (the existing two specs will remain in the code for now):

[Test]
public void 
ShouldReturnSumOfAllItsElementsWhenAskedForAggregateSum()
{
  //GIVEN
  var firstElement = Any.Integer();
  var secondElement = Any.Integer();
  var thirdElement = Any.Integer();
  var listWithAggregateOperations 
    = new ListWithAggregateOperations(new List<int>
      { firstElement, secondElement, thirdElement});

  //WHEN
  var result = listWithAggregateOperations.SumOfElements();

  //THEN
  Assert.AreEqual(firstElement + secondElement + thirdElement, result);
}

Note that we have introduced a constructor taking a list of an arbitrary number of elements instead of just the values. Time to accommodate it in the design and bring the generalization we have just introduced in our spec into the implementation:

public class ListWithAggregateOperations
{
  List<int> _elements = new List<int>();

  public ListWithAggregateOperations(int element)
    : this(new List<int>() { element }) //changed
  {
  }

  public ListWithAggregateOperations(int element1, int element2)
    : this(new List<int>() { element1, element2 }) //changed
  {
  }

  //added
  public ListWithAggregateOperations(List<int> elements)
  {
    _elements = elements;
  }

  public int SumOfElements()
  {
    //changed
    int sum = 0;
    foreach(var element in _elements)
    {
      sum += element;
    }
    return sum;
  }
}

As we can see, the design is more general, and I turned the two existing constructors into simple delegations to the new, more general one. Also, the first two specs ("one element" and "two elements") still pass with the new, more general implementation under the hood, meaning that we didn't break any existing behavior. Thus, it's now safe to remove those two specs, leaving only the most general one. We can also remove the redundant constructors, leaving the implementation like this:

public class ListWithAggregateOperations
{
  List<int> _elements = new List<int>();

  public ListWithAggregateOperations(List<int> elements)
  {
    _elements = elements;
  }

  public int SumOfElements()
  {
    int sum = 0;
    foreach(var element in _elements)
    {
      sum += element;
    }
    return sum;
  }
}

And voilà! We have arrived at the final, generic solution. Note that the steps we took were tiny, so you might get the impression that the effort was not worth it. Indeed, this example was only meant to show the mechanics of triangulation. In real life, if we encountered such a simple situation, we'd know right away what the design would be, so we'd start with the general spec and just type in the obvious implementation. Triangulation shows its power in more complex problems with multiple design axes, where taking tiny steps helps avoid "analysis paralysis".
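If you'd like to run the final version without a test framework, the following self-contained sketch may help. Note that Any.Integer() is used in the specs above but never defined in this post, so the Any class below is only a hypothetical stand-in that returns a random integer from a small range (the range is my assumption). The plain check in Main replaces the NUnit assertion.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Any helper used in the specs.
// The post does not show its implementation; this is an assumption.
public static class Any
{
  static readonly Random Generator = new Random();

  public static int Integer()
  {
    // Bounded range (an assumption) so that the sum of three
    // values cannot overflow an int.
    return Generator.Next(-1000, 1001);
  }
}

// The final, general implementation from the walkthrough,
// repeated here so that the block compiles on its own.
public class ListWithAggregateOperations
{
  List<int> _elements = new List<int>();

  public ListWithAggregateOperations(List<int> elements)
  {
    _elements = elements;
  }

  public int SumOfElements()
  {
    int sum = 0;
    foreach (var element in _elements)
    {
      sum += element;
    }
    return sum;
  }
}

public static class Program
{
  public static void Main()
  {
    //GIVEN
    var firstElement = Any.Integer();
    var secondElement = Any.Integer();
    var thirdElement = Any.Integer();
    var listWithAggregateOperations
      = new ListWithAggregateOperations(new List<int>
        { firstElement, secondElement, thirdElement });

    //WHEN
    var result = listWithAggregateOperations.SumOfElements();

    //THEN (a plain check standing in for Assert.AreEqual)
    if (result != firstElement + secondElement + thirdElement)
    {
      throw new InvalidOperationException("Sum spec failed");
    }
    Console.WriteLine("Sum spec passed");
  }
}
```

In a real project you would keep the NUnit spec shown earlier; this console form is only a convenience for trying the code out.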

Principles

These are the main principles of Triangulation:

  1. Start writing specs from the simplest and most obvious case, increasing the complexity of the specs and the generality of the implementation as you add more specs.
  2. Generalize only when you have at least two examples that show you which axis of design change needs to be generalized.
  3. After arriving at the correct design, remove the redundant specs (remember, we want each spec to fail for a single reason).

Related Concepts

Inside-out development

Triangulation is a core part of the inside-out TDD style, where one uses mocks sparingly and focuses on getting the lower layers (i.e. the assumptions) right before developing a more concrete solution on top of them.

Can triangulation be used as a part of outside-in development (as described last time)? Of course, although it's probably used less often. Still, when you have a piece of functionality with well-defined inputs and outputs but don't know what the design behind it could be, you can go ahead and use triangulation whether you're developing outside-in or inside-out.

Acceptance tests/specifications

Even when doing Need-Driven Development using mocks etc., triangulation can be very useful at the acceptance level, where you can try to derive the internal design of a whole module based on the tests that convey your understanding of domain rules.

Applicability

As I stated before, triangulation is most useful when you have no idea what the internal design of a piece of functionality will look like (e.g. even if there are workflows, they cannot be easily derived from your knowledge of the domain) and it's not obvious along which axes your design must provide generality, but you are able to give some examples of the observable behavior of that functionality given certain inputs. These are usually situations where you need to slow down and take tiny steps that gradually bring you closer to the right design, and that's what triangulation is for!

Personally, I find triangulation useful when test-driving non-trivial data structures.

2 comments:

Anonymous said...

Nice. Thanks a lot.

Anonymous said...

When I read TDD By Example, at first I thought triangulation was about coming to the middle of two assertion statements. This article clarified my mistake. Thank you for that.