Thursday, 4 July 2013

How to start with a test, not implementation - part 3

(this post is adapted from my work-in-progress open source TDD tutorial)

(Note that I use "Statement" instead of "test" and "Specification" instead of "Test Suite" in this post)

In the first part, I discussed how a good name is a great start for writing a Statement when there's no production code to invoke. In the second part, I elaborated on the usefulness of thinking in terms of the GIVEN-WHEN-THEN structure and translating it almost literally to code. Today, I'd like to introduce another technique - one that may appear awkward, but is actually very useful.

Start from the end

This is a technique that I suggest to people who seem to have absolutely no idea how to start. I got it from Kent Beck's book Test-Driven Development: By Example. It seems funny at first, but it is quite powerful. The trick is to write the Statement 'backwards', i.e. starting with what the Statement asserts to be true (in terms of the GIVEN-WHEN-THEN structure, we'd say that we start with our THEN).

This works because we're often quite sure of our goal (i.e. what the outcome of the behavior should be) but unsure of how to get there.

A simple example

Imagine we're writing a class for granting access to a reporting functionality based on roles. We don't have any idea what the API should look like and how to write our Statement, but we know one thing: in our domain the access can be either granted or denied. Let's take the successful case (just because it's the first one we can think of) and, starting backwards, start with the following assertion:

Assert.True(accessGranted);


Ok, that part was easy, but did we make any progress with that? Of course we did - we now have code that does not compile, and the compilation error is caused by the accessGranted variable. Now, in contrast to the previous approach (translating our GIVEN-WHEN-THEN structure into a Statement), our goal is not to make this compile as soon as possible. The goal is to answer a question for ourselves: how do I know whether the access is granted or not? The answer: it is the result of authorizing the allowed role. Ok, so let's just write it down, ignoring everything that stands in our way (I know that most of us have a habit of adding a class or a variable as soon as we find out that we need it. If you're like that, then please turn off this habit while writing Statements - it will only throw you off track and steal your focus from what's important. The key to doing TDD successfully is to learn to use something that does not exist yet as if it existed):

var accessGranted 
 = authorization.IsAccessToReportingGrantedTo(
   roleAllowedToUseReporting);

Note that we do not know what roleAllowedToUseReporting is, nor do we know what authorization is, but that didn't stop us from writing this line. Also, the IsAccessToReportingGrantedTo() method is just taken off the top of our head. It's not defined anywhere; it just made sense to write it like this, because it's the most direct translation of what we had in mind.

Anyway, this new line answers the question of where accessGranted comes from, but it makes us ask further questions:

  1. Where does the authorization variable come from?
  2. Where does the roleAllowedToUseReporting variable come from?

As for authorization, we don't have anything specific to say about it other than that it is an object of a class that we don't have yet. In order to proceed, let's pretend that we have such a class. What do we call it? The instance name is authorization, so it's quite straightforward to name the class Authorization and instantiate it in the simplest way we can think of:

var authorization = new Authorization();

Now for the roleAllowedToUseReporting. The first question that comes to mind when looking at this is: which roles are allowed to use reporting? Let's assume that in our domain, this is either an Administrator or an Auditor. Thus, we know what the value of this variable is going to be. As for the type, there are various ways we can model a role, but the most obvious one for a type that has few possible values is an enum. So:

var roleAllowedToUseReporting = Any.Of(Roles.Admin, Roles.Auditor);

And so, working our way backwards, we have arrived at the final solution (we still need to give this Statement a name, but, given what we already know, this is an easy task):

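[Fact] public void
NameToBeChosen() // placeholder; the Statement still needs a name
{
 var roleAllowedToUseReporting = Any.Of(Roles.Admin, Roles.Auditor);
 var authorization = new Authorization();

 var accessGranted = authorization
  .IsAccessToReportingGrantedTo(roleAllowedToUseReporting);

 Assert.True(accessGranted);
}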


Vladimir said...


I have some thoughts about final solution.

This test heavily depends on how Any.Of is implemented.

Let's suppose it always returns Roles.Admin. Then I can write the implementation bool IsAccessToReportingGrantedTo(Roles role) { return role == Roles.Admin; }

The test will pass, and nothing will tell me that the implementation is wrong in the case of Roles.Auditor.

If Any.Of returns a different value randomly, then we have a test that sometimes passes and sometimes fails.

Grzegorz Gałęzowski said...

Hi, Vladimir,

1. If Any.Of() returned the same value, you could write the following implementation:

bool IsAccessToReportingGrantedTo(Roles role) { return role == Roles.Admin; }

and you would have a false positive.

If you think about it, however, this is actually no worse than picking one representative value.

Also, the thing about Any is not how it chooses the value, but rather that _you_do_not_choose_it. This way, it may lead you to writing proper implementation as a response to the test with less triangulation, because you can't assume what the value is gonna be. Even if today it's the same value every time, it's hard to assume it will be the same tomorrow (e.g. when next version of Any is released or a new test is added to the suite).

You could argue that a better alternative would be to test for both values using a data-driven test (Theory). It may be a valid alternative, but with two assumptions:
1) The input range is small. Let's say that you had an integer instead of a role. It would be hard then to test for all possible values. Enum is usually a smaller integer range.

2) The input range is stable.
If you commit to testing for all "positive" values, the same reason would make you test for all "negative" values in another test. This means that every time the enum gains a new "negative" value, you have to remember to go back and add another test (or another data row to the existing data-driven test) to keep the test suite comprehensive. The question is: where do you stop?
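
For comparison, the data-driven alternative mentioned above could look roughly like the sketch below, assuming xUnit's [Theory] and [InlineData] attributes. The Statement name, the AuthorizationSpecification class and the body of Authorization are assumptions made only so that the example is complete.

```csharp
using Xunit;

public enum Roles { Admin, Auditor }

// Hypothetical production code, included only to make the sketch self-contained.
public class Authorization
{
    public bool IsAccessToReportingGrantedTo(Roles role)
    {
        return role == Roles.Admin || role == Roles.Auditor;
    }
}

public class AuthorizationSpecification
{
    // One Statement, executed once per listed "positive" role.
    [Theory]
    [InlineData(Roles.Admin)]
    [InlineData(Roles.Auditor)]
    public void ShouldGrantAccessToReportingTo(Roles roleAllowedToUseReporting)
    {
        var authorization = new Authorization();

        var accessGranted = authorization
            .IsAccessToReportingGrantedTo(roleAllowedToUseReporting);

        Assert.True(accessGranted);
    }
}
```

Each new "positive" role then needs its own [InlineData] row, which is exactly the maintenance cost discussed in point 2.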

2. If Any was implemented with random, it would actually also be the case as in #1, because you get a really bad random distribution with two values input range :-D.

The usual practice is to make Any yield next value every time in a cycle, e.g.

Admin, Auditor, Admin, Auditor etc.

this may make you uneasy about determinism when you have more and more complex conditions based on the enum values in the tested code, but isn't this a case where one would want to refactor enum switch to polymorphism?
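
A minimal sketch of such a cycling Any.Of() is shown below. It is a hypothetical stand-in written for illustration, not the code of the real Any helper; the shared call counter and the thread-safety choice are assumptions.

```csharp
using System;
using System.Threading;

public enum Roles { Admin, Auditor }

// Hypothetical stand-in for the Any helper, not the real implementation.
public static class Any
{
    // Starts at -1 so that the first call uses index 0.
    private static int _callCount = -1;

    // Returns the candidates in turn, wrapping around at the end of the list.
    public static T Of<T>(params T[] candidates)
    {
        if (candidates.Length == 0)
        {
            throw new ArgumentException(
                "At least one candidate is required.", nameof(candidates));
        }

        int call = Interlocked.Increment(ref _callCount);

        // Masking the sign bit keeps the index non-negative if the counter overflows.
        return candidates[(call & int.MaxValue) % candidates.Length];
    }
}
```

Because this sketch keeps a single counter for all call sites, which value a given Statement receives depends on how many times Any.Of() was called before it. That is the "you do not choose it" property described above, at the price of the determinism concern raised in the comment.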

So the bottom line is that I acknowledge that this approach may have some weaknesses and I don't discard other possible techniques (such as triangulation), however, it brings enough value for me to adopt it.

Please let me know what your solution would be. Would you test for all values, choose representatives or mix strategies depending on whether you are dealing with a "positive" or "negative" case?

Vladimir said...

I try (but do not always succeed) to write a test suite such that if I give only the tests to another developer and tell him: "create the simplest implementation that makes these tests pass", the result should be a correct program.

By "simplest" I mean writing code only driven by failed test. If test is passed, then you shouldn't write anything else. (Recently I found out, that different people assume different things, what "simple code" means for them!)

So in the case of two roles I write two tests (or a foreach loop, or a data-driven test, if the unit testing framework in use supports them), because otherwise a simpler (and wrong) implementation can make the test pass.

In case of integer numbers, I'll test cases, that differ in execution strategies. If code does something wrong with number not in tested set, then code is more complicated, than necessary (and with unit tests we never can check, that code doesn't do anything wrong on top of what it should do).

Grzegorz Gałęzowski said...

Thanks for sharing your strategy!

I think we share the same goals, just our solutions are a bit different. My impression is that I apply the same strategy to enums as you apply for integers. Only that I pick the values using "constrained non-determinism".

The constrained non-determinism is my way of reaching the goal we seem to share: that when I say "create the simplest implementation, that makes these tests pass", it should result in correct program. Only my hope is that when such a developer would see Any.Of(Roles.Admin, Roles.Auditor), they would not try their luck hacking around the constrained non-deterministic mechanism, but rather just write the proper implementation.

Thanks for your comments, I like them because they challenge my long-held assumptions and make me think harder :-)

Grzegorz Gałęzowski said...

Oh, by the way, I don't always use constrained non-determinism. An example when I don't is around numerical boundaries. Then I apply the approach from point 3 (Behavior with a boundary) from:

Vladimir said...

Thanks for your answers and this blog. Seems like it can illustrate almost all questions I have.

"Only my hope is that when such a developer would see Any.Of(Roles.Admin, Roles.Auditor), they would not try their luck hacking around the constrained non-deterministic mechanism, but rather just write the proper implementation."

I think, I have a good counter-example.

Let's say we get a change request: there is a new role, Supervisor, for which IsAccessToReportingGrantedTo should also return true.

We change the test to: var roleAllowedToUseReporting = Any.Of(Roles.Admin, Roles.Auditor, Roles.Supervisor);

Compile, execute tests, all green, commit.

What is wrong? We didn't change any production code! But changes in code should be driven only by a failed test, and we didn't have any. And if we changed it, then it's just because we are good enough not to forget about it (and I can easily forget about these things, or when copy-paste comes into play).

Vladimir said...

And if the developer has to look at Any.Of, realise that it means any value from the enum listed there, and add code - I really feel about it like writing code from a paper specification: you can never really answer whether the code indeed follows this specification or not.

And he can just forget to add the code, and nothing will catch his hand (if a programmer never makes mistakes, he doesn't need any tests).

Grzegorz Gałęzowski said...

Vladimir, great counter-example, thanks for that! What you highlighted is the actual difference between integers and enums - it is more natural to use range operators (<, > etc.) with integers (i.e. they represent a continuum), while in enums, each value represents something distinct.

I thought for a while about how I approach this kind of situation and for now I think I would do something like this:

When I add another role to the test, I run all the tests and expect to have a failing test as per TDD cycle.

As you said, it is possible that this does not happen - if this is so, then:

1. Either I add the code to the implementation anyway out of fear that this may break in the future.
2. Or I place the new enum value in a position where it will fail the test (i.e. the value is picked in that test).
3. Or I change the test for a minute to a data-driven one just to get a red bar and then, after the green bar, return to Any.Of().
4. Or I permanently change the test to a data-driven one. This is still an option if I don't feel confident enough.
5. Or I check whether I can refactor the enum-based if statement to polymorphism. Then, I would have a factory that creates a distinct role object for each enum value, e.g.

Assert.InstanceOf( factory.create(Roles.Admin));

and then it makes perfect sense to specify each case separately. Of course, it heavily depends on the context whether this move makes sense - It would take more than a need to answer one simple question to justify introducing role classes - each class would need to have different behaviors, not only different data.
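
To make option 5 more concrete, here is a rough sketch of what such a factory might look like. Everything except the Roles values and the Admin class name is an assumption: IRole, Auditor, RoleFactory and the IsAccessToReportingGranted property are hypothetical, and the method is spelled Create (C# convention) where the comment writes create.

```csharp
using System;

public enum Roles { Admin, Auditor }

// Hypothetical role abstraction replacing if statements on the enum.
public interface IRole
{
    bool IsAccessToReportingGranted { get; }
}

public sealed class Admin : IRole
{
    public bool IsAccessToReportingGranted => true;
}

public sealed class Auditor : IRole
{
    public bool IsAccessToReportingGranted => true;
}

public sealed class RoleFactory
{
    // Maps each enum value to its own role class, so each mapping
    // can be specified by a separate Statement.
    public IRole Create(Roles role)
    {
        switch (role)
        {
            case Roles.Admin: return new Admin();
            case Roles.Auditor: return new Auditor();
            default:
                throw new ArgumentOutOfRangeException(
                    nameof(role), role, "No role class for this value.");
        }
    }
}
```

In this sketch the two classes differ only in name, so by the criterion above it would not yet justify the refactoring; it only shows the shape the code would take once the roles start to behave differently.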

Anyway, this is an interesting example and I will watch myself how I react next time I see a situation like this. Thanks!

Grzegorz Gałęzowski said...

Sorry, the code should have been:

Assert.InstanceOf<Admin>( factory.create(Roles.Admin));