Wednesday, 27 March 2013

Implementation Readability vs Design Readability, and what does TDD have to do with it?

Many thanks to Łukasz Bigos for discussion and inspiration.

Recently I had a discussion with a few of my colleagues on the readability of test-driven (or, more precisely, loosely coupled) code. The thing that surprised me the most was the argument that loosely coupled code is unreadable. The surprise comes from my deep conviction that TDD helps achieve extremely readable code.

In my mind, TDD leads to well-factored, small classes with single responsibilities, but in some others' minds, it's just "obfuscating the picture" and they "don't know what's happening in this code because everything is hidden and split across many classes and methods".

So I thought about it some more, talked a little more, and finally reached a conclusion on why there's a disagreement between us.

Two ways of understanding readability

It appears that, depending on our habits, "readable" means something different to each of us. There are two kinds of readability I learned to differentiate: implementation readability and design readability. Let's look at what each one is about, and who finds that particular kind of readability relevant and why.

1. Implementation readability

Let's imagine we're developing a client-server application for project management. The part we'll be dealing with is the server-side processing of a request to add a new issue to an existing project. By looking at the example code, we'll see how and why implementation readability plays out in practice. Without further ado, ladies and gentlemen, the code:

public void AddNewIssueToProject(Request request, Response response)
{
  try
  {
    if(!request.ContainsKey("senderIp") 
       || string.IsNullOrWhiteSpace(request["senderIp"] as string))
    {
      response.Write("Wrong sender IP");
      return;
    }

    if(!request.ContainsKey("projectId") 
       ||  string.IsNullOrWhiteSpace(request["projectId"] as string))
    {
      response.Write("Invalid project ID");
      return;
    }

    if(!request.ContainsKey("issueData") || 
      request["issueData"] as Dictionary<string, string> == null)
    {
      response.Write("issue data not passed");
      return;
    }

    var issueData 
      = request["issueData"] as Dictionary<string, string>;

    if(!issueData.ContainsKey("Title") || 
        string.IsNullOrWhiteSpace(issueData["Title"]))
    {
      response.Write("No title for issue supplied. " +
        "Issue title must be at least one readable character");
      return;
    }

    if(!issueData.ContainsKey("Content") || 
        string.IsNullOrWhiteSpace(issueData["Content"]))
    {
      response.Write("No content for issue supplied. " +
        "By our policy, every issue must be described with details");
      return;
    }

    if(!issueData.ContainsKey("Severity") 
       || string.IsNullOrWhiteSpace(issueData["Severity"]))
    {
      response.Write("No severity supplied, " + 
        "although the form forces this." +
        " Are you trying to send us raw HTTP request?");
      return;
    }

    var projectId = request["projectId"] as string;
    var issueTitle = issueData["Title"];
    var issueContent = issueData["Content"];
    var severity = issueData["Severity"];

    var queryString = string.Format(
      "INSERT INTO ProjectIssues VALUES ({0}, '{1}', '{2}', '{3}')",
      projectId, issueTitle, issueContent, severity);

    var query = new Query(queryString);
    using(var connection = DatabaseConnection.Open("dbName=Abcdef"))
    {
      query.ExecuteThrough(connection);
    }

    response.Write("Everything's OK!");
  }
  catch(Exception e)
  {
    response.Write("Unexpected error, see log for details");
    _log.Error(e);
  }
}

While this snippet may seem like classic spaghetti code to those of you who are more design-infected, there is actually a significant benefit behind it. The benefit is that, without looking anywhere else, we can deduce the state of the method's variables at every line and what precisely will happen in the system when we run this code. Because the method is made up mostly of primitive or library constructs, we're able to "debug" it with our eyes only, without jumping here and there to gather the parts of the functionality necessary to "understand" the code.

Wait, did I just write "understand"? What kind of "understanding" are we talking about here? Well, let's take the following line as an example:

    var projectId = request["projectId"] as string;

When reading this code, by the time we arrive at this line we already know that some values are put inside the dictionary and that they're non-null and not some other value like an empty string. What are "dictionary", "null" and "empty string"? They're implementation details! Are they connected to the domain of request processing? No. Do they help describe the high-level work-flow? No. Do they help us understand the requirements? Not much. Can we read from the code what steps this method consists of, or where each step begins and ends? No. We can extract that somehow by comparing the code to our domain knowledge or experience (e.g. most of us know that each request coming to the server has to be validated), but again, this is something extracted from the code, not something that's already there.

So, the conclusion is that this style of writing is better at describing how the code works, while doing a poor job of describing what the code is for.

Benefits of implementation readability

So, who benefits from high implementation readability? Well, there are times when we're given an assignment in a code base we don't know. Let's say our boss told us to fix a bug in an application that should write the string "abcd" to the Windows registry, but writes "abc" instead. Now, we don't care about the big picture: all we care about is what the single external behavior is and what it should be instead (we don't even care why), so we search for the one line in the code that is responsible for that behavior, to replace it with something else. While searching, everything that is not the sought construct (including design and domain knowledge) is in our way. From this point of view, the ideal situation would be to have the whole application in a single source file, so that we can use our text editor to search it for the word "Registry", then examine each occurrence and make the fix. In other words, we're acting as a slightly smarter "search and replace" mechanism. The single-responsibility classes just get in our way and make us "pointlessly" navigate through a web of objects we neither want nor need to understand (some would say that we're on level 1 or 2 of the Dreyfus model of skill acquisition).

While cases such as this one do happen, we must realize that they're not the cases we're trying to optimize for. In all but the simplest cases, making a novice go into a piece of code to make changes without an understanding of the domain or design will lead to degradation of the code and probably a few bugs (when someone is not aware of the big picture, they might fix a bug in a way that introduces another bug somewhere else, because of another scenario they had no idea about). So the case we want to optimize for is someone going into the code with a need to understand the domain and the design first, before making a change. It's best if they can derive at least part of this knowledge from the code itself and easily map parts of domain knowledge to places in the code. And this is how we arrive at...

2. Design readability

Part of good object-oriented design is implementation hiding, i.e. we want to hide how a particular piece of functionality is implemented so that we can change it later. Also, when we're equipped with domain and design knowledge, we want this knowledge to stand out, not to be obfuscated by implementation details. To give you a quick example of what I mean: when talking about web server session storage, we say that "a user is assigned a persistent session", not that "a database holds a serialized hashtable indexed by user ID". Thus, we want the former to be clearly visible in the code, not the latter. Otherwise, readability is hurt.
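To make that contrast concrete, here's a minimal sketch (in Python rather than C#, with hypothetical names) of the same session-storage idea: the interface speaks the domain language, while the "serialized hashtable indexed by user ID" stays hidden inside one implementation:

```python
# Domain-speak: "a user is assigned a persistent session".
class Sessions:
    def assign_persistent_session_to(self, user):
        raise NotImplementedError

# Implementation-speak, hidden behind the interface: "a database
# holds a serialized hashtable indexed by user ID".
class DatabaseBackedSessions(Sessions):
    def __init__(self, database):
        self._database = database

    def assign_persistent_session_to(self, user):
        session = {"user_id": user.id, "active": True}  # the "hashtable"
        self._database.store(user.id, session)          # keyed by user ID
```

Callers talk only to Sessions, so swapping the storage detail later doesn't touch them.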

Let's now take a little piece of the refactored version of the request handling code we began with and see whether we can spot any improvements:

class AddingNewIssueToProject : Scenario
{
    UserInteraction _user;
    AddIssueToProjectRequest _requestForAddingIssueToProject;
    PersistentStorage _issueStorage;

    public AddingNewIssueToProject (
      AddIssueToProjectRequest requestForAddingIssueToProject, 
      UserInteraction userInteraction,
      PersistentStorage issueStorage)
    {
      this._requestForAddingIssueToProject 
        = requestForAddingIssueToProject;
      this._user = userInteraction;
      this._issueStorage = issueStorage;
    }

    public void GoThrough()
    {
      try
      {
        _requestForAddingIssueToProject.Validate();
        Issue issue = _requestForAddingIssueToProject.ExtractIssue();
        issue.StoreInside(_issueStorage);
        issue.ReportSuccessfulAdditionTo(_user);
      }
      catch(ScenarioFailedException scenarioFailed)
      {
        _user.NotifyThat(scenarioFailed);
      }
   }
}

As you can see, there's almost no "implementation" here. No strings, no nulls, no dictionaries, lists or other data structures, no addition, subtraction, multiplication etc. So how can it be readable, when it tells us so little about the external application behavior? Count with me:

  1. The class name is AddingNewIssueToProject, which says that this class is all about this single scenario (in other words, when something is wrong with a different scenario, this is not the place to look for the root cause). Moreover, it implements a Scenario interface, which gives us a clue that each scenario in the application is modeled as an object, and that we can most probably find all the scenarios supported by the application by searching for classes implementing the Scenario interface.
  2. It depends on three interfaces: UserInteraction, AddIssueToProjectRequest and PersistentStorage, which suggests that the scope of the scenario is: communication with the user, working with the user-entered data, and saving data to persistent storage.
  3. The GoThrough() method contains the sequence of steps. We can clearly see which steps the scenario consists of and in what order, e.g. looking at this method, we are sure that we're not trying to save an invalid Issue object, since storing the issue comes after validating correctness.
  4. Looking at the invocation of the ExtractIssue() method, we can see that the new issue being saved is made of data entered by the user.
  5. The object _issueStorage is of type PersistentStorage, suggesting that, after the issue is saved, we'll be able to retrieve it later, most probably even after a server restart.
  6. Each of the steps of the scenario is allowed to fail (by throwing ScenarioFailedException) and the user is immediately notified of the failure.
  7. If the scenario of adding an issue ever needs to perform any additional step in the future, this is the place where we'd start introducing that change.

Not bad for just a few lines of code that contain practically no implementation details, huh?

Which one to prefer?

Personally, I strongly prefer design readability, i.e. I want to draw as much domain knowledge and design-constraint knowledge from the code as possible (not to be mistaken with overdesign). One reason is that design readability makes it much easier to introduce changes that are aligned with the "spirit" of the code that's already in place, and such changes are usually much safer than hacking around. Another reason is that implementation readability is impossible to maintain over the long haul. Let's take the following snippet from our example:

if(!issueData.ContainsKey("Severity") 
   || string.IsNullOrWhiteSpace(issueData["Severity"]))
{
  response.Write("No severity supplied, " + 
    "although the form forces this." +
    " Are you trying to send us raw HTTP request?");
  return;
}

This is part of the code responsible for adding a new issue, but we can imagine it making its way into the scenario for editing an existing issue as well. When this happens, we have two choices:

  1. Copy-paste the code into the second method, creating duplication and redundancy (and leading to situations where one change needs to be made in more than one place, which, according to Shalloway's Law, is asking for trouble),
  2. Extract the code into at least a separate method, which already partially defeats implementation readability, because each time you need to understand what a piece of code does, you need to understand an additional method. All you can do is give this method an intent-revealing name. Anyway, introducing new methods just to remove redundancy leaves us with code where sections of implementation details are mixed with sections of domain work-flows stated in terms of calls to domain-specific methods on domain-specific objects. From such code, we can easily deduce neither the full work-flows, nor all the implementation details. Such code is better known under the name of Spaghetti Code :-).
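For illustration, here is roughly what the second option looks like, sketched in Python with hypothetical names: the severity check gets an intent-revealing method, which removes the duplication, but the caller now mixes one domain-level step with raw implementation details:

```python
class ScenarioFailedError(Exception):
    pass

def require_severity(issue_data):
    """Intent-revealing name for the extracted severity check."""
    severity = issue_data.get("Severity", "")
    if not severity or severity.isspace():
        raise ScenarioFailedError(
            "No severity supplied, although the form forces this.")

def add_new_issue(issue_data, storage):
    # One domain-level step with an intent-revealing name...
    require_severity(issue_data)
    # ...followed by bare implementation details in the same method:
    storage.append((issue_data["Title"], issue_data["Severity"]))
```

Reading add_new_issue, we can no longer "eye-debug" everything in one place, yet it isn't a pure work-flow description either.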

An object-oriented system is not a set of streams of primitive instructions. It is a web of collaborating objects. The more complex the system, the more visible this metaphor becomes. That's why, instead of trying to put all the implementation in one place to make it "debuggable", it's better to create a cohesive and readable description of that collaboration: a description that reveals domain rules, domain work-flows and design constraints.

Of course, the choice is yours. Until you decide to do TDD, that is. That's because TDD strongly encourages clean design and the separation of domain and design from implementation details. If good design guidelines are not followed, the tests get longer and longer, slower and slower, more and more test logic is repeated throughout the tests, etc. If that's where you are now, you'd better revise your approach to design!

What do you think?

I'm dying to know your opinions and observations on these topics. Do you find the distinction between implementation and design readability useful? Which one do you prefer? Please let me know in the comments!

Monday, 4 March 2013

The Two Main Techniques in Test-Driven Development, part 2

Today, I'd like to continue where I left off last time. As you might remember, the first post was about Need-Driven Development, and this one's gonna be about a technique that's actually older, called Triangulation.

Triangulation

History

The first occurrence of the term "triangulation" I know about is in Kent Beck's book Test-Driven Development: By Example, where Kent describes it as the most conservative technique of test-driving the implementation. It's one of the three core techniques of classic TDD.

Description

The three approaches to test-driving the implementation and design described in Kent's book are:

  1. Write the obvious implementation
  2. Fake it ('till you make it)
  3. Triangulate

Kent describes triangulation as the most conservative technique, because following it involves taking the tiniest possible steps to arrive at the right solution. The technique is called triangulation by analogy to radar triangulation, where outputs from at least two radars must be used to determine the position of a unit. Also, in radar triangulation, the position is measured indirectly, by combining the following data: the range (not position!) measurement done by each radar and the radar's own position (which we know, because we put the radar there).

These two characteristics, indirect measurement and using at least two sources of information, are at the core of TDD triangulation. Basically, it says:

  1. Indirect measurement: derive the design from a few known examples of its desired external behavior, by looking at what varies between these examples and turning that variability into something more general,
  2. Using at least two sources of information: start with the simplest possible implementation and make it more general only when you have two or more examples.

Triangulation is so characteristic of classic TDD that many novices mistakenly believe TDD is all about triangulation.

Example

Suppose we want to write a method that computes the aggregate sum of a list's elements. Let's assume that we have no idea how to design the internals of our custom list class so that it fulfills this responsibility. Thus, we start with the simplest example:

[Test]
public void 
ShouldReturnTheSameElementAsASumOfSingleElement()
{
  //GIVEN
  var singleElement = Any.Integer();
  var listWithAggregateOperations 
    = new ListWithAggregateOperations(singleElement);

  //WHEN
  var result = listWithAggregateOperations.SumOfElements();

  //THEN
  Assert.AreEqual(singleElement, result);
}

The naive implementation can be as follows:

public class ListWithAggregateOperations
{
  int _element;

  public ListWithAggregateOperations(int element)
  {
    _element = element;
  }

  public int SumOfElements()
  {
    return _element;
  }
}

It's too early to generalize, as we don't have another example yet. Let's add one, then. What would be the next most complex case? Two elements instead of one. As we can see, we're beginning to discover what's variable in our design. The second spec looks like this:

[Test]
public void 
ShouldReturnSumOfTwoElementsAsASumWhenTwoElementsAreSupplied()
{
  //GIVEN
  var firstElement = Any.Integer();
  var secondElement = Any.Integer();
  var listWithAggregateOperations 
    = new ListWithAggregateOperations(firstElement, secondElement);

  //WHEN
  var result = listWithAggregateOperations.SumOfElements();

  //THEN
  Assert.AreEqual(firstElement + secondElement, result);
}

And the naive implementation will look like this:

public class ListWithAggregateOperations
{
  int _element1 = 0;
  int _element2 = 0;

  public ListWithAggregateOperations(int element)
  {
    _element1 = element;
  }

  //added
  public ListWithAggregateOperations(int element1, int element2)
  {
    _element1 = element1;
    _element2 = element2;
  }

  public int SumOfElements()
  {
    return _element1 + _element2; //changed
  }
}

Now the variability in the design becomes obvious: it's the number of elements added! Now that we have two examples, we can see that we have redundant constructors and redundant fields for each element in the list, and that if we added a third spec for three elements, we'd have to add another constructor, another field and another term to the sum computation. Time to generalize!

How do we encapsulate the variability of the element count so that we can get rid of this redundancy? A collection! How do we generalize the addition of multiple elements? A foreach loop through the collection!

First, let's start with writing a new, more general unit spec to showcase the new desired design (the existing two specs will remain in the code for now):

[Test]
public void 
ShouldReturnSumOfAllItsElementsWhenAskedForAggregateSum()
{
  //GIVEN
  var firstElement = Any.Integer();
  var secondElement = Any.Integer();
  var thirdElement = Any.Integer();
  var listWithAggregateOperations 
    = new ListWithAggregateOperations(new List<int>
      { firstElement, secondElement, thirdElement});

  //WHEN
  var result = listWithAggregateOperations.SumOfElements();

  //THEN
  Assert.AreEqual(firstElement + secondElement + thirdElement, result);
}

Note that we have introduced a constructor taking a list of an arbitrary number of elements instead of just the values. Time to accommodate it in the design and bring the generalization we have just introduced in our spec into the implementation:

public class ListWithAggregateOperations
{
  List<int> _elements = new List<int>();

  public ListWithAggregateOperations(int element)
    : this(new List<int>() { element }) //changed
  {
  }

  public ListWithAggregateOperations(int element1, int element2)
    : this(new List<int>() { element1, element2 }) //changed
  {
  }

  //added
  public ListWithAggregateOperations(List<int> elements)
  {
    _elements = elements;
  }

  public int SumOfElements()
  {
    //changed
    int sum = 0;
    foreach(var element in _elements)
    {
      sum += element;
    }
    return sum;
  }
}

As we can see, the design is more general, and I turned the two existing constructors into simple delegations to the new, more general one. Also, the first two specs ("one element" and "two elements") still pass with the new, more general implementation under the hood, meaning that we didn't break any existing behavior. Thus, it's now safe to remove those two specs, leaving only the most general one. We can also remove the redundant constructors, leaving the implementation like this:

public class ListWithAggregateOperations
{
  List<int> _elements = new List<int>();

  public ListWithAggregateOperations(List<int> elements)
  {
    _elements = elements;
  }

  public int SumOfElements()
  {
    int sum = 0;
    foreach(var element in _elements)
    {
      sum += element;
    }
    return sum;
  }
}

And voilà! We have arrived at the final, generic solution. Note that the steps we took were tiny, so you might get the impression that the effort was not worth it. Indeed, this example served only to show the mechanics of triangulation: in real life, if we encountered such a simple situation, we'd know straight away what the design should be, so we'd start with the general specs and just type in the obvious implementation. Triangulation shows its power in more complex problems with multiple design axes, where taking tiny steps helps avoid "analysis paralysis".

Principles

These are the main principles of Triangulation:
  1. Start writing specs from the simplest and most obvious case, increasing the complexity of the specs and the generality of the implementation as you add more specs.
  2. Generalize only when you have at least two examples that show you which axis of design change needs to be generalized.
  3. After arriving at the correct design, remove the redundant specs (remember, we want each spec to fail for a single reason).

Related Concepts

Inside-out development

Triangulation is a core part of the inside-out TDD style, where one uses mocks sparingly and focuses on getting the lower layers (i.e. the assumptions) right before developing a more concrete solution on top of them.

Can triangulation be used as part of outside-in development (as described last time)? Of course, although it's probably used less often there. Still, when you have a piece of functionality with well-defined inputs and outputs but don't know what the design behind it could be, you can go ahead and use triangulation, whether you're developing outside-in or inside-out.

Acceptance tests/specifications

Even when doing Need-Driven Development using mocks etc., triangulation can be very useful at the acceptance level, where you can try to derive the internal design of a whole module from the tests that convey your understanding of the domain rules.

Applicability

As I stated before, triangulation is most useful when you have no idea what the internal design of a piece of functionality will look like (e.g. even if there are work-flows, they cannot be easily derived from your knowledge of the domain) and it's not obvious along which axes your design must provide generality, but you are able to give some examples of the observable behavior of that functionality given certain inputs. These are usually situations where you need to slow down and take tiny steps that slowly bring you closer to the right design, and that's what triangulation is for!

Personally, I find triangulation useful when test-driving non-trivial data structures.

Thursday, 21 February 2013

The Two Main Techniques in Test-Driven Development, part 1

Today, I'd like to write a bit about different techniques of test-driving development. There are actually two of them I'd like to discuss: Need-Driven Development and Triangulation. Part one is about NDD; Triangulation shall be covered in part two.

Need-Driven Development

History

Need-Driven Development is a term that's less commonly used nowadays. It was coined in the paper Mock Roles, not Objects, but has since been somewhat abandoned. The authors of that paper later wrote a book, and the term Need-Driven Development is missing from it. Anyway, the approach itself has not changed and remains one of the most popular.

Description

Imagine you're building a plane, and the requirement for it is to transport a guy named John from Poland to the US. Imagine you start with a computer simulation (a fake context) of putting a guy in the air: he would fall, the simulation says. OK, so we need to put him on something, so we introduce a floor. But the floor would fall as well. So we need something to hold it in the air, so we introduce an engine. But we need to keep balance in the air somehow, so we introduce wings. And so on, and so on. This is the philosophy of NDD: start with the main thing you need without having anything else, then figure out what else you need to achieve it, then figure out what this "else" needs...

NDD is an outside-in approach to development using unit tests and mock objects that stand in for the context which does not yet exist. This way, we start with what our current concrete object needs and shape its context (which is filled in by mock objects) based on those needs. So in this approach, mocks are used as a tool for discovering interactions: we're free to shape the mocked interfaces any way we like, so that they're best suited to interact with the concrete object we're dealing with now.

As the fathers of NDD put it:

We use Mock Objects to let us write the code under test as if it had everything it needs from its environment. This process shows us what an object’s environment should be so we can then provide it.

Example

Let's take as an example the case from my previous post, which specified a simplified logic for logging in:

  1. When the supplied credentials are valid, the log-in process shall set up the response as positive and redirect me to the home page,
  2. When the supplied credentials are invalid, the log-in process shall set up the response as negative and redirect me to the login page.

[Test]
public void 
ShouldSetResponseAsPositiveAndRedirectToHomePageWhenPerformedWithValidCredentials()
{
  //GIVEN
  var response = Substitute.For<Response>();
  var validCredentials = Substitute.For<UserCredentials>();
  var homePage = Substitute.For<Page>();
  var loginPage = Substitute.For<Page>();
  var loggingIn = new LoggingIn(homePage, loginPage, response);
  validCredentials.AreValid().Returns(true);

  //WHEN
  loggingIn.PerformUsing(validCredentials);

  //THEN
  response.Received().SetUpAsSuccessful();
  homePage.Received().RedirectTo();
}

The specification shows us that, from the perspective of the LoggingIn class, it's most comfortable to describe its behavior in the context of collaborators like Response, Page and UserCredentials that provide certain services in the form of methods. Thus, we have discovered three new abstractions (along with some methods they need to provide) that don't need concrete implementations in order for us to run the specifications for the LoggingIn class; we fill these abstractions with mocks, so that the specs compile and run. When we're finished describing the LoggingIn object, we can proceed with describing real implementations of all the discovered abstractions, the same way we did for the LoggingIn class (by the way, the specifications for LoggingIn always stay based on mocks to provide behavior isolation: we never change mocks to real objects in unit specs).

Principles

These are the main principles of Need-Driven Development:

  1. Start the development from the outermost layer (i.e. the inputs) and dig deeper and deeper, layer by layer, until you reach the boundaries of the application,
  2. Derive how interfaces should look from what their clients need them to provide,
  3. Derive not only interfaces (method signatures), but also protocols (e.g. which behaviors of the discovered abstractions are expected by the class in terms of which we discovered them, and how they should be handled),
  4. Think of the application as a web of collaborating objects sending messages to each other
  5. Mock only the types you own - mocks are primarily design tools, not isolation tools (although the latter is still somewhat significant)
  6. Avoid getters, try to follow "Tell, don't ask" advice most of the time
  7. Each test creates a need for either:
    • new implementation in the tested object
    • new collaborator to appear
    • new way of communicating with existing collaborator to appear (e.g. calling different method on a collaborator or a collaborator returning different result from already existing method)
  8. Distinguish between Value Types and "objects". Value types are merely a more intelligent and domain-oriented data, while "objects" are service providers.
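The spec-driven discovery described above can be sketched with a generic mocking library as well. Here, Python's unittest.mock stands in for NSubstitute from the C# example, and the class shape (including passing the response as a collaborator) is a hypothetical reading of the post, not a definitive design:

```python
from unittest.mock import Mock

class LoggingIn:
    """Shape discovered through its spec: it needs a home page,
    a login page and a response to set up (hypothetical names)."""
    def __init__(self, home_page, login_page, response):
        self._home_page = home_page
        self._login_page = login_page
        self._response = response

    def perform_using(self, credentials):
        if credentials.are_valid():
            self._response.set_up_as_successful()
            self._home_page.redirect_to()
        else:
            self._response.set_up_as_failed()
            self._login_page.redirect_to()

# The spec that drove the design: every collaborator is a mock,
# shaped by what LoggingIn needs, not by what already exists.
response, home_page, login_page = Mock(), Mock(), Mock()
valid_credentials = Mock()
valid_credentials.are_valid.return_value = True

logging_in = LoggingIn(home_page, login_page, response)
logging_in.perform_using(valid_credentials)

response.set_up_as_successful.assert_called_once_with()
home_page.redirect_to.assert_called_once_with()
```

Note how the interfaces of the collaborators fall out of the test, mirroring principles 2 and 3 above.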

Related Concepts

Lean

Need-Driven Development is closely related to the Lean principle of pulling value from demand rather than pushing it from implementation. Thus, following NDD leads to writing only the necessary code and avoiding features that aren't needed, because the only code that gets written is the code directly needed by code that already exists. The need for the first production code to exist comes from the acceptance tests, and those come directly from the customer. This way, NDD fits into a cohesive value stream, starting with the customer and ending at the code itself, and is therefore well aligned with Lean principles.

"Needs interfaces" and "capabilities interfaces"

Another related concept is that of "needs vs capabilities in interfaces", coming from the book Essential Skills for the Agile Developer. One important lesson from it is to always design interfaces (e.g. method signatures) according to the client's needs and what the client would actually use. When a second client is introduced that needs a different interface to the same services, the way to go is to provide it with its own Needs Interface and to extract the common logic into a so-called Capabilities Interface that both of these Needs Interfaces use. At that point, the Needs Interfaces most often become mere adapters or facades over the more general interface. This lesson from the authors of Essential Skills for the Agile Developer is compatible with the NDD philosophy, guiding us in our design and refactoring attempts.
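A minimal sketch of that idea (in Python, with entirely hypothetical names): each client gets a narrow Needs Interface, implemented as a thin adapter over a shared Capabilities Interface:

```python
# Capabilities interface: the full, general set of services.
class ReportCapabilities:
    def render(self, data, fmt):
        # Toy implementation, just to make the sketch runnable.
        return f"[{fmt}] {data}"

# Needs interfaces: each client sees only the operation it needs,
# as a thin adapter over the shared capabilities.
class HtmlReportNeeds:
    def __init__(self, capabilities):
        self._capabilities = capabilities

    def render_html(self, data):
        return self._capabilities.render(data, "html")

class PdfReportNeeds:
    def __init__(self, capabilities):
        self._capabilities = capabilities

    def render_pdf(self, data):
        return self._capabilities.render(data, "pdf")
```

Each client depends only on its own narrow interface, so changes driven by one client don't ripple into the other.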

Programming by Intention

If you remember my post about Programming By Intention and how it compares to TDD, then you've probably already noticed the similarities between Programming By Intention and NDD. As a matter of fact, when I mentioned TDD in that post, I had NDD in mind.

Applicability

The technique is most useful when we have well-defined acceptance criteria; it usually comes into play after creating and executing a failing acceptance test that defines the next increment of functionality we need to provide. Also, it's easier to apply when we know the high-level work-flows the application needs to support (usually the case in enterprise systems). Then we discover the lower-level work-flows that result from the higher-level ones and, layer by layer, we reach the boundaries of the system.

Monday, 28 January 2013

Single method perspective vs single behavior perspective

Hi! Today, I'd like to outline a few differences between a technique called Programming By Intention and Test-Driven Development. I'm currently halfway through Essential Skills for the Agile Developer, and the chapter on Programming By Intention made me revisit this fine technique and think it over again, especially in comparison to Test-Driven Development.

What's Programming By Intention?

On the book's page you can get the chapter on Programming By Intention for free and read it, but for those of you who like summaries: it's an outside-in approach to writing methods that makes a distinction between so-called "Sergeant" methods and "Private" methods. A "Sergeant" is a method that has the single responsibility of describing a high-level work-flow, and it delegates all the implementation details to "Private" methods. When beginning to write a piece of code, you start with a "Sergeant" and pretend that all the methods it needs to call already exist (although they usually don't). A simplified example of a "Sergeant" for logging-in logic would be something like this:

public string LogIn(string userName, string password)
{
  string homePage = GetHomePageAddress();
  string loginPage = GetLoginPageAddress();
  string response = GetDefaultResponse();

  if(AreCredentialsValid(userName, password))
  {
    response = GetResponseWithSuccessfulLoginMessage();
    RedirectTo(homePage);
  }
  else
  {
    response = GetResponseWithLoginErrorsFor(userName);
    RedirectTo(loginPage);
  }
  return response;
}

As you can see, in the above example, we're passing what we can as method parameters instead of relying on fields. Also, there are almost no operators (even "new" and the object access operator ".") - they're hidden inside the "private" methods to which we delegate the work. These methods don't exist yet - we're just imagining how they should look to fit our needs. The next step is to generate skeleton bodies for the "privates" (preferably using an IDE) and fill them in one by one (of course, a "private" for this method might itself be a "sergeant" with its own "privates"). A simple implementation of a private method might look like this:

private string GetResponseWithLoginErrorsFor(string userName)
{
  return userName + " could not login. Invalid credentials";
} 

Programming By Intention produces code that is very readable - every work-flow step has its domain-specific name. Also, this technique is so powerful, because it separates specification perspective from implementation perspective and, by doing so, leads to a very high method cohesion. Thus, for example, it's very simple to refactor our exemplary "sergeant" later into something like this:

public Response LogIn(
  UserCredentials userCredentials,
  Page homePage, 
  Page loginPage)
{
  Response response = Response.Default();

  if(userCredentials.AreValid())
  {
    response.SetUpWithSuccessfulLoginMessage();
    homePage.RedirectTo();
  }
  else
  {
    response.SetUpWithLoginErrorsFor(userCredentials);
    loginPage.RedirectTo();
  }
  return response;
}

and achieve class-level cohesion, although the decision whether to pay that cost right now or defer it for later is left to us. Thanks to this, Programming By Intention can at best be almost free (in the case of small programs that won't need any maintenance in the future), while at worst it lets us defer the cost for later while introducing minimal technical debt (in the case of more serious pieces of software that need cohesion on every level).

How does it relate to Test-Driven Development?

I must confess that this is the second most powerful technique I know for writing good, easy-to-read code, the only more powerful one being Test-Driven Development. Whenever I am in a situation where for some reason I cannot use TDD, I use Programming By Intention.

Now, let's examine the similarities between the two techniques first:

  1. Both are outside-in approaches (I'm talking here about the outside-in style of TDD, as it can also be used bottom-up, as e.g. Kent Beck prefers to use it), crafting APIs and method names from the perspective of their use
  2. Both are variants of the divide-and-conquer strategy
  3. Both are about specifying intention through code
  4. Both tend to lead to highly cohesive designs, although TDD demands both class-level and method-level cohesion, while Programming By Intention tends to demand method-level cohesion only (although class-level is easily achievable from there, as in the example above).

Now for the differences. The main one in my opinion is that Programming By Intention looks at writing code from a perspective of a method, while Test-Driven Development looks from the perspective of a behavior. This means several things:

  1. When implementing a single method (as in Programming By Intention), one does not see the whole context of its usage - only this one method. On the other hand, in TDD, a unit-level spec usually begins with object instantiation and ends with an observable result (either a returned value or a call to a collaborating object's method)
  2. The method perspective also means that, when writing a method using Programming By Intention, one has to consider it as a whole - e.g. all required execution paths (ifs and elses) must be considered throughout the whole implementation process, since they're usually all added in one go. On the other hand, in TDD one only worries about the current path being specified with a test. Other paths are either already tied to other tests that will remind us when we unintentionally break them (so we don't have to think about them anymore), or on our TODO list (so we don't have to worry about them yet).
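To illustrate the behavior perspective, here's a minimal, self-contained sketch of such a single-behavior spec. All the names are hypothetical, and a hand-rolled spy stands in for a mocking library:

```csharp
using System;

public interface Page { void RedirectTo(); }

// A hand-rolled "spy" standing in for a mock object:
public class PageSpy : Page
{
    public bool WasRedirectedTo { get; private set; }
    public void RedirectTo() { WasRedirectedTo = true; }
}

public class LoginController
{
    public void LogIn(bool credentialsValid, Page homePage, Page loginPage)
    {
        if (credentialsValid) homePage.RedirectTo();
        else loginPage.RedirectTo();
    }
}

public static class LoginSpecs
{
    // This spec covers just ONE execution path - invalid credentials.
    // The happy path lives in its own spec, so neither spec forces us
    // to hold all the ifs and elses in mind at once.
    public static void ShouldRedirectToLoginPageWhenCredentialsAreInvalid()
    {
        var loginPage = new PageSpy();                      // object instantiation...
        var controller = new LoginController();

        controller.LogIn(false, new PageSpy(), loginPage);  // ...trigger...

        if (!loginPage.WasRedirectedTo)                     // ...observable result
            throw new Exception("expected a redirect to the login page");
    }
}
```

Note how the spec spans the whole behavior - from creating the object, through the trigger, to the observable result - rather than the inside of one method.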

The second point is, IMHO, really important. A few days ago, I finally watched a presentation by Gojko Adzic and Dan North from BDD Exchange. Talking about accelerating Agile, Gojko and Dan stressed very heavily the importance of making a "measurable impact". I think that, when scaled down, this idea aligns well with TDD. The "measurability" of the "impact" we make with our executable specifications can be considered in two ways:

  1. For a single specification: watching the RED to GREEN transition is an act of measuring the impact - reaching the GREEN phase means that you now have something that was not present yet during the RED phase.
  2. For a suite of specifications: the comparison of already implemented specs vs the ones left on the TODO list is a way to measure "how much of the impact" was already made vs what's left.

Summary

Personally, I find many similarities between Programming By Intention and TDD, with Programming By Intention being lower cost (at least in the short run) and easier to learn, while TDD is more powerful and advanced. I dare say that TDD is Programming By Intention on steroids. Also, I believe that learning Programming By Intention helps in grasping mock-based Test-Driven Development. Hence, if you are struggling with outside-in TDD using mocks, it's a good idea to train Programming By Intention first.

That's all for today, I hope you liked it and see you soon!

Sunday, 6 January 2013

Test Driven Development - objections, part 3

Welcome to part 3 of my post about TDD objections. Today's gonna be the last part of my comments on common objections to TDD. Again, I'm not an expert on these things - I'm merely writing down my thoughts and experiences, so any comment is appreciated! I'd love to know your objections, which ones you most often run into, how you deal (or dealt) with them etc.

Where did we leave the last time?

Ok, let's take on the next set of objections from the list:

  1. Not enough time
  2. TDD will slow us down, because we'll have to create a lot of additional code
  3. TDD will slow us down, because we'll have to maintain a lot of additional code
  4. We don't have the necessary skills or experience
  5. We don't have the necessary tools
  6. (This one is unit testing specific) There are already other kinds of tests, we don't need unit tests?
  7. It's very hard to perform with legacy code
  8. I already know how to write a testable code, TDD will not really improve my design
  9. We've got enough quality and we're doing fine without TDD
  10. My manager won't let me do TDD
  11. There's no "scientific proof" or enough research on whether TDD really provides a return on investment

The ones I'm gonna talk about today are the ones marked in bold. Let's go then!

8. I already know how to write a testable code, TDD will not really improve my design

Personally, I know several engineers who actually learned how to write testable code without TDD. They usually added unit tests after the code and, over time, learned to structure the code in such a way that it allows dependency injection, contains a small number of private methods etc. If your mates are on this level, it's really not that bad! However, I observed a few deficiencies of such an approach when compared to TDD:

  1. It is actually harder to learn how to write testable code without TDD. Also, many hours are wasted on refactoring before one grasps how to do it. In TDD, testability is built-in. So why choose the hard way?
  2. Test-first is often referred to as an "analysis step". You get the chance to take a step back and analyze what the work-flow should be, which behaviors have to be supported etc. Sure, one can use programming by intention to design classes and methods in an outside-in fashion, but when coding, you always see just the current method, or the current statement. In contrast, the unit spec/test forces you to consider the whole class, from object creation through invoking triggers up to getting a response. This perspective is more useful for analysis (and design) purposes and it is retained each time a new unit-level spec is written.
  3. The benefits of using TDD as a design-aiding tool are not only that you make testable classes that allow dependency injection. The benefits also lie in making you focus on just what you actually need, without bothering you with what you don't. In TDD, you state your expectations one at a time, striving for the simplest solution. This leaves less room for over-generic, "just-in-case", framework-like overdesigned structures. Now, don't get me wrong, generic designs are great and even necessary where genericity is needed and well justified; however, I believe that overdesign is almost as bad as underdesign.
  4. There are other things I have already written about that tend to happen in the test-after approach

9. We've got enough quality and we're doing fine without TDD

This was actually responded to by James Grenning on his blog and the response is so good that I have really nothing to add. Go read it.

10. My manager won't let me do TDD

Putting things this way is usually a mental shortcut that might mean different things depending on the context. The most literal interpretation (like receiving a direct order not to do TDD) is almost never the case.

Sometimes, this is merely an excuse. Putting the blame or responsibility on the supervisor is one of the most popular ways of saying "I want someone to take the responsibility instead of me" and may be an example of Cover Your Ass Engineering.

Let me share with you my experience on the topic: in my first project, I thought my boss wouldn't allow me to refactor. But I did - and in the end, no one questioned it. In my second project, I thought my boss wouldn't allow me to use smart pointers. But I did - and I was thanked for it. In my third project, I thought my boss wouldn't allow me to use mock objects. But I did - and it was recognized. What's the moral? It's that often it is you who has the authority to make technical decisions. When you put responsibility on people who don't have the authority, they will hold back - this is natural. So go ahead, make the decision and take the responsibility. If it ends with success, take the gratitude. If it fails, take the blame. That's the best way to achieve something.

Sometimes, such complaint may also mean lack of support. In such situations, you can try out two steps that I know help dealing with this:

  1. Build your position - either internally (within the company) or externally (within the wider community) or both. This can be done by participating in open source projects, creating a blog, taking part in conferences, organizing trainings etc. When you are recognized in the community as an expert, your authority rises and people will get persuaded by you more easily.
  2. Instead of hoping your boss will create a plan and make you a part of it, create your own plan and show your boss how he can be a part of it. Show your boss what you want to achieve and what you need from them in order to be able to achieve this.

This whole section put in short is: "you're the one in charge. Believe it!".

11. There's no "scientific proof" or enough research on whether TDD really provides a return on investment

Oh really...

Let's put it this way - I could really try to convince somebody by listing the references, trying to analyze them, pick some arguments etc. and wait for the other person to say "Ok, I'm convinced" instead of something like "that doesn't convince me, because my case is special and these things do not apply". I could really try and do this.

But I won't. Why?

I won't, because I think this argument is usually raised in highly unfair ways. Ever wondered why anybody needs scientific proof for TDD while they didn't need any for daily stand-ups, four-week sprints, retrospectives or any other agile or non-agile practices they've already incorporated? Can they show you the scientific research that led them to adopt all these techniques and methodologies? If so, THEN let's talk about scientific proof for TDD (and it may be a very fruitful discussion). Otherwise, no.

From my observation, the "need for scientific proof" arises when someone simply doesn't want to adopt TDD and is using this argument as a means of saying "I don't wanna do this unless you prove to me that I have no other choice". There may be some reasons why this person doesn't want to try out TDD and they may be VERY good (and worth discussing), but the "scientific proof" argument itself is usually a facade.

Summary

This article ends the series of my comments on common TDD objections. At least for now. I hope you had as much fun reading this as I had writing.

See ya!

Sunday, 30 December 2012

Test Driven Development - objections, part 2

Welcome to part 2 of my post about TDD objections.

By the way, I found an analysis on slideshare on the same topic. It's an interesting read, so be sure to take a look.

Where did we leave the last time?

Ok, let's take on the next set of objections from the list:

  1. Not enough time
  2. TDD will slow us down, because we'll have to create a lot of additional code
  3. TDD will slow us down, because we'll have to maintain a lot of additional code
  4. We don't have the necessary skills or experience
  5. We don't have the necessary tools
  6. (This one is unit testing specific) There are already other kinds of tests, we don't need unit tests?
  7. It's very hard to perform with legacy code
  8. I already know how to write a testable code, TDD will not really improve my design
  9. We've got enough quality and we're doing fine without TDD
  10. My manager won't let me do TDD
  11. There's no "scientific proof" or enough research on whether TDD really provides a return on investment

The ones I'm gonna talk about today are the ones marked in bold. Let's go then!

4. We don't have the necessary skills or experience

This is a valid impediment. However, it's a common one with any skill. You see, it's extremely rare for someone to be proficient with a technique the first time they use it (think of the times when you learned how to use a keyboard and a mouse). Most techniques require some time and knowledge to master - the same goes for TDD. Let's say that TDD is like a flower, where the core is the "Red-Green-Refactor" cycle and there are a lot of petals - good practices and heuristics. These include: need-driven design, triangulation, mocking, listening to tests etc.

Thankfully, lack of skills can be dealt with by providing training, mentoring, books, doing katas, randoris and by the teams drawing conclusions out of their own experience. In other words, this is a temporary obstacle. Of course, together with staff rotation in your team comes the need to renew part of the investment in skills, knowledge and experience.

5. We don't have the necessary tools

This too is a valid impediment. TDD is a kind of process that can be made a lot easier with tools. There are many areas where tools can improve the overall performance with TDD. These include:

  1. Code navigation
  2. Running unit tests
  3. Automated refactoring
  4. Code Analysis
  5. Quick fixes (like "generate method signature from its use")
  6. Continuous Testing
  7. Code generation

...etc.

If the issue is "money related" (e.g. there are tools on the market, but they have to be bought), then the management should consider buying the best tools that are on the market and available within the company budget. Thus, it's always good to have management buy-in for TDD. Thankfully, most of these tools (excluding the ones related to continuous testing) give a performance boost regardless of whether someone is using TDD or not, so many teams have such tools anyway.

If the issue is "technology related" (there are no tools on the market that work well with the technology your team is using), the good news is that TDD is usable even without these tools and still provides a lot of its benefits, since it is about analysis, design and specification. The tools only (well, "only" :-)) help in keeping focus and going faster through the whole cycle.

6. (This one is unit testing specific) There are already other kinds of tests, we don't need unit tests?

Although TDD is not solely limited to unit-level specs/tests, they form a crucial part of it. While some people believe that we can skip higher-level specs in TDD, there's no one I know who says you can skip the unit level.

Anyway, let's get to the point. This objection is based on the premise that TDD is about testing and that the "unit tests" produced with it are merely another kind of test. When I hear people raising this argument, they talk a lot about coverage, defect discovery, strengthening the testing net etc. Such arguments, for the most part, miss the point, since they ignore the biggest benefits of TDD, which do not lie in its testing aspect (which I have already discussed).

Dealing with such objections is really hard, because it requires others to accept a different point of view than the one they kept believing in so far. Sure, the books and authorities are on your side, but hey, the guys that read books and listen to authorities don't usually need convincing! So what to do in such case?

Remember, this argument is usually raised in teams that don't even do unit testing, let alone TDD. The following options are available:

  1. Point the opponents to the blog posts and literature - but be careful with this. If you do it wrong, the other side may take it as questioning their professionalism. Also, they may just question the authorities you believe in - this is rather easy in the software engineering world - they can just say "that doesn't convince me at all" and the game is over. You have to somehow show that you're passionate about TDD and point at such sources as something that made you passionate. In other words, if you want to lead someone out of their biases, tell them that you too were led out before them and tell them how. This leaves them in a position of "look, there's a way to do things better" instead of "you're all idiots and only I know the truth". I'm hardly an expert on this matter; however, there are some "change patterns" that are worth checking out.
  2. Talk about your experiences from your previous projects if you have any, or bring someone else if you don't. There are not many arguments as convincing as "I have experienced it myself", especially to the guys that don't have any experience in a certain field and are in the process of evaluating whether it makes sense to enter this field. Also, some people are more convinced by "soft" arguments ("we did it like this before and everybody in the team said that they felt an improvement in their coding style as never before"), while others are better convinced by "hard" arguments ("we did it like this before and we were able to get the best results of code complexity analysis in the whole project, plus we were able to demonstrate a print-out of 40 pages of documentation that just came out for free as we did TDD"). Also, better than words are living proofs ("you can just ask those guys I worked with and they'll show you what our living documentation looks like" or "just ask the guys that were there with me how they feel about it").
  3. Create an internal training. This has one big advantage and two small disadvantages. The advantage is that you have a lot of time (at least longer than on regular meetings) to lead people by the hand through your argumentation, reasons, examples and so on. In other words, you're given a chance to give a fuller explanation of the idea. The first disadvantage is that usually you get to prepare for such a training in your free time (since management usually approves only spending time on giving the training, not on preparing it). The second disadvantage is that the people you want to convince can simply ignore the invitation and not attend at all, or (if they're forced to attend) they can spend the whole training doing their stuff on their laptops.

7. It's very hard to perform with legacy code

The version of this objection that I find valid is: "It's harder to perform with legacy code". Why? Because TDD requires the code to have a quality called "testability". Legacy code doesn't usually have it. Thus, the code has to be refactored, risking breaking something that already works (if you know how to do this, the risk is significantly lower, but there's still some risk). On the other hand, there is this temptation to just "insert a little if clause right in the middle of this mess and be done with it".
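As an illustration (with hypothetical names, not taken from any particular codebase), the first refactoring towards testability is typically to extract a hard-wired dependency behind an interface so it can be injected - creating a seam where a test can substitute a fake for the real database:

```csharp
public interface IInvoiceRepository
{
    Invoice Load(int invoiceId);
    void Save(Invoice invoice);
}

public class Invoice
{
    public bool Approved { get; private set; }
    public void MarkApproved() { Approved = true; }
}

public class InvoiceService
{
    // The legacy version did "var repository = new SqlInvoiceRepository();"
    // inline, so the class could not be instantiated without a real database.
    private readonly IInvoiceRepository _repository;

    public InvoiceService(IInvoiceRepository repository)
    {
        _repository = repository;
    }

    public void Approve(int invoiceId)
    {
        var invoice = _repository.Load(invoiceId);
        invoice.MarkApproved();
        _repository.Save(invoice);
    }
}
```

A test can now pass in an in-memory fake repository and verify Approve() without touching the mess around it - but making this change safely on real legacy code is exactly the skill the books below teach.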

Anyway, the strategy to deal with this objection is to make it clear that it's either the "we don't have the necessary skills" or the "we don't have the necessary tools" objection, just dressed differently. In the first case, take a look at the following books:

  1. Refactoring by Martin Fowler
  2. Working Effectively with Legacy Code by Michael Feathers
  3. Behead Your Legacy Beast. Refactor and Restructure Relentlessly With The Mikado Method (free e-book) by Daniel Brolund and Ola Ellnestam

and in the second case, talk to your boss and make sure that your team has the tools it needs.

Part 3 is already published! Go ahead and read it!

In the meantime, I'll be happy to hear your comments. I'm not an oracle and would gladly learn more about what objections do people receive and how they dispel them.

Thursday, 27 December 2012

Test Driven Development - objections, part 1

Since you're reading this blog, you probably have your own, more or less informed, view on TDD. Maybe you've already read a book or two, maybe you've got two (or twelve) years of TDD practice under your belt, but maybe you've only recently heard about TDD and about it being "cool" and are merely striving to learn more? If that's the case, you've probably got a lot of doubts about whether TDD is really beneficial or whether it will prove beneficial in the specific environment you happen to work in. You're eager to get started, but you wonder whether the time and effort spent on learning TDD will prove to be well spent.

During my adventure with TDD, I encountered many objections, either from myself (and they were dispelled by others) or from other engineers (these I tried to dispel myself). Today, I'd like to comment on some of those objections I met with when talking about TDD. In particular:

  1. Not enough time
  2. TDD will slow us down, because we'll have to create a lot of additional code
  3. TDD will slow us down, because we'll have to maintain a lot of additional code
  4. We don't have the necessary skills or experience
  5. We don't have the necessary tools
  6. (This one is unit testing specific) There are already other kinds of tests, we don't need unit tests?
  7. It's very hard to perform with legacy code
  8. I already know how to write a testable code, TDD will not really improve my design
  9. We've got enough quality and we're doing fine without TDD
  10. My manager won't let me do TDD
  11. There's no "scientific proof" or enough research on whether TDD really provides a return on investment

This post covers the first three points, marked in bold. The later points will be discussed in part 2 of this post. Ok, let's go!

1. Not enough time

This is my favorite objection, just because dispelling it is so fun and easy. I usually approach this kind of objection with Gojko Adzic's argument on how to solve "not enough time". You see, "not enough time" is always a facade reason - it is there only to hide the true one. The only direct solution to "not enough time" is to "make more time", which, by the way, is impossible. Luckily, we can restate it into something solvable like "something prevents me from allocating time for TDD". The further actions depend on what this "something" is. Maybe it's a boss that will punish employees for not fulfilling short-term goals? Maybe it's a lack of appropriate knowledge or training? Anyway, these issues can be dealt with. "Not enough time" can't.

2. TDD will slow us down, because we'll have to create a lot of additional code

This is actually a (partially) valid argument. TDD really does lead to creating more code, which costs time. However, this does not necessarily mean that the overall development process takes more time than it would without this additional code. This is because the "test code" is not being written just for the sake of "being written" :-). The act of writing, and the code that is created as a result, provide great value to a single developer and an entire team.

TDD aids developers in the process of analysis. What would otherwise be a mental effort to make everything up inside your head turns into concrete failing or impossible-to-write code. This generates questions that are usually better asked sooner than later. Thanks to this, there's no escape from facing uncertainty.

The case is different (at least for me) when developing without TDD. In this approach, I discovered that I tend to write what I know first, leaving what I don't know for later (in the hope that maybe I'll learn something new along the way that will answer my questions and lower the uncertainty). While the process of learning is surely valuable, uncertainty must be embraced instead of avoided. Getting away from answering some key questions and leaving them for later generates a lot of rework, usually at the end of an iteration, where it's very dangerous.

When I do TDD and I encounter a behavior that I don't know how to specify, I go talk with stakeholders ("Hey, what should happen when someone creates a subscription for magazines already issued?"). Also, if someone asks me to clarify my question, I can show them the spec/test I'm trying to write for the behavior and say "look, this does not work" or "so the output should be what?". This way, I can get many issues, if not solved, then at least on the table, much sooner than in the case of non-TDD coding. It helps me eliminate rework, save some time and make the whole process more predictable.
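For example, a half-written spec like this one (all the names are hypothetical, and the spec is deliberately incomplete) turns a vague doubt into a concrete question to bring to a stakeholder:

```csharp
[Test]
public void ShouldDoWhatExactlyWhenSubscriptionIsCreatedForAlreadyIssuedMagazine()
{
    var office = new SubscriptionOffice();
    var alreadyIssued = new Magazine(issueDate: LastMonth());

    office.CreateSubscriptionFor(alreadyIssued);

    // ...and now what do I assert? An exception? A backdated subscription
    // that includes the past issue? This is where I stop coding and go ask.
}
```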

TDD also provides big help when designing. It lets you do your design outside-in, beginning with the domain and its work-flows, not the reusable, versatile, framework-like, one-size-fits-all utility objects that end up having only 30% of their logic ever used. This way, TDD lets you avoid over-design (by the way, code reuse is a good thing, it's just that premature generalization is as dangerous as premature optimization).

On the other hand, by promoting a quality called "testability", it promotes loose coupling, high cohesion and small, focused, well encapsulated objects. I already did some posts on that, so I'm not gonna delve more into this topic here. Anyway, striving for high testability helps avoid under-design.

Another way to think about the process of doing TDD is that you're actually documenting your design, its assumptions, legal and illegal behaviors. Others will be able to read it and learn from it when they face the task of using your abstractions in a new context.

3. TDD will slow us down, because we'll have to maintain a lot of additional code

It's true that, when done wrong, TDD produces a mass of code that is hard to maintain. Refactoring becomes a pain, each added functionality breaks dozens of existing specs/tests and the teams seriously consider abandoning the practice.

This happens more often when teams do not really adopt TDD, but rather stick with unit tests written after the code and only for the sake of testing.

The truth is that TDD, when done right, helps avoid such situations. Also, this help is actually one of its goals! To achieve this, however, you need two things.

The first one is knowing the TDD good practices that help you write only the specs/tests that you need, focus on behavior and discover interactions between different objects, limiting the impact of a change to a small number of specs/tests. I actually haven't written about this on my blog yet, but there are other sources of information. Anyway, this issue is easily solvable by a combination of training, mentoring, books and experience.

The second thing is "listening to the tests", covered both by Steve Freeman & Nat Pryce (they call it Synaesthesia) and Amir Kolsky & Scott Bain (they call it Test Reflexology). The big idea is that difficulties in writing and maintaining specs/tests are very much desired feedback on the quality of your design (also make sure to look at James Grenning's post on TDD as a design rot radar).

In other words, as long as the design is good and you know how to write tests/specs, this whole objection is not a problem. Of course, there is still code to maintain, but I have found that to be an easy task.

Another thing to keep in mind is that by maintaining specs/tests, you're actually maintaining living documentation on several levels (because TDD is not solely limited to the unit level). Just think how much effort it takes to keep NDoc/JDoc/Doxygen documentation up to date - and you never actually know whether such documentation speaks the truth after a year of maintenance. Things are better with tests/specs, which can be compared with the code just by running them, so the maintenance is easier.

Part 2 is already written! Read it!

Also, feel free to leave your comment. How do you deal with these objections when you encounter them? Do you have any patterns you can share?