Friday, 14 December 2012

Reusing code in Test Data Builders in C#

This is a C# version of a post I did a while ago. Hope it's of any use.

Test Data Builders

Sometimes, when writing unit or acceptance tests, it's a good idea to use Test Data Builder. For example, let's take a network frame that has two fields - one for source, one for destination. A builder for such frame could look like this:

public class FrameBuilder
{
  private string _source;
  private string _destination;

  public FrameBuilder Source(string newSource)
  {
    _source = newSource;
    return this;
  }

  public FrameBuilder Destination(string newDestination)
  {
    _destination = newDestination;
    return this;
  }

  public Frame Build()
  {
    var frame = new Frame()
    {
      Source = _source,
      Destination = _destination,
    };
    return frame;
  }
}

and it can be used like this:

var frame = new FrameBuilder().Source("A").Destination("B").Build();

The issue with Test Data Builder method reuse

The pattern is fairly easy, but things get complicated when we have a whole family of frames, each sharing the same set of fields. If we wanted to write a separate builder for each frame, we'd end up duplicating a lot of code. So another idea is inheritance. However, taking the naive approach gets us into some trouble. Let's see it in action:

public abstract class FrameBuilder
{
  protected string _source;
  protected string _destination;
  
  public FrameBuilder Source(string newSource)
  {
    _source = newSource;
    return this;
  }
  
  public FrameBuilder Destination(string newDestination)
  {
    _destination = newDestination;
    return this;
  }
  
  public abstract Frame Build();
};

public class AuthorizationFrameBuilder : FrameBuilder
{
  private string _password;
  public AuthorizationFrameBuilder Password(string newPassword)
  {
    _password = newPassword;
    return this;
  }
  
  public override Frame Build()
  {
    var authorizationFrame = new AuthorizationFrame()
    {
      Source = _source,
      Destination = _destination,
      Password = _password,
    };
    return authorizationFrame;
  }
}

The difficulty with this approach is that all calls to FrameBuilder methods return a reference to FrameBuilder, not an AuthorizationFrameBuilder, so we cannot use calls from the latter after calls from the first. E.g. we cannot make a chain like this:

new AuthorizationFrameBuilder().Source("b").Password("a").Build();
  
This is because Source() method returns FrameBuilder, which doesn't include a method called Password() at all. Such chains cause compile errors.

Generics to the rescue!

Fortuntely, there's a solution for this. Generics! Yes, they can help us here, but in order to do this, we have to use a trick that in C++ world is called "Curiously Recurring Template Pattern" (don't know if it has any name in the C# world). By using the trick, we'll force the FrameBuilder (superclass) methods to return reference to its subclass (AuthorizationFrameBuilder) - this will allow us to mix methods from FrameBuilder and AuthorizationFrameBuilder in any order in a chain, because each method, not matter which of the classes it is defined in, returns a reference to a subclass.

Let's see how this works out in the following example:

//thankfully, the following is legal :-)
public class FrameBuilder<T> where T : FrameBuilder<T>
{
  protected string _source;
  protected string _destination;

  public T Source(string newSource)
  {
    _source = newSource;
    return this as T;
  }
  
  public T Destination(string newDestination)
  {
    _destination = newDestination;
    return this as T;
  }
}

public class AuthorizationFrameBuilder 
  : FrameBuilder<AuthorizationFrameBuilder>
{
  string _password;

  public AuthorizationFrameBuilder Password(string password)
  {
    _password = password;
    return this;
  }
  
  public AuthorizationFrame Build()
  {
    var frame = new AuthorizationFrame()
    {
      Source = _source,
      Destination = _destination,
      Password = _password,
    };
    return frame;
  }
}

Note that in FrameBuilder, the this pointer is cast to its generic type, which happens to be the sublass on which the methods are actually called. this cast is identical in every method of FrameBuilder, so it can be captured inside a separate method or property like this:

  T This 
  {
    get { return this as T; }
  }
    
  public T Source(string newSource)
  {
    _source = newSource;
    return This;
  }

Summary

This solution makes it easy to reuse any number of methods in any number of different builders, so it's a real treasure when we've got many data structures that happen to share some common fields.

That's all for today - Hope you enjoy the C#-specific version of the solution. By the way, in contrast to C++, C# allows one or two solutions other than chained methods to achieve the same data-building result. Can you name them?

No comments: