Sunday 26 August 2012

Mocking method chains part 2: types of chains and fluent interfaces

Hi, today, I’m going to continue the discussion I started on mocking and specifying call chains. Last time, I discussed when call chains are a symptom of bad design and bad coding that violates a few object oriented design principles. This time, I’m gonna take on situations when call chains are a good idea or even a sophisticated design technique.

Basically, the three groups we can distinguish (by the criteria of how they’re mocked), are:

  1. Unordered chains
  2. Ordered chains
  3. Chains with grammar

The post concludes with a brief discussion of the terminology and how it relates to the terms ‘Fluent Interface’ and ‘Internal Domain Specific Language’, since it seems that these terms are often confused with call chaining.

But for now, let’s focus on the three categories:

1. Unordered chains

Let’s imagine that our system sends frames through a network interface and the frame has to be built according to a specified format. To allow building such frames easily, we introduce a builder pattern implementation to hide the format and allow us to specify each field freely. The builder can be used like this:

var frame = new FrameWithParameters()
  .Speed(10)
  .Age(1)
  .DestinationIp("192.168.0.2")
  .Build();

This is a situation where we don’t care about the order of the chained calls. Quite the contrary - when using the builder, we’re relieved of having to remember all the parameters, how many of them there are, which ones go to the constructor and which ones are set with setters. Also, we let the builder supply default values for the parameters we don’t specify. So we can build another frame like this:

var frame = new FrameWithParameters()
 .DestinationIp("192.168.0.4")
 .Age(2) 
 .Build();

As you can see, neither a particular order of calls nor the use of all the available calls is required.

Just as in the example from part 1, this kind of API is invented as a shortcut to group multiple invocations under one object and make this grouping obvious. The difference is that here, the invocations are NOT made in order to browse through object’s dependencies, but to reduce the “noise” that would result from specifying the object name before each method call.

Mocking unordered chains.

Before I show you how unordered chains can be mocked/specified, we need to take a look at how such objects are usually implemented. We’ll use the builder example, but modify it so that FrameWithParameters is an interface rather than a concrete class:

public interface FrameWithParameters 
{ 
  FrameWithParameters Speed(int value); 
  FrameWithParameters Age(int value); 
  FrameWithParameters DestinationIp(string value); 
  Frame Build(); 
}

And the implementation would look like this:

public class ProtocolXFrameWithParameters 
  : FrameWithParameters
{ 
  // defaults used for the values we don't specify
  int speed = 1; 
  int age = 0; 
  string destinationIp = "127.0.0.1";

  public FrameWithParameters Speed(int value) 
  { 
    this.speed = value; 
    return this; 
  }

  public FrameWithParameters Age(int value) 
  { 
    this.age = value; 
    return this; 
  }

  public FrameWithParameters DestinationIp(string value) 
  { 
    this.destinationIp = value; 
    return this; 
  }
  
  public Frame Build() 
  { 
    var frame = new XProtocolFrame(destinationIp); 
    frame.Age = this.age; 
    frame.Speed = this.speed; 
    return frame; 
  } 
}

So as you can see, all the methods responsible for configuration return the same object that they were called on. When mocking such interfaces, we have to preserve this behavior - then we’ll be able to verify all calls on the same mock.

Thankfully, many mocking frameworks allow us to make a single setup for all methods returning a given type. An example for the Moq framework:

var builderMock = new Mock<FrameWithParameters>();
builderMock.SetReturnsDefault<FrameWithParameters>(builderMock.Object);

And FakeItEasy framework:

var fakeBuilder = A.Fake<FrameWithParameters>(); 
A.CallTo(fakeBuilder)
  .WithReturnType<FrameWithParameters>()
  .Returns(fakeBuilder);

As far as I know, NSubstitute does not offer such a feature, but you’re welcome to prove me wrong in the comments section. For all mocking frameworks that do not support this feature (and for manual mocks), you have to set up the return value for each method separately. Feel free to do this in a helper method, because extracting such code from the direct specification code will not hinder readability. An example of such a helper method for NSubstitute:

static void UnorderedCallChainingIsSupportedBy
  (FrameWithParameters builder) 
{ 
  builder.Speed(Arg.Any<int>()).Returns(builder); 
  builder.Age(Arg.Any<int>()).Returns(builder); 
  builder.DestinationIp(Arg.Any<string>()).Returns(builder); 
}
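
For manual mocks, the same "return yourself" rule can be written by hand. What follows is only a sketch under a few assumptions: RecordingFrameBuilder and the text format of the recorded calls are names I made up for this illustration, and Frame is reduced to an empty placeholder. The calls go into a set rather than a list, so the order in which they were made does not take part in the verification:

```csharp
using System.Collections.Generic;

public class Frame { } // empty placeholder for the real frame type

public interface FrameWithParameters
{
  FrameWithParameters Speed(int value);
  FrameWithParameters Age(int value);
  FrameWithParameters DestinationIp(string value);
  Frame Build();
}

// Hypothetical hand-written mock: remembers each call and returns itself,
// so the whole chain ends up recorded on one object
public class RecordingFrameBuilder : FrameWithParameters
{
  // a set, because the order of calls is irrelevant for unordered chains
  public readonly HashSet<string> Calls = new HashSet<string>();

  public FrameWithParameters Speed(int value)
  {
    return Record("Speed(" + value + ")");
  }

  public FrameWithParameters Age(int value)
  {
    return Record("Age(" + value + ")");
  }

  public FrameWithParameters DestinationIp(string value)
  {
    return Record("DestinationIp(" + value + ")");
  }

  public Frame Build()
  {
    Calls.Add("Build()");
    return new Frame();
  }

  FrameWithParameters Record(string call)
  {
    Calls.Add(call);
    return this; // same object, just like the real builder
  }
}
```

After the tested code runs something like builder.DestinationIp("192.168.0.4").Age(2).Build(), a check such as builder.Calls.Contains("Age(2)") passes no matter where in the chain Age() was called.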

Ok, that’s pretty much it. Now for ordered chains.

2. Ordered chains

This is a case when making the calls in a different order gives different semantic results. Let’s take a look at a simplified example of a stream parser. The stream consists of fields, each having a name and a fixed length. The role of the parser is to chop off subsequent fields from the stream and place them in a dictionary.

stream
  .CHAR("Author", 20) 
  .INT("Format", 12) 
  .DATE("Created", 8) 
  .BIN("Content", 300);

Note that from the API design point of view, nothing prevents us from switching the order of invocations, but if we did that, the stream would be split in a different way, so the final effect of stream processing would be different - the fields would have different values.
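
To make the effect of reordering concrete, here is a heavily simplified sketch of such a parser. It is not the parser from the example above: TextStreamParser is a made-up name, the input is a plain string, only two field types are included, and both are kept as raw text.

```csharp
using System.Collections.Generic;

// Hypothetical, simplified parser: each call chops the next `length`
// characters off the input and stores them under the given field name
public class TextStreamParser
{
  readonly string input;
  int position = 0;

  public readonly Dictionary<string, string> Fields
    = new Dictionary<string, string>();

  public TextStreamParser(string input)
  {
    this.input = input;
  }

  public TextStreamParser CHAR(string name, int length)
  {
    return Chop(name, length);
  }

  public TextStreamParser INT(string name, int length)
  {
    return Chop(name, length);
  }

  TextStreamParser Chop(string name, int length)
  {
    Fields[name] = input.Substring(position, length);
    position += length; // the next call starts where this one ended
    return this;
  }
}
```

With the input "Bob07", the chain .CHAR("Author", 3).INT("Format", 2) stores "Bob" as Author and "07" as Format. Reverse the two calls and the same input gives "Bo" as Format and "b07" as Author. Both chains compile and run, they just mean different things.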

Mocking this kind of invocation is sometimes easier, sometimes harder than mocking unordered chains - it all depends on which framework you use. So, for example, verifying the stream parser call chain described above in Moq looks like this:

var streamMock = new Mock<IParser>() 
{ 
  DefaultValue = DefaultValue.Mock 
};

// perform some actions that result in call chain

streamMock.Verify(m => m
  .CHAR("Author", 20) 
  .INT("Format", 12) 
  .DATE("Created", 8) 
  .BIN("Content", 300));

FakeItEasy does not allow using recursive mocks in verification, but you can use the same technique as with unordered chains and add ordered assertions (google ‘fakeiteasy ordered assertions’).

NSubstitute only allows stubbing recursive calls (at least with Mono), so one can use a small trick to work around this limitation:

var streamMock = Substitute.For<StreamParsing>(); 
var callSequence = Substitute.For<StreamParsing>();

streamMock
  .CHAR("Author", 20) 
  .INT("Format", 12) 
  .DATE("Created", 8) 
  .Returns(callSequence);

someLogic.PerformOn(streamMock);

callSequence.Received().BIN("Content", 300);

So as you see, we can stub the whole sequence except the last step. This sequence returns another mock that we use to verify just this last step. A little bit awkward, but it does the job.

To sum up, ordered chains need some non-standard tricks depending on what mocking framework you use. Usually, however, one can find a way to get the desired result.

3. Chains with grammar

Conceptually, this is the most interesting chaining technique. It is used to build an API that tries to mimic sentences. As you know, in a sentence, not every word is legal in every place: there are rules that define what can be used where, and these rules are called “grammar”. Let’s take a look at the following example:

api.Method("SetMaximumUserCount")
  .Requires(Constraint.AtLeastOneOf) 
  .Role("PowerUser") 
  .Role("Administrator");

Note that the above chain makes sense: invocation of a method called “SetMaximumUserCount” requires having at least one of the two roles specified: either a power user or an administrator. However, the following chain makes absolutely no sense:

api.Role("PowerUser") 
  .Method("SetMaximumUserCount") 
  .Requires(Constraint.AtLeastOneOf)
  .Role("Administrator");

The situation is different than in ordered chains, where any chain made sense, but different chains resulted in different behaviors. Here, we have “legal sentences” (legal chains) that produce a sensible result and “illegal sentences” (illegal chains) that do not make any sense at all. What we want to do is to prevent users of our API from creating incorrect chains, i.e. enforce some kind of “grammar”, preferably at compile time rather than at runtime.

The easiest way to do this is to distribute the methods into several types, so that a single call in the chain returns a result that contains only the methods legal in its context. For the example above, it could look like this:

public interface WebApi 
{ 
  WebMethodConfiguration Method(string name); 
}

public interface WebMethodConfiguration 
{ 
  RequirementConfiguration Requires(Constraint c); 
}

public interface RequirementConfiguration 
{ 
  RequirementConfiguration Role(string name); 
}

This way, the object returned by the previous call creates a context for the next call. E.g. calling Requires() twice in a chain is impossible - such code would not compile.
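
One possible way to implement these interfaces is a single class that implements all three and always returns itself, so that only the declared return type decides what can be called next. This is just a sketch: the AccessRules class and its public fields are invented for the illustration, and I’m assuming that Constraint is an enum (the AllOf member is made up as well).

```csharp
using System.Collections.Generic;

// Assumption: Constraint is an enum; AllOf is a made-up second member
public enum Constraint { AtLeastOneOf, AllOf }

public interface WebApi
{
  WebMethodConfiguration Method(string name);
}

public interface WebMethodConfiguration
{
  RequirementConfiguration Requires(Constraint c);
}

public interface RequirementConfiguration
{
  RequirementConfiguration Role(string name);
}

// Hypothetical implementation: one object behind all three interfaces.
// Each method returns `this`, but typed as the interface that exposes
// only the calls that are legal at that point of the "sentence".
public class AccessRules
  : WebApi, WebMethodConfiguration, RequirementConfiguration
{
  public string MethodName;
  public Constraint RequiredConstraint;
  public readonly List<string> Roles = new List<string>();

  public WebMethodConfiguration Method(string name)
  {
    MethodName = name;
    return this;
  }

  public RequirementConfiguration Requires(Constraint c)
  {
    RequiredConstraint = c;
    return this;
  }

  public RequirementConfiguration Role(string name)
  {
    Roles.Add(name);
    return this;
  }
}
```

As long as the client code holds the object as WebApi, the legal chain from the beginning of this section compiles, while starting the chain with Role() does not, because WebApi declares no such method.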

Mocking chains with grammar

Note that in the example above, the order of calls is mostly enforced by the “grammar”, but the last two calls are an exception - we want to be able to specify as many roles as we wish and the order of these role specifications is irrelevant. The conclusion is that parts of chains with grammar can be ordered as enforced by the “grammar”, parts can be ordered as enforced by the semantics of the produced code, and parts can be unordered.

From the mocking perspective, this is just scaling the solutions described in the first two sections to span several interfaces. Sometimes, you may even want to specify unordered calls as ordered if it makes your life easier - the penalty is having inaccurate documentation, since anyone looking at the unit test may think that the order is actually required where it’s not.

Chains and Fluent Interfaces

The last thing I’d like to clarify is why I avoid the term Fluent Interface throughout this post. The main reason is that this term is a bit misunderstood among novice programmers and it doesn’t matter so much from the perspective of mocking. They often make the simplification that the “fluency” in fluent interfaces is about the chaining itself. As I said, this is a misunderstanding. Method chaining is just one of the techniques to achieve fluent interfaces. Other techniques include: function sequence, function nesting, lambdas, closures, operator overloading, annotations/attributes, abstract syntax tree manipulation etc.

In general, there are three terms that need to be sorted out:

  1. Fluent Interface - an API created with “flow” and readability in mind, rather than commands and queries
  2. Embedded Domain Specific Language (Internal Domain Specific Language) - a type of API implemented in a general purpose programming language that reads like a mini-language of its own. This mini-language is targeted at a particular domain of problems (see rake syntax for example), i.e. mimics a language used by a domain expert as much as possible
  3. Method Chaining (Call Chaining) - described earlier

So, an API does not have to be all three at the same time. Embedded domain specific languages are (almost?) always created as fluent interfaces, because they have to be readable to the domain expert and the goal of fluent interfaces is readability. Fluent interfaces often use method chaining, but this is not always true. Also, fluent interfaces may use more than one technique at the same time for achieving “fluency”, call chaining being just one of them.
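
To show “fluency” achieved without any chaining, here is the role requirement from section 3 sketched with function nesting instead. Everything in it is made up for the illustration: the AccessSyntax class, its three functions, and the decision to build a plain sentence as a string where a real API would build some configuration object.

```csharp
// Hypothetical helper functions: the "sentence" is built by nesting
// calls inside each other instead of chaining them with dots
public static class AccessSyntax
{
  public static string Role(string name)
  {
    return "role '" + name + "'";
  }

  public static string AtLeastOneOf(params string[] roles)
  {
    return "at least one of: " + string.Join(", ", roles);
  }

  public static string Method(string name, string requirement)
  {
    return name + " requires " + requirement;
  }
}
```

A call such as AccessSyntax.Method("SetMaximumUserCount", AccessSyntax.AtLeastOneOf(AccessSyntax.Role("PowerUser"), AccessSyntax.Role("Administrator"))) returns the text "SetMaximumUserCount requires at least one of: role 'PowerUser', role 'Administrator'". It still reads like a sentence, and there is not a single method chain in it.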

There are a lot of places on the internet where you can deepen your knowledge on this topic. If there is demand, I can expand on what I wrote here, but for now, I’ll hold back.

Wrapping it up

Wow, this series was a great challenge both in terms of scope and teaching skills. If you think there are some improvements to be made or see some errors, please let me know in the comments, so that I can clarify or correct the material.

See ya!

Wednesday 15 August 2012

Mocking method chains part 1: When not to. Coupling, encapsulation, redundancy and the Law of Demeter.

Today I'd like to start discussing mocking of sequences of method call chains. This discussion will be split in two parts, part two coming soon.

The basics of call chains

When I say "call chain", I mean something like this:

a.X().Y().Z().V();

So basically we're just invoking methods on a result of another method. This is a construct that probably everyone programming in an object oriented language has come across, so I'll only note that I don't mean "call chain" merely as a synonym of invoking a method on a returned object (because it would be plain silly to make up a name for something that obvious), but rather the style of invoking methods "in a chain" as seen in the example above.

Call chains may be a design smell

Now, one thing I want to make clear is that there's a point in mocking method chains only in specific situations. This is because call chains can be either a design smell or a sophisticated API design technique. Today, I'm gonna talk a little bit about the first case, leaving the latter for part two.

Let's take a look at the following example of call chain being just a design smell (we're gonna discuss why in a minute):

bool wasWrittenInAmericanEnglish = message
  .GetWrappedFrame()
  .GetHeader()
  .GetLocale() == "en-US";

So, what's really wrong with this? I can guess that you've probably seen code like this many times before. The issue with the example above is that we're now coupled to all of the types of objects between the message object and the locale string. Also, we're coupled to the information that message language can be obtained by locale, as well as to the name of the en-US locale and so on.

Moreover, we're coupled to whether and when subsequent calls return valid objects. Let's say that there are some messages that do not contain a header, so we can't obtain locale information from them. We decide that we want to use the application's default locale in such a case - this decision will have to be properly handled by this code. If this isn't the only place where we need to follow this default locale policy, we've already introduced redundancy (oh, and did I say that we've violated encapsulation by failing to protect the information on how the locale is obtained?). Another example might be adding support for messages without a locale specified in the header.

Here's how ugly this kind of code may turn when only these two constraints are loosened:

const string EnUs = "en-US";
bool wasWrittenInAmericanEnglish 
  = EnUs == getCurrentAppLocale();

var wrappedFrame = message.GetWrappedFrame();

if(wrappedFrame.HasHeader())
{
  var header = wrappedFrame.GetHeader();
  var locale = header.GetLocale();
  if(locale != null)
  {
    wasWrittenInAmericanEnglish = locale == EnUs;
  }
}

See how the code was impacted by broken encapsulation? By the way, this is also a violation of the Law of Demeter.

As a side note, the issue does not have anything to do with the calls being chained. It may as well look like this without it changing the problem at all:

var frame = message.GetWrappedFrame();
var header = frame.GetHeader();
var locale = header.GetLocale();
var wasWrittenInAmericanEnglish = locale == "en-US";

So the chain in such smelly situations is rather a workaround than an API design technique. The workaround is for typing too much and dealing with too many variables that are only used to obtain other variables.

The diagnosis and the cure

We mainly run into this kind of code in four situations:

  1. We introduce it during bottom-up coding (because we have access to all the data we need, we just need to "reach it through the dots")
  2. We introduce it when not using TDD, because there is nothing to tell us loud enough that it's wrong
  3. We introduce it by abusing mocking frameworks (whose authors usually add support for mocking of method chains, but not with such situations in mind)
  4. We encounter it in legacy code

In any case, you don't want to mock the method chain. I mean, really really, you don't want to mock it. Or specify it.

What you DO want to do is to refactor it into something like this:

bool wasWrittenInAmericanEnglish 
  = message.WasWrittenIn(Languages.AmericanEnglish);

Here, the code is decoupled from all the information about the encapsulated calls - it doesn't know the structure of message object or the locale format used, or even that the check is made by comparing locale values of any sort. We're safe and sound.
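
To give an idea of where the logic goes after such a refactoring, here is one possible shape of the message side. Treat it as a sketch under loud assumptions: the constructors, the way the default locale is passed in, and Languages.AmericanEnglish being a string constant are all invented here, and a real design would probably push the header and locale checks further down into the frame.

```csharp
// Assumption: a language is represented by its locale name
public static class Languages
{
  public const string AmericanEnglish = "en-US";
}

public class Header
{
  readonly string locale; // null when the header carries no locale

  public Header(string locale) { this.locale = locale; }
  public string GetLocale() { return locale; }
}

public class WrappedFrame
{
  readonly Header header; // null when the frame has no header

  public WrappedFrame(Header header) { this.header = header; }
  public bool HasHeader() { return header != null; }
  public Header GetHeader() { return header; }
}

public class Message
{
  readonly WrappedFrame wrappedFrame;
  readonly string defaultLocale; // hypothetical: injected app default

  public Message(WrappedFrame wrappedFrame, string defaultLocale)
  {
    this.wrappedFrame = wrappedFrame;
    this.defaultLocale = defaultLocale;
  }

  // The only thing client code sees; the default locale policy
  // lives here, in one place
  public bool WasWrittenIn(string language)
  {
    return LocaleOrDefault() == language;
  }

  string LocaleOrDefault()
  {
    if (!wrappedFrame.HasHeader())
    {
      return defaultLocale;
    }
    return wrappedFrame.GetHeader().GetLocale() ?? defaultLocale;
  }
}
```

With this in place, a message without a header (or with a header that has no locale) falls back to the default locale, and no caller has to repeat that policy.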

Also, it's very straightforward to mock, even with manual mocks. That's one of the reasons some of the TDD gurus suggest using manual mocks (at least when getting started) - they provide more of the diagnostic pain that points to where the design is going wrong.

And how to avoid running into such smelly chains when writing new code? My favorite technique is to use Need Driven Development, at the same time forgetting about the call chain mocking feature of some of the mocking frameworks.

Ok, that's it for today. In the next installment, we'll take a look at some proper usage of call chains and three classes of such chains: ordered chains, unordered chains and chains with grammar. Stay tuned!