Wednesday, 15 August 2012

Mocking method chains part 1: When not to. Coupling, encapsulation, redundancy and the Law of Demeter.

Today I'd like to start discussing mocking of sequences of method call chains. This discussion will be split in two parts, part two coming soon.

The basics of call chains

When I say "call chain", I mean something like this:


So basically we're just invoking methods on a result of another method. This is a construct that probably everyone programming in an object oriented language have come across, so I'll only note that I don't mean "call chain" merely as a synonym of invoking method on returned object (because it would be plain silly to make up a name for something that obvious), but rather the style of invoking methods "in a chain" as seen in the example above.

Call chains may be a design smell

Now, one thing I want to make clear is that there's a point in mocking method chains only in specific situations. This is because call chains can be either a design smell or a sophisticated API design technique. Today, I'm gonna talk a little bit about the first case, leaving the latter for part two.

Let's take a look at the following example of call chain being just a design smell (we're gonna discuss why in a minute):

bool wasWrittenInAmericanEnglish = message
  .GetLocale() == "en-US";

So, what's really wrong with this? I can guess that you've probably seen code like this many times before. The issue with the example above is that we're now coupled to all of the types of objects between the message object and the locale string. Also, we're coupled to the information that message language can be obtained by locale, as well as to the name of the en-US locale and so on.

Moreover, we're coupled to if and when will subsequent calls return valid objects. Let's say that there are some messages that do not contain a header and we can't obtain a locale information. We decide that we want to use application's default locale in such case - this decision will have to be properly handled by this code. If this isn't the only place where we need to follow this default locale policy, we've already introduced redundancy (oh, and did I say that we've violated encapsulation by failing to protect the information on how the locale is obtained?). Another example might be adding support for messages without locale specified in the header.

Here's how ugly may this kind of code turn when only these two constrains are loosened:

const string EnUs = "en-US";
bool wasWrittenInAmericanEnglish 
  = EnUs == getCurrentAppLocale();

var wrappedFrame = message.GetWrappedFrame();

  var header = message.GetHeader();
  var locale = header.GetLocale();
  if(locale != null)
    wasWrittenInAmericanEnglish = locale == EnUs;

See how the code was impacted by broken encapsulation? By the way, this is also a violation of Law of Demeter.

As a side note, the issue does not have anything to do with the calls being chained. It may as well look like this without it changing the problem at all:

var frame = message.GetWrappedFrame();
var header = frame.GetHeader();
var locale = header.GetLocale();
var wasWrittenInAmericanEnglish = locale == "en-US";

So the chain in such smelly situations is rather a workaround than an API design technique. The workaround is for typing too much and dealing with too many variables that are only used to obtain other variables.

The diagnosis and the cure

We mainly run into this kind of code in four situations:

  1. We introduce it during bottom-up coding (because we have access to all the data wee need, just need to "reach it through the dots")
  2. We introduce it when not using TDD, because there is nothing to tell us loud enough that it's wrong
  3. We introduce it by abusing mocking frameworks (whose authors usually add support for mocking of method chains, but not with such situations in mind)
  4. We encounter it in legacy code

In any case, you don't want to mock the methods chain. I mean, really really, you don't want to mock it. Or specify it.

What you DO want to do is to refactor it into something like this:

bool wasWrittenInAmericanEnglish 
  = message.WasWrittenIn(Languages.AmericanEnglish);

Here, the code is decoupled from all the information about the encapsulated calls - it doesn't know the structure of message object or the locale format used, or even that the check is made by comparing locale values of any sort. We're safe and sound.

Also, it's very straightforward to mock, even with manual mocks. That's one of the reasons some of the TDD gurus suggest to use manual mocks (at least when getting started) - they provide more of the diagnostic pain that points where the design is going wrong.

And how to avoid running into such smelly chains when writing new code? My favorite technique is to use Need Driven Development, at the same time forgetting about the call chain mocking feature of some of the mocking frameworks.

Ok, that's it for today, in the next installment, we'll take a look at some proper usage of call chains and three classes of such chains: ordered chains, unordered chains and chains with grammar. Stay tuned!


Anonymous said...

bool wasWrittenInAmericanEnglish
= message.WasWrittenIn(Languages.AmericanEnglish);

What is the implementation if WasWrittenIn method?

If it is
return this
.GetLocale() == language

Then we just moved chaining calls to some other place

Grzegorz Gałęzowski said...

Thanks for the comment!

I was hoping somebody asked this. The implementation of WasWrittenIn() might be e.g.:


I know this answer might upset you and the next question you would ask would be: "so all we do is add a delegating method at each level? This is stupid! The chaining is still there, but hidden beneath all these methods".

This however might pass when we realize what we are trying to accomplish and what we are not:
- we are not trying to eliminate dependency of message on frame.
- we are not trying to eliminate dependency of frame on header.
- we are not trying to eliminate dependency of header on locale.
- we ARE trying to hide all the dependencies from the outside world (which is using message), so that this outside world cannot couple to them and the order they are structured in, so that we can change all of this more freely.

As the most basic step, even "just moving the chain" as you suggested might do the job. While it might sound silly at first, it will let us in e.g. 30 places where we need to check the language of the message, use:


instead of:

.GetLocale() == language

Now let's imagine that the message structure changes so that it has more than one header. If we didn't "just move the chain", all these 30 places would need to be updated to e.g.

.GetLocale() == language

"Just moving the chain" drops down the number of places we need to change from 30 to 1.

Of course, in such case we're still left with the rest of the chain:

return this
.GetLocale() == language

which may cause the same kind of problem which we can solve by applying the same technique again.

So, what we're trying to achieve is not eliminating the dependencies, but reduce coupling.

Does that answer your concern?

Vladimir said...

Hello, thanks for answer.

Yes and no actually. You have told about removing duplication - always a good thing, but irrelevant to chaining calls.

More, by implementing WasWrittenIn, we introduce dependency from header and locale to message. Before it, message knew only about WrappedFrame.

So, the question is - even if we move that logic to separate method, and, because it is still some logic, it is good to have tests for it. And when I started to write test for similar code, I felt quite silly - I created mocks that return mocks and check, that they are called. I am not actually sure, what I am testing here.

So, could you answer please, if you have function bool Message.WasWrittenIn(Language land), and that possibly will be implemented as return this
.GetLocale() == language, what tests do you write for it, and what do you want to test here?


Vladimir said...

Let me comment about this code:
as implementation of Message.WasWrittenIn.

I suppose, further code looks like
bool Frame.IsMarkedAsSentIn(Language language) { return GetHeader().IsMarkedAsSentIn(language); }
bool Header.IsMarkedAsSentIn(Language language) { return GetLocale() == language; }

These methods now don't know about call chain, but they all know about language, and pass it to each other.

Lets imagine, that GetLocale changed, and now we need an extra parameter for it, for example user.

Now code will look like this:
bool Message.WasWrittenIn(Language language, User user) { return GetWrappedFrame().IsMarkedAsSentIn(language, user); }
bool Frame.IsMarkedAsSentIn(Language language, User user) { return GetHeader().IsMarkedAsSentIn(language, user); }
bool Header.IsMarkedAsSentIn(Language language, User user) { return GetLocale(user) == language; }

So we had to add new parameter to this call chain, even only GetLocale changed.

So I think, this also not a good solution.

But here is a question - what is a good solution then?

Grzegorz Gałęzowski said...

Vladimir, Here is a comment for your first reply:

"Yes and no actually. You have told about removing duplication - always a good thing, but irrelevant to chaining calls"

Actually, my point was that having a chain visible in a single place might be a sign of excessive coupling to intermediary objects. Excessive coupling is usually a marker of a missing abstraction. Missing abstraction usually leads to redundancy. I don't know what you tried to say by "irrelevant to chaining calls", but it's perfectly relevant to the context I was discussing chaining in. And remember I was talking about a chain occurring in a single place, not about the fact that dependencies between elements of the chain exist.

I think that the example I gave you with my previous answer was a little faulty, because even if you had such duplication as I indicated, the compiler would tell you this. A better kinds of change might be:
a) some of the intermediary object might sometimes be null
b) The passed locale would have to be compared with two values instead of one (e.g. the frame now holds "EN" in a separate field and "us" in a separate one)

As for the question on what I would mock:

Let us say that it only makes sense to use frame, header and locale within an abstraction of message (i.e. header is always considered as part of message and is not used on its own). If this was so, then I'd probably not mock frame and header, but just use real data. The "chain" would be known only internally to the Message object, so it would give me a lot of flexibility to change my mind in the future if e.g. I needed to pass some internal data to a strategy at some point.

If Frame and Header were not just data partitions but objects that should be able to stand for themselves, I'd probably take the route that you described in the second comment and in tests for frame, I would stub header. In tests of header, I would use the real locale, because it is a number and header is directly responsible for managing this part of data (which of course might change as the design evolves, e.g. locale may be a very rich abstraction in some systems).

Grzegorz Gałęzowski said...

As for your second comment:

I'd change the implementation of:
bool Header.IsMarkedAsSentIn(Language language, User user) { return GetLocale(user) == language; }


bool Header.IsMarkedAsSentIn(Language language, User user) { return language.Is(this.GetLocale()); }

Having an abstraction allows us to decouple further. Let's assume that Language is a class.

If so, and if really a "user" is a concept tied to language checking (not likely in real world, but that doesn't matter so much), then we may actually construct language like this:

language = Language.AmericanEnglishFor(user);

and still pass a single parameter through the intermediary methods.

Of course, this has nothing to do with the chain itself - having everything based on getters and introducing the Language abstraction alone would let us use similar solution.

And of course, we have introduced coupling on language, but:
1. Message.WasWrittenIn() is only coupled to the Language and Frame.
2. Frame.IsMarkedAsSentIn() is only coupled to the Language and Header.
3. Header.IsMarkedAsSentIn() is only coupled to Language and Locale
4. The outside world is coupled to Language and Message

Note that no entity has intermediary dependencies - it knows only what it needs to know. This allows us to e.g.
a) Make outside world work with two different subclasses of frame, one of which has structure of


and the second having:


b) we can reuse the entities better. E.g. a different frame might have the same structure, but only the locale may contain multiple possible locales if it is a multi-language message. Still, we are able to reuse the Message class, and the Header class, but change the locale to an object of another class thanks to polymorphism.

Which kind of coupling is better? Ha, tough question! For sure, incidental coupling is always bad (and navigating intermediary objects only to get to the last one is an example of this).

One last thing to note is that I could as well solve this another way:

string Message.GetLocale(){ return GetFrame().GetLocale(); }
string Frame.GetLocale(){ return GetHeader().GetLocale(); }
string Header.GetLocale() { return this.locale; }

and I would also avoid the coupling to the whole chain. My experience is that returning values often leads to the feature envy smell, but in a concrete, real-life scenario, the choice between the two alternatives is, of course, dependent on the forces in the current problem domain (e.g. what's stable and what's variable in the design or what's under your control and what's not etc.).

Grzegorz Gałęzowski said...

ugh, one mistake...

"I would stub header. In tests of header, I would use the real locale, because it is a number " - it is a strong, not a number, sorry.