Sunday 23 June 2013

How to start with a test, not implementation - part 1

(this post is adapted from my work-in-progress open source TDD tutorial)

Note that I use "Statement" instead of "test" and "Specification" instead of "Test Suite" in this post

Part 2 of this series is ready!

Whenever I sat down with a person that was about to write their first code in a Statement-first manner, the person would first stare at the screen, then at me, then would say: "what now?". It's easy to say: "You know how to write code, you know how to write a unit test for it, just this time start with the latter rather than the first", but for many people, this is something that blocks them completely. If you're one of them, don't worry - you're not alone. I decided to dedicate this series of posts solely to techniques for starting to write a Statement when there's no code. I do not use mocks in this series on purpose, to keep it simple. However, all of these techniques are proven to work when using mocks.

Start with a good name

It may sound obvious, but really some people are having trouble describing the behavior they expect from their code. If you can name such behavior, it's a great starting point.

I know some people don't pay attention to naming their Statements, mainly because they're considered as tests and second-level citizens - as long as they run and "prove the code does not contain defects", they're considered sufficient. We'll take a look at some examples of bad names and then I'd like to introduce to you some rules of good naming.

Consequences of bad naming

As I said, many people don't really care how their Statements are named. This is a symptom of treating the Specification as garbage or leftovers - something that just "runs through your code". Such situation is dangerous, because as soon as this kind of thinking is established, it leads to bad, unmaintainable Specification that looks more like lumps of accidental code put together in a haste than a living documentation. Imagine that your Specification consists of names like this:

  • TrySendPacket()
  • TrySendPacket2()
  • testSendingManyPackets()
  • testWrongPacketOrder1()
  • testWrongPacketOrder2()

and try for yourself how difficult it is to answer the following questions:

  1. How do you know what situation each Statement describes?
  2. How do you know whether the Statement describes a single situation, or few of them at the same time?
  3. How do you know whether the assertions inside those Statements are really the right ones assuming each Statement was written by someone else or was written a long time ago?
  4. How do you know whether the Statement should stay or be removed when you modify the functionality it specifies?
  5. If your changes in production code make a Statement evaluate to false, how do you know whether the Statement is no longer correct or the production code is wrong?
  6. How do you know whether you will not introduce a duplicate Statement for a behavior that's already specified by another Statement when adding to Specification originally created by another team member?
  7. How do you estimate, by looking at the runner tool report, whether the fix for failing Statement will be easy or not?
  8. How do you answer a new developer in your team when they ask you "what is this Statement for?"
  9. How can you keep track of the Statements already made about the specified class vs those that still need to be made?

What's in a good name?

For the name of the Statement to be of any use, it has to describe the expected behavior. At minimum, it should describe what happens at what circumstances. Let's take a look at one of the names Steve freeman and Nat Pryce came up in their great book Growing Object Oriented Software Guided By Tests:

notifiesListenersThatServerIsUnavailableWhenCannotConnectToItsMonitoringPort()

Note few things about the name of the Statement:

  1. It describes a behavior of an instance of a specified class. Note that it does not contain method name, because what is specified is not a method, but a behavior that has its entry point and expected result. The name simply tells that what an instance does (notifies listeners that server is unavailable) under certain circumstances (when cannot connect to its monitoring port). It's important because such description is what you can derive from thinking about responsibilities of a class, so you don't need to know any of its methods signatures or the code that's inside of the class. Hence, this is something you can come up without before implementing - you just need to know why you created this class and feed on it.
  2. The name is long. Really, really, really don't worry about it. As long as you're describing a single behavior, it's alright. I know usually people are hesitant to give long names to the Statements, because they try to apply the same rules to those names as method names in production code. Let me make it clear - these two cases are different. In case of Statements, they're not invoked by anyone besides the automatic runner applications, so they won't obfuscate any code that would need to call them. Sure, we could put all the information in the comment instead of Statement name and leave the name short, like this:

    [Fact]
    //Notifies listeners that server 
    //is unavailable when cannot connect
    //to its monitoring port
    public void Statement_002()
    {
     //...
    }
    

    There are two downsides of this solution: one is that we now have to put extra information (Statement_002) which is required only by compiler, because every method needs to have a name - there's usually no value a human could derive from such a name. The second downside is that when the Statement is evaluated to false, the automated runner shows you the following line: Statement_002: FAILED - note that all the information included in the comment isn't present in the failure report. It's really better to receive a report such as: notifiesListenersThatServerIsUnavailableWhenCannotConnectToItsMonitoringPort: FAILED, because all the information about the Statement that fails is present in the runner window.

  3. Using a name that describes a (single) behavior allows you to track quickly why the Statement is false when it is. Suppose a Statement is true when you start refactoring, but in the meantime it turns out to be false and the report in the runner looks like this: TrySendingHttpRequest: FAILED - it doesn't really tell you anything more than an attempt is made to send a HTTP request, but, for instance, does not tell you whether your specified object is the sender (that should try to send this request under some circumstances) or the receiver (that should handle such request properly). To actually know what went wrong, you have to go to the code to scan its source code. Now compare it to the following name: ShouldRespondWithAnAckWheneverItReceivedAHttpRequest. Now when it evaluates to false, you can tell that what is broken is that the object no longer responds with an ACK to HTTP request. Sometimes this is enough to deduct which part of the code is in fault of this evaluation failure.

My favourite convention

There are many conventions for naming Statements appropriately. My favorite is the one developed by Dan North, which makes each Statement name begin with the word Should. So for example, I'd name a Statement: ShouldReportAllErrorsSortedAlphabeticallyWhenItEncountersErrorsDuringSearch(). The name of the Specification (i.e. class name) answers the question "who should do it?", i.e. when I have a class named SortingOperation and want to say that it "should sort all items in ascending order when performed", I say it like this:

public class SortingOperationSpecification
{
 [Fact] public void
 ShouldSortAllItemsInAscendingOrderWhenPerformed()
 {
 }
}

It's important to focus on what result is expected from an object when writing names along this convention. If you don't, you quickly find it troublesome. As an example, one of my colleague was specifying a class UserId and wrote the following name for the Statement about comparison of two identifiers: EqualOperationShouldPassForTwoInstancesWithTheSameUserName(). Note that this is not from the perspective of a single object, but rather from the perspective of an operation that's executed on it, which means that we stopped thinking in terms of object responsibilities and started thinking in terms of operation correctness, which is farther away from our assumption that we're writing a Specification consisting o Statements. This name should be changed to: ShouldReportThatItIsEqualToAnotherObjectWhenItHasTheSameUserName().

And this is the end of part 1 of the series. In the next installment, we'll take a look at another technique - brute force translation of prose description written with GIVEN-WHEN-THEN structure to code.

No comments: