Making Intent Clear / Derived Values [Automated Testing Series]

This post is part of my ongoing Automated Testing blog series.

Making Intent Clear

I think one of the most important things when writing tests (apart from consistency) is that they are clear in intent. If you buy into the notion that tests form part of the documentation of your system then it’s really important, like all good documentation, that the tests are both readable and understandable.

I think there are a number of techniques that can help with this in various situations and there are three in particular that I will be covering in this sub-section of the blog series. I have already covered test naming and I think that has a big impact on clarity of intent.

Derived Values

There are a number of excellent blog posts by Mark Seemann (@ploeh) in his zero-friction TDD series that I have found useful in my ongoing research and one in particular that really resonated with me was the concept of derived values.

Consider the following code:

using System.Linq;
using NUnit.Framework;

public static class StringExtensions
{
    public static string ReverseString(this string str)
    {
        return string.Join("", str.Reverse().ToArray());
    }
}

public class NaiveTest
{
    [Test]
    public void GivenAString_WhenInverting_ThenReversedStringWillBeReturned()
    {
        const string str = "a string";
        var result = str.ReverseString();
        Assert.That(result, Is.EqualTo("gnirts a"));
    }
}

public class DerivedValueTest
{
    [Test]
    public void GivenAString_WhenInverting_ThenReversedStringWillBeReturned()
    {
        const string str = "a string";
        var expectedResult = new string(str.Reverse().ToArray());

        var result = str.ReverseString();

        Assert.That(result, Is.EqualTo(expectedResult));
    }
}

public class DataDrivenTest
{
    [Test]
    [TestCase("", "")]
    [TestCase("a", "a")]
    [TestCase("ab", "ba")]
    [TestCase("longer", "regnol")]
    [TestCase("a string with space", "ecaps htiw gnirts a")]
    [TestCase("num3rics&punctua10n!@$", "$@!n01autcnup&scir3mun")]
    public void GivenAString_WhenInverting_ThenReversedStringWillBeReturned(string input, string expectedResult)
    {
        var result = input.ReverseString();

        Assert.That(result, Is.EqualTo(expectedResult));
    }
}

This is a fairly contrived example, but it helps illustrate a few things:

  • The NaiveTest is hard to understand at a glance – you can eventually work out the relationship between the input and the output from the name of the test in combination with common sense, but it’s not easy and thus I think it’s not a great test (it still follows a clear AAA structure so it’s certainly not awful).
  • The DerivedValueTest is what Mark was describing – this is much better because the relationship between the input and result is very clear in the first two lines of the test and you immediately know a) what is being tested and b) how it should work.
    • Of note is that the implementation is the same as the real implementation – this could be a problem if the developer decides to simply copy the implementation into the test or vice versa
      • Interestingly, by writing the test using proper TDD it wouldn’t matter as much that the test’s implementation was similar to the real implementation, because in writing the test you would see it fail in the “Red” step and at that time verify that the string being asserted in the test output was in fact the correct reversed string
      • The fact you are then relying on the developer verifying that the result being asserted was correct at the time of writing the test reminds me somewhat of the notion of approval tests (which I find myself using a fair bit to perform complex assertions that can’t easily be expressed in code, but can be easily reasoned “by eye”)
      • It occurs to me that if you were only testing a subset of some complex functionality, you would only need to include a subset of the implementation in the test
      • If you have a team that isn’t disciplined in writing their tests in a TDD fashion (or at least verifying the test is definitely correct) then this approach might make it easier to introduce incorrect tests that are a copy of the implementation and don’t actually test anything (hopefully code review would pick this up though)
    • Where possible you could try and include an alternate implementation of the code under test in your test (with a focus on the implementation being readable and understandable), but even in this case I still think the “Red” step mentioned above is important to make sure you didn’t have an error in your alternative implementation
  • The DataDrivenTest in my opinion is the better test in this case, not just because it provides better code coverage by trying multiple values (since this could easily have been done for the derived value test as well), but also because:
    • The relationship between input and output is made clear by their proximity and the fact that there are simple examples as well as more complex ones (the simple ones help the viewer immediately grok the relationship)
      • I feel that the “proximity” part is the most important bit here (assuming that you can grok the relationship)
      • I think the proximity in the DerivedValueTest is an important factor as well to help with immediate understanding
    • I suspect the edge cases in the above example could go into their own test so that the test name can more clearly reflect the edge case being tested
    • This approach won’t work for all situations – sometimes the logic being tested is complex enough that having the input and expected result side-by-side still won’t allow the reader to glean understanding about the relationship and it’s important to show how the expected result is derived
    • Be pragmatic – use the right approach in the right situation – derived values is sometimes useful and sometimes showing a series of {input -> expected result} is clearer – I’d say the main thing to be wary of is tests that simply have a value in the final assert and it’s not clear how that value was derived
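
As a rough sketch of the “alternate implementation” idea above (this is my own illustration, not something from Mark’s series), the expected value can be derived with a deliberately plain loop rather than the LINQ call that the real code uses. The ReverseString extension is repeated so the snippet stands alone, and the test class name is made up for the example.

```csharp
using System.Linq;
using System.Text;
using NUnit.Framework;

public static class StringExtensions
{
    // Same implementation as earlier in the post, repeated so the sketch compiles on its own.
    public static string ReverseString(this string str)
    {
        return string.Join("", str.Reverse().ToArray());
    }
}

public class AlternateImplementationTest
{
    [Test]
    public void GivenAString_WhenInverting_ThenReversedStringWillBeReturned()
    {
        const string str = "a string";
        // Derive the expected value by walking the input backwards,
        // instead of reusing the LINQ call from the code under test.
        var expected = new StringBuilder();
        for (var i = str.Length - 1; i >= 0; i--)
            expected.Append(str[i]);

        var result = str.ReverseString();

        Assert.That(result, Is.EqualTo(expected.ToString()));
    }
}
```

The loop is longer than the one-liner, but it doesn’t share code with the implementation, so a bug in one is less likely to be mirrored in the other. Seeing the test fail in the “Red” step is still worthwhile.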

There is a slight variation to the DataDrivenTest above that I sometimes come across that is also worth mentioning – complex example generation. It’s a strategy to avoid the situation described above where showing the derived value involves duplicating the implementation logic, in situations where it’s really complex to work out that logic but easier to come up with an example of the logic in action. I often find myself using this technique for date logic – writing the date logic as part of the test never gives me a lot of confidence since it’s so darn complex to figure out (I hate programming date/time logic). In these situations I like to pull up a calendar and pick some candidate examples for the logic I’m trying to implement.

A couple of examples are shown below, pulled from a codebase I work on (with some tweaks to generalise the second test so it’s non-identifying):

    // From http://www.timestampgenerator.com/1352031606/#result
    [TestCase("2012-11-04 12:20:06", 1352031606)]
    [TestCase("2012-11-03 23:59:59", 1351987199)]
    [TestCase("2012-02-29 13:00:01", 1330520401)]
    public void GivenDate_WhenConvertingToUnixTimestamp_ItShouldBeCorrect(string inputDate, int expectedTimestamp)
    {
        var date = DateTime.Parse(inputDate);

        var timestamp = date.ToUnixTimestamp();

        Assert.That(timestamp, Is.EqualTo(expectedTimestamp));
    }

    /* unix $ cal 8 2004
     *      August 2004
       Su Mo Tu We Th Fr Sa
        1  2  3  4  5  6  7
        8  9 10 11 12 13 14
       15 16 17 18 19 20 21
       22 23 24 25 26 27 28
       29 30 31
     */
    [Test]
    // Day before date during weekend
    [TestCase("2004-08-09", "2004-08-06", true)]
    // Day before date during week
    [TestCase("2004-08-10", "2004-08-09", true)]
    // Consecutive business days
    [TestCase("2004-08-09", "2004-08-09", true)]
    [TestCase("2004-08-09", "2004-08-10", true)]
    [TestCase("2004-08-09", "2004-08-11", false)]
    [TestCase("2004-08-09", "2004-08-12", false)]
    // Include Weekend
    [TestCase("2004-08-12", "2004-08-13", true)]
    [TestCase("2004-08-12", "2004-08-14", true)]
    [TestCase("2004-08-12", "2004-08-15", true)]
    [TestCase("2004-08-12", "2004-08-16", false)]
    // Start on Weekend
    [TestCase("2004-08-13", "2004-08-16", true)]
    [TestCase("2004-08-13", "2004-08-17", false)]
    public void WhenValidatingConnectionDate_ThenThereShouldBeAnErrorOnlyIfTheDateIsLessThan2BusinessDaysAway(string now, string date, bool expectError)
    {
        _model.ConnectionDate = DateTime.Parse(date);
        var dateTimeProvider = DateTimeProviderFactory.Create(DateTime.Parse(now));
        var modelState = new ModelStateDictionary();

        _model.Validate(modelState, dateTimeProvider);

        if (expectError)
            Assert.That(modelState[modelStateKey].Errors, Has.Count.GreaterThan(0));
        else
            Assert.That(modelState.ContainsKey(modelStateKey), Is.False);
    }

(Props to my colleague Toby Moore for coming up with the idea of using the Unix cal command to generate calendars for pasting in comments above the examples.)

In these examples, there is a cognitive load to figure out the relationship between input and expected result, but I don’t think there is a silver bullet in these cases – the test name, multiple examples and the comments above the tests (I think) help anyone maintaining the tests to figure out what is going on. Either way there would be a cognitive load to get your head around the logic since it’s really complex and in this case it’s about trying to minimise that load.
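
The ToUnixTimestamp extension itself isn’t shown above. A minimal sketch that is consistent with the three timestamp examples might look like the following, assuming the input is treated as UTC wall-clock time (which is what the expected values imply). The class name and the choice of int as the return type are my assumptions to match the test signature, not the real codebase.

```csharp
using System;

public static class DateTimeExtensions
{
    private static readonly DateTime UnixEpoch =
        new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    // Assumption: the date is interpreted as UTC; DateTime subtraction
    // ignores the Kind property, so no time zone conversion happens here.
    // Note that an int result overflows for dates after January 2038.
    public static int ToUnixTimestamp(this DateTime date)
    {
        return (int)(date - UnixEpoch).TotalSeconds;
    }
}
```

Even with an implementation this short, I would still rather check it against externally generated examples than re-derive the arithmetic inside the test.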

IDDD Course notes

Last month I completed Vaughn Vernon’s 3-day advanced IDDD Workshop. Here are some notes I’ve since written up about the main points I got out of it after re-reviewing the course material. Some notes from a colleague who attended the same course have also been published.

Getting started with DDD

  • DDD is the formation of a ubiquitous language explicitly bound by a “bounded context” – the context is important because the same word can have different meanings
  • Hexagonal architecture / ports and adapters can be used with DDD – there are adapters from the domain to a port (either input or output – could be datasources or other systems)
  • DDD should be used for complex/unknown applications and problem domains that provide a competitive advantage – it is a technique to help uncover these complexities over time in a sustainable way
    • If CRUD is more appropriate – use CRUD
    • Keep in mind that investing in the code generally pays off over the long term
  • DDD can be used with legacy code, but you need to abstract and encapsulate the parts of the legacy code you aren’t modelling
  • DDD is about bridging the gap between technical people and subject matter experts so finding the subject matter experts (they aren’t necessarily your product owner and there might be multiple ones) is important – where possible interact with them directly for best effect
  • A ubiquitous language should include the terms as well as scenarios that describe the context of those “things”
  • Good general rule: Tell, don’t ask (tell objects what you want to do, don’t ask them for data)
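
To illustrate “tell, don’t ask”, here is a small sketch of my own (the Account class is hypothetical, not from the course material). The caller tells the account to withdraw rather than reading the balance and deciding for itself.

```csharp
using System;

public class Account
{
    public decimal Balance { get; private set; }

    public Account(decimal openingBalance)
    {
        if (openingBalance < 0)
            throw new ArgumentOutOfRangeException(nameof(openingBalance));
        Balance = openingBalance;
    }

    // "Tell": the caller states what it wants and the object enforces its own rules.
    public void Withdraw(decimal amount)
    {
        if (amount <= 0)
            throw new ArgumentOutOfRangeException(nameof(amount));
        if (amount > Balance)
            throw new InvalidOperationException("Insufficient funds.");
        Balance -= amount;
    }
}
```

The “ask” version would have calling code check account.Balance and then set a new value, which spreads the overdraft rule across every caller.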

Domains, subdomains, bounded contexts

  • Subdomains – identify the pieces of information you need for your project to be successful
    • Where possible have a one to one mapping with a bounded context unless there is a small shared kernel
  • Bounded context contains: interfaces (service or UI), app services, database schema, domain itself
  • Example: pull out user and identity into separate domain, then you can model more explicit things that might be mapped from that context e.g. Author, Collaborator (these could even just be a value object with an id that links to the other context)
  • For each domain look at what domains it (core) is linked to and identify them as: supporting or generic

Context maps / relationships between subdomains

  • It can help to draw context maps to illustrate where the subdomains, contexts and the relationships between them are
  • Ways of joining two separate (sub)domains:
    • Partnership
    • Shared kernel
    • Customer-supplier
    • Conformist
  • When communicating with another domain (via OHS, web service, messaging):
    • ACL
    • Published language
    • Event subscription
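
As a rough sketch of an ACL (anti-corruption layer), building on the earlier Collaborator example: a translator is the only place that knows about both the other context’s representation and our own value object. The IdentityUserDto shape and the translator name are assumptions made up for illustration.

```csharp
using System;

// Shape of the data as the other (identity) context publishes it; assumed for this sketch.
public class IdentityUserDto
{
    public string Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Concept in our own context: a value object holding an id that links
// back to the other context, plus only the data we actually need.
public sealed class Collaborator
{
    public string IdentityId { get; }
    public string DisplayName { get; }

    public Collaborator(string identityId, string displayName)
    {
        IdentityId = identityId;
        DisplayName = displayName;
    }
}

// Anti-corruption layer: translates the foreign model into our language.
public static class CollaboratorTranslator
{
    public static Collaborator FromIdentityUser(IdentityUserDto user)
    {
        if (user == null) throw new ArgumentNullException(nameof(user));
        var name = (user.FirstName + " " + user.LastName).Trim();
        return new Collaborator(user.Id, name);
    }
}
```

If the identity context changes its representation, only the translator should need to change.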

Architecture

  • Ports and adapters, {infra -> UI -> application -> domain}, CQRS/ES, event-driven are some of the architectural options
    • Ports don’t have to model the domain (e.g. a UserInRole REST service even though there is no UserInRole domain object)
    • Adapters can map to/from domain representation e.g. UserInRoleAdapter
    • Ports and adapters allow you to focus on the domain, delay infrastructure concerns and do separate testing
  • Domain services shouldn’t control security (where it’s a cross-cutting concern as opposed to something being modelled) or transactions
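
A minimal sketch of the UserInRole idea, under my own assumptions about what the types might look like (only the UserInRoleAdapter name comes from the notes). The domain has users and roles, while the port exposes a UserInRole representation that the adapter builds.

```csharp
using System.Collections.Generic;

// Domain side: there is no UserInRole concept here, only users and their roles.
public class User
{
    private readonly HashSet<string> _roles = new HashSet<string>();

    public string Username { get; }

    public User(string username) { Username = username; }

    public void AssignRole(string role) { _roles.Add(role); }

    public bool IsInRole(string role) { return _roles.Contains(role); }
}

// Representation exposed by the port, e.g. serialised by a REST endpoint (hypothetical).
public class UserInRoleRepresentation
{
    public string Username { get; set; }
    public string Role { get; set; }
}

// Adapter: maps from the domain model to what the port exposes.
public static class UserInRoleAdapter
{
    // Returns null when the user is not in the role; a REST layer could turn that into a 404.
    public static UserInRoleRepresentation ToRepresentation(User user, string role)
    {
        if (!user.IsInRole(role)) return null;
        return new UserInRoleRepresentation { Username = user.Username, Role = role };
    }
}
```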

Entities / Value objects

  • Use entities to model things whose individuality you care about
    • Equality via id
    • Avoid just exposing setters – if you call two setters in a row then you are probably missing a behaviour – ask yourself: if you missed one of the sets, would you be in an inconsistent state?
    • Ensure consistency by using invariants and non-default constructors
  • Use value objects where possible to model things that describe, measure, qualify or quantify the ubiquitous language – should model a conceptual whole
    • Generally immutable, set state via ctor – makes it easy to test and maintain
      • Modification methods should return a new instance
    • Equality by comparing properties
    • Discardable, replaceable and interchangeable
    • If you want you can use value objects to represent the id of a particular entity (e.g. wrap guid)
  • Generating identities
    • User provides
    • Application generates
    • Persistence store generates
    • Another bounded context provides
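
To make the entity / value object distinction concrete, here is a sketch with made-up Money, CustomerId and Customer types (assumptions for illustration only). The value objects are immutable and compared by their properties; the entity is compared by its id and guards its invariants.

```csharp
using System;

// Value object: immutable, compared by its properties.
public sealed class Money : IEquatable<Money>
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        if (string.IsNullOrWhiteSpace(currency))
            throw new ArgumentException("Currency is required.", nameof(currency));
        Amount = amount;
        Currency = currency;
    }

    // A "modification" returns a new instance rather than changing this one.
    public Money Add(Money other)
    {
        if (other.Currency != Currency)
            throw new InvalidOperationException("Currency mismatch.");
        return new Money(Amount + other.Amount, Currency);
    }

    public bool Equals(Money other)
    {
        return other != null && Amount == other.Amount && Currency == other.Currency;
    }

    public override bool Equals(object obj) { return Equals(obj as Money); }
    public override int GetHashCode() { return Amount.GetHashCode() ^ Currency.GetHashCode(); }
}

// Value object wrapping a Guid to act as an entity's identity.
public sealed class CustomerId : IEquatable<CustomerId>
{
    public Guid Value { get; }
    public CustomerId(Guid value) { Value = value; }
    public bool Equals(CustomerId other) { return other != null && Value == other.Value; }
    public override bool Equals(object obj) { return Equals(obj as CustomerId); }
    public override int GetHashCode() { return Value.GetHashCode(); }
}

// Entity: equality is by id, and the non-default constructor enforces invariants.
public class Customer
{
    public CustomerId Id { get; }
    public string Name { get; private set; }

    public Customer(CustomerId id, string name)
    {
        if (id == null) throw new ArgumentNullException(nameof(id));
        Id = id;
        Rename(name);
    }

    public void Rename(string name)
    {
        if (string.IsNullOrWhiteSpace(name))
            throw new ArgumentException("Name is required.", nameof(name));
        Name = name;
    }

    public override bool Equals(object obj)
    {
        var other = obj as Customer;
        return other != null && Id.Equals(other.Id);
    }

    public override int GetHashCode() { return Id.GetHashCode(); }
}
```

Here the application generates the identity (for example via Guid.NewGuid()); the other identity strategies listed above would change where the CustomerId comes from, not how the entity uses it.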

Aggregates

  • Main purpose: determine transactional boundaries for consistency
  • It’s a balance: a large aggregate graph gives navigational convenience, but can result in a higher likelihood of transactional errors that don’t affect consistency, as well as negative size and performance impacts; small aggregates might mean a lot of messing about in application code to wrangle multiple aggregates
  • Keep in mind eventual consistency – do multiple aggregates have to be kept consistent?
    • event processing / batch processing etc. – get business to specify the max time to be consistent
  • Where possible reference between aggregates by id – less memory, faster to load, better garbage collection
  • Can use double dispatch from an application service to pass instances between aggregates to perform actions
  • Domain events are non-async
    • This means you can respond to a domain event within the same transaction
    • You can have a listener that then publishes the events to a bus (and from there they can be handled async)
    • Whose job is it to make the data consistent?
      • If it’s the end user then use transactional consistency; if it’s another user or the system then eventual consistency should be fine
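
A small sketch of two of the points above: referencing another aggregate by id, and raising a domain event that is handled synchronously. The static DomainEvents publisher is a simplification I’ve assumed for illustration (it is not the workshop’s implementation and isn’t thread-safe), and the Order / OrderPlaced names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public class OrderPlaced
{
    public Guid OrderId { get; }
    public Guid CustomerId { get; }

    public OrderPlaced(Guid orderId, Guid customerId)
    {
        OrderId = orderId;
        CustomerId = customerId;
    }
}

// Minimal synchronous publisher: handlers run on the caller's thread,
// so they can take part in the same transaction as the aggregate change.
public static class DomainEvents
{
    private static readonly List<Action<object>> Handlers = new List<Action<object>>();

    public static void Subscribe<T>(Action<T> handler) where T : class
    {
        Handlers.Add(e => { var typed = e as T; if (typed != null) handler(typed); });
    }

    public static void Raise(object domainEvent)
    {
        foreach (var handler in Handlers) handler(domainEvent);
    }

    public static void Reset() { Handlers.Clear(); }
}

public class Order
{
    public Guid Id { get; }

    // Reference to the customer aggregate by id rather than by object reference.
    public Guid CustomerId { get; }

    public bool IsPlaced { get; private set; }

    public Order(Guid id, Guid customerId)
    {
        Id = id;
        CustomerId = customerId;
    }

    public void Place()
    {
        if (IsPlaced) throw new InvalidOperationException("Order already placed.");
        IsPlaced = true;
        DomainEvents.Raise(new OrderPlaced(Id, CustomerId));
    }
}
```

A subscriber registered with DomainEvents.Subscribe<OrderPlaced>(...) could simply forward the event to a bus, and from there other aggregates can be brought up to date asynchronously.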

Domain Services

  • A domain service is a part of the domain model that is a non-transactional, lightweight stateless operation that doesn’t have a natural home in an entity
    • If you find yourself using static methods in your domain entities then it’s a good indicator that you need a domain service
    • They can be passed into models and used via double dispatch or the model can be passed into them
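
One way to picture double dispatch with a domain service, using hypothetical Invoice and ITaxPolicy types that I’ve invented for the example: the application service passes the domain service into the entity, and the entity passes itself to the service.

```csharp
using System;

// Domain service: a stateless operation that has no natural home on the entity.
public interface ITaxPolicy
{
    decimal TaxFor(Invoice invoice);
}

public class FlatRateTaxPolicy : ITaxPolicy
{
    private readonly decimal _rate;

    public FlatRateTaxPolicy(decimal rate) { _rate = rate; }

    public decimal TaxFor(Invoice invoice) { return invoice.NetAmount * _rate; }
}

public class Invoice
{
    public decimal NetAmount { get; }
    public decimal TaxAmount { get; private set; }

    public decimal Total { get { return NetAmount + TaxAmount; } }

    public Invoice(decimal netAmount)
    {
        if (netAmount < 0) throw new ArgumentOutOfRangeException(nameof(netAmount));
        NetAmount = netAmount;
    }

    // Double dispatch: the caller hands the service to the entity,
    // and the entity hands itself to the service.
    public void ApplyTax(ITaxPolicy policy)
    {
        if (policy == null) throw new ArgumentNullException(nameof(policy));
        TaxAmount = policy.TaxFor(this);
    }
}
```

An application service would call invoice.ApplyTax(new FlatRateTaxPolicy(0.1m)), which keeps the calculation out of a static method on the entity.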

Domain Events

  • Domain events inform subscribers of the facts about past happenings in a bounded context
  • When domain experts use triggering words then it’s a good sign you might need a domain event:
    • When
    • If that happens
    • Inform me if
    • Notify me if
    • An occurrence of
  • You can persist events along with state (or just the events if using event sourcing)
  • Events can be published outside of the bounded context or processed asynchronously by forwarding them to a message exchange
    • Bus
    • Decoupled publishing e.g. atom feed
  • Event-driven modelling exercise – Trello is a good example
    • Good example to do with a domain expert as a first step to fleshing out a domain
    • Leads to a good model if you plan on using event sourcing
    • 1. Model the events in time order (verb, past tense)
    • 2. Model the commands that would create the events (imperative exhortation)
    • 3. Model the aggregates that would participate in the command/event (noun)
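
The three modelling steps above could be sketched as follows for a Trello-style board. The CardMoved, MoveCard and Card names are my own guesses at what such an exercise might produce, not output from the workshop.

```csharp
using System;
using System.Collections.Generic;

// 1. Event: a verb in the past tense.
public class CardMoved
{
    public Guid CardId { get; }
    public string FromList { get; }
    public string ToList { get; }

    public CardMoved(Guid cardId, string fromList, string toList)
    {
        CardId = cardId;
        FromList = fromList;
        ToList = toList;
    }
}

// 2. Command: an imperative that would cause the event.
public class MoveCard
{
    public Guid CardId { get; }
    public string ToList { get; }

    public MoveCard(Guid cardId, string toList)
    {
        CardId = cardId;
        ToList = toList;
    }
}

// 3. Aggregate: the noun that handles the command and records the event.
public class Card
{
    private readonly List<object> _uncommittedEvents = new List<object>();

    public Guid Id { get; }
    public string CurrentList { get; private set; }
    public IReadOnlyList<object> UncommittedEvents { get { return _uncommittedEvents; } }

    public Card(Guid id, string currentList)
    {
        Id = id;
        CurrentList = currentList;
    }

    public void Handle(MoveCard command)
    {
        if (command.CardId != Id)
            throw new ArgumentException("Command is for a different card.", nameof(command));
        if (command.ToList == CurrentList) return; // nothing happened, so no event

        var moved = new CardMoved(Id, CurrentList, command.ToList);
        CurrentList = moved.ToList;
        _uncommittedEvents.Add(moved);
    }
}
```

With event sourcing, the uncommitted events would be what gets persisted; otherwise they can be stored alongside the state or published.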

Modules (namespaces)

  • Name the modules as per the ubiquitous language (e.g. aggregates or concepts to group an aggregate)
  • Can have child namespaces e.g. concept -> aggregate
  • Don’t create modules based on type of component (e.g. include services with the entities rather than in a separate services module)
  • Try not to couple between modules and where there are dependencies try and use acyclic graphs

Factories

  • Where possible construct objects inside of domain services and entities so the ubiquitous language is expressed
  • If you have all the parameters and the construction is simple it’s ok to new up an object in an app service
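
As an example of constructing objects inside an entity so that the ubiquitous language shows through, here is a sketch with hypothetical Forum and Discussion types (names chosen for illustration). The factory method reads the way a domain expert might put it: “start a discussion in a forum”.

```csharp
using System;

public class Discussion
{
    public Guid Id { get; }
    public Guid ForumId { get; }
    public string Subject { get; }

    // Internal so that code outside the domain assembly goes through the factory method.
    internal Discussion(Guid id, Guid forumId, string subject)
    {
        Id = id;
        ForumId = forumId;
        Subject = subject;
    }
}

public class Forum
{
    public Guid Id { get; }
    public bool IsClosed { get; private set; }

    public Forum(Guid id) { Id = id; }

    public void Close() { IsClosed = true; }

    // Factory method: creation is named after the behaviour and can enforce the forum's rules.
    public Discussion StartDiscussion(string subject)
    {
        if (IsClosed)
            throw new InvalidOperationException("Forum is closed.");
        if (string.IsNullOrWhiteSpace(subject))
            throw new ArgumentException("Subject is required.", nameof(subject));
        return new Discussion(Guid.NewGuid(), Id, subject);
    }
}
```

An application service then calls forum.StartDiscussion("...") instead of newing up a Discussion and having to remember the closed-forum rule itself.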