Unit, integration, subcutaneous, UI, fast, slow, mocks, TDD, isolation and scams… What is this? I don’t even!

As outlined in the first post of my Automated Testing blog series I’ve been on a journey of self-reflection and discovery about how best to write, structure and maintain automated tests.

The most confusing and profound realisations that I’ve had relate to how best to cover a codebase in tests and what type and speed those tests should be. The sorts of questions and considerations that come to mind about this are:

  • Should I be writing unit, subcutaneous, integration, etc. tests to cover a particular piece of code?
  • What is a unit test anyway? Everyone seems to have a different definition!
  • How do I get feedback as fast as possible – reducing feedback loops is incredibly important.
  • How much time/effort are we prepared to spend testing our software and what level of coverage do we need in return?
  • How do I keep my tests maintainable and how do I reduce the number of tests that break when I need to make a change to the codebase?
  • How do I make sure that my tests give me the maximum confidence that when the code is shipped to production it will work?
  • When should I be mocking the database, filesystem, etc.?
  • How do I ensure that my application is tested consistently?

In order to answer these questions and more I’ve watched a range of videos and read a number of blog posts from prominent people, spent time experimenting and reflecting on the techniques via the projects I work on (both professionally and with my Open Source Software work) and tried to draw my own conclusions.

There are some notable videos that I’ve come across that, in particular, have helped me with my learning and realisations so I’ve created a series of posts around them (and might add to it over time if I find other posts). I’ve tried to summarise the main points I found interesting from the material as well as injecting my own thoughts and experience where relevant.

There is a great talk by Gary Bernhardt called Boundaries. For completeness, it is worth looking at in relation to the topics discussed in the above articles. I don’t have much to say about this yet (I’m still getting my head around where it fits in) apart from the fact that code that maps input(s) to output(s) without side effects is obviously very easy to test, and I’ve found that where I have used immutable value objects in my domain model it has made testing easier.
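
To illustrate, here is my own minimal sketch (not code from the talk; NUnit-style assertion assumed): a pure function needs no mocks, no setup and no teardown – you just call it and assert on the return value.

public static class NameFormatter
{
    // A pure function: the output depends only on the inputs and there are no side effects
    public static string ToDisplayName(string firstName, string lastName)
    {
        return (firstName + " " + lastName).Trim();
    }
}

[Test]
public void GivenFirstAndLastName_WhenFormattingDisplayName_ThenBothNamesAppear()
{
    var result = NameFormatter.ToDisplayName("Jane", "Doe");

    Assert.That(result, Is.EqualTo("Jane Doe"));
}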

Summary

I will summarise my current thoughts (this might change over time) by revisiting the questions I posed above:

  • Should I be writing unit, subcutaneous, integration, etc. tests to cover a particular piece of code?
    • Typical consultant answer: it depends. In general I’d say write the fastest possible test you can that gives you the minimum required confidence and bakes in the minimum amount of implementation details.
    • I’ve had a lot of luck covering line-of-business web apps with mostly subcutaneous tests against the MVC controllers (see the sketch after this list), with a smattering of unit tests to check conventions and test really complex logic. I typically see how far I can get without writing UI tests, and when I do write them I test high-value scenarios or complex UIs.
  • What is a unit test anyway? Everyone seems to have a different definition!
  • How do I get feedback as fast as possible – reducing feedback loops is incredibly important.
    • Follow Jimmy’s advice and focus on writing as many tests that are as fast as possible rather than worrying about whether a test is a unit test or integration test.
    • Be pragmatic though – you might get adequate speed, but a higher level of confidence, by integrating your tests with the database for instance (this has worked well for me).
  • How much time/effort are we prepared to spend testing our software and what level of coverage do we need in return?
    • I think it depends on the application – the product owner, users and business in general will all have different tolerances for risk of something going wrong. Do the minimum amount that’s needed to get the amount of confidence that is required.
    • In general I try to follow the mantra of “challenge yourself to start simple then inspect and adapt” (thanks Jess for helping refine that). Start off with the simplest testing approach that will work and if you find you are spending too long writing tests or the tests don’t give you the right confidence then adjust from there.
  • How do I keep my tests maintainable and how do I reduce the number of tests that break when I need to make a change to the codebase?
    • Focus on removing implementation details from tests. Be comfortable testing multiple classes in a single test (use your production DI container!).
    • Structure the tests according to user behaviour – they are less likely to have implementation details and they form better documentation of the system.
  • How do I make sure that my tests give me the maximum confidence that when the code is shipped to production it will work?
    • Reduce the amount of mocking you use to the bare minimum – hopefully just things external to your application so that you are testing production-like code paths.
    • Subcutaneous tests are a very good middle ground between low-level implementation-focused unit tests and slow and brittle UI tests.
  • When should I be mocking the database, filesystem, etc.?
    • When you need the speed and are happy to forgo the lower confidence.
    • Also, if they are external to your application or not completely under your application’s control e.g. a database that is touched by multiple apps and your app doesn’t run migrations on it and control the schema.
  • How do I ensure that my application is tested consistently?
    • Come up with a testing strategy and stick with it. Adjust it over time as you learn new things though.
    • Don’t be afraid to use different styles of test as appropriate – e.g. the bulk of tests might be subcutaneous, but you might decide to write lower level unit tests for complex logic.
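
Here is a rough sketch of the kind of subcutaneous controller test mentioned above; the container, controller and view model names are hypothetical and the details will vary with your stack (NUnit-style assertions assumed).

[Test]
public void GivenExistingInvoice_WhenViewingInvoiceDetails_ThenTheInvoiceIsDisplayed()
{
    // Arrange: resolve the controller from the (production) DI container so the test
    // exercises production-like wiring; only true externals would be faked
    var controller = Container.Resolve<InvoiceController>();
    var invoice = CreateAndSaveTestInvoice();

    // Act: call the action directly – no web server or browser involved
    var result = (ViewResult)controller.Details(invoice.Id);

    // Assert
    var model = (InvoiceDetailsViewModel)result.ViewData.Model;
    Assert.That(model.InvoiceNumber, Is.EqualTo(invoice.InvoiceNumber));
}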

In closing, I wanted to show a couple of quotes that I think are relevant:

Fellow Readifarian, Kahne Raja recently said this on an internal Yammer discussion and I really identify with it:

“We should think about our test projects like we think about our solution projects. They involve complex design patterns and regular refactoring.”

Another Readifarian, Pawel Pabich, made the important point that:

“[The tests you write] depend[s] on the app you are writing. [A] CRUD Web app might require different tests than a calculator.”

I also like this quote from Kent Beck:

“I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence.”

IDDD Course notes

Last month I completed Vaughn Vernon’s 3-day advanced IDDD Workshop. Here are some notes I’ve since written up about the main points I got out of it after re-reviewing the course material. Some notes from a colleague who attended the same course have also been published.

Getting started with DDD

  • DDD is the formation of a ubiquitous language explicitly bound by a “bounded context” – the context is important because the same word can have different meanings
  • Hexagonal architecture / ports and adapters can be used with DDD – there are adapters from the domain to a port (either input or output – could be datasources or other systems)
  • DDD should be used in situations for complex/unknown applications and problem domains that provide a competitive advantage – it is a technique to help uncover these complexities over time in a sustainable way
    • If CRUD is more appropriate – use CRUD
    • Keep in mind that investing in the code generally pays off over the long term
  • DDD can be used with legacy code, but you need to abstract and encapsulate the parts of the legacy code you aren’t modelling
  • DDD is about bridging the gap between technical people and subject matter experts so finding the subject matter experts (they aren’t necessarily your product owner and there might be multiple ones) is important – where possible interact with them directly for best effect
  • A ubiquitous language should include the terms as well as scenarios that describe the context of those “things”
  • Good general rule: Tell, don’t ask (tell objects what you want to do, don’t ask them for data)
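
As a small, made-up illustration of “tell, don’t ask” (the Account type is not from the course):

public class Account
{
    public decimal Balance { get; private set; }

    // Tell: the caller tells the account what it wants; the account protects its own invariant
    public void Withdraw(decimal amount)
    {
        if (amount > Balance)
            throw new InvalidOperationException("Insufficient funds");
        Balance -= amount;
    }
}

// Asking (avoid): if (account.Balance >= amount) { /* mutate the balance from outside */ }
// Telling (prefer): account.Withdraw(amount);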

Domains, subdomains, bounded contexts

  • Subdomains – identify the pieces of information you need for your project to be successful
    • Where possible have a one to one mapping with a bounded context unless there is a small shared kernel
  • Bounded context contains: interfaces (service or UI), app services, database schema, domain itself
  • Example: pull out user and identity into separate domain, then you can model more explicit things that might be mapped from that context e.g. Author, Collaborator (these could even just be a value object with an id that links to the other context)
  • For the core domain, look at what other domains it is linked to and identify each of them as either supporting or generic

Context maps / relationships between subdomains

  • It can help to draw context maps to illustrate where the subdomains, contexts and the relationships between them are
  • Ways of joining two separate (sub)domains:
    • Partnership
    • Shared kernel
    • Customer-supplier
    • Conformist
  • When communicating with another domain (via OHS, web service, messaging):
    • ACL
    • Published language
    • Event subscription

Architecture

  • Ports and adapters, {infra -> UI -> application -> domain}, CQRS/ES, event-driven are some of the architectural options
    • Ports don’t have to model the domain (e.g. a UserInRole REST service even though there is no UserInRole domain object)
    • Adapters can map to/from domain representation e.g. UserInRoleAdapter
    • Ports and adapters allow you to focus on the domain, delay infrastructure concerns and do separate testing (see the sketch after this list)
  • Domain services shouldn’t control security (where it’s a cross-cutting concern as opposed to something being modelled) or transactions
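
A sketch of what a port and its adapter might look like – the names, including UserInRoleAdapter, are purely illustrative:

// Port: an interface expressed in terms of what the application needs
public interface IUserInRoleChecker
{
    bool IsInRole(Guid userId, string roleName);
}

// Adapter: sits at the edge and maps the port onto an infrastructure concern
// (here a hypothetical UserInRole REST service); it can be swapped out for testing
public class UserInRoleAdapter : IUserInRoleChecker
{
    private readonly HttpClient _client;

    public UserInRoleAdapter(HttpClient client)
    {
        _client = client;
    }

    public bool IsInRole(Guid userId, string roleName)
    {
        var roles = _client.GetStringAsync("users/" + userId + "/roles").Result;
        // Crude containment check purely for the purposes of the sketch
        return roles.Contains("\"" + roleName + "\"");
    }
}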

Entities / Value objects

  • Use entities to model things whose individuality you care about
    • Equality via id
    • Avoid just doing setters – if you have two setter calls in a row then you are probably missing a behaviour – ask yourself: if you missed one of the setter calls would you be in an inconsistent state?
    • Ensure consistency by using invariants and non-default constructors
  • Use value objects where possible to model things that describe, measure, qualify or quantify the ubiquitous language – they should model a conceptual whole (see the sketch after this list)
    • Generally immutable, set state via ctor – makes it easy to test and maintain
      • Modification methods should return a new instance
    • Equality by comparing properties
    • Discardable, replaceable and interchangeable
    • If you want you can use value objects to represent the id of a particular entity (e.g. wrap guid)
  • Generating identities
    • User provides
    • Application generates
    • Persistence store generates
    • Another bounded context provides
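
A minimal sketch of a value object along those lines (the EmailAddress type is a made-up example):

public sealed class EmailAddress
{
    public EmailAddress(string value)
    {
        // Enforce the invariant up front so an invalid instance can never exist
        if (string.IsNullOrWhiteSpace(value) || !value.Contains("@"))
            throw new ArgumentException("Not a valid email address", "value");
        Value = value.ToLowerInvariant();
    }

    public string Value { get; private set; }

    // "Modification" returns a new instance rather than mutating this one
    public EmailAddress WithDomain(string domain)
    {
        return new EmailAddress(Value.Split('@')[0] + "@" + domain);
    }

    // Equality by comparing properties rather than identity
    public override bool Equals(object obj)
    {
        var other = obj as EmailAddress;
        return other != null && other.Value == Value;
    }

    public override int GetHashCode()
    {
        return Value.GetHashCode();
    }
}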

Aggregates

  • Main purpose: determine transactional boundaries for consistency
  • Balance between large aggregate graphs, which give navigational convenience but can result in a higher likelihood of transactional errors that don’t actually affect consistency (as well as negative size and performance impacts), versus small aggregates, which might mean a lot of messing about in application code to wrangle multiple aggregates
  • Keep in mind eventual consistency – do multiple aggregates have to be kept consistent?
    • event processing / batch processing etc. – get the business to specify the max time to be consistent
  • Where possible reference between aggregates by id – less memory, faster to load, better garbage collection
  • Can use double dispatch from an application service to pass instances between aggregates to perform actions
  • Domain events are non-async
    • This means you can respond to a domain event within the same transaction
    • You can have a listener that then publishes the events to a bus (and from there they can be handled async)
      • Whose job is it to make the data consistent?
        • If it’s the end user then use transactional consistency; if it’s another user or the system then eventual consistency should be fine

Domain Services

  • A domain service is a part of the domain model that is a non-transactional, lightweight stateless operation that doesn’t have a natural home in an entity
    • If you find yourself using static methods in your domain entities then it’s a good indicator that you need a domain service
    • They can be passed into models and used via double dispatch or the model can be passed into them
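
A small, made-up sketch of the double-dispatch style described above:

// Domain service: stateless, non-transactional logic with no natural home in a single entity
public interface ITaxRateProvider
{
    decimal GetTaxRateFor(string countryCode);
}

public class Invoice
{
    public string CountryCode { get; private set; }
    public decimal Subtotal { get; private set; }

    // Double dispatch: the application service passes the domain service in and the
    // entity uses it, so the calculation still reads in the ubiquitous language
    public decimal CalculateTotal(ITaxRateProvider taxRates)
    {
        return Subtotal * (1 + taxRates.GetTaxRateFor(CountryCode));
    }
}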

Domain Events

  • Domain events inform subscribers of the facts about past happenings in a bounded context
  • When domain experts use triggering words then it’s a good sign you might need a domain event:
    • When
    • If that happens
    • Inform me if
    • Notify me if
    • An occurrence of
  • You can persist events along with state (or just the events if using event sourcing)
  • Events can be published outside of the bounded context or processed asynchronously by forwarding them to a message exchange
    • Bus
    • Decoupled publishing e.g. atom feed
  • Event-driven modelling exercise – Trello is a good example
    • Good example to do with a domain expert as a first step to fleshing out a domain
    • Leads to a good model if you plan on using event sourcing
    • 1. Model the events in time order (verb, past tense)
    • 2. Model the commands that would create the events (imperative exhortation)
    • 3. Model the aggregates that would participate in the command/event (noun)
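
A made-up sketch of the naming that falls out of those three steps:

// 1. Event: a verb in the past tense describing a fact that has happened
public class OrderPlaced
{
    public OrderPlaced(Guid orderId)
    {
        OrderId = orderId;
        OccurredOn = DateTime.UtcNow;
    }

    public Guid OrderId { get; private set; }
    public DateTime OccurredOn { get; private set; }
}

// 2. Command: the imperative that, when handled, results in the event
public class PlaceOrder
{
    public Guid OrderId { get; set; }
}

// 3. Aggregate: the noun that participates in the command/event
public class Order
{
    public Guid Id { get; private set; }

    public OrderPlaced Place()
    {
        // ... enforce invariants and change state here ...
        return new OrderPlaced(Id);
    }
}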

Modules (namespaces)

  • Name the modules as per the ubiquitous language (e.g. aggregates or concepts to group an aggregate)
  • Can have child namespaces e.g. concept -> aggregate
  • Don’t create modules based on type of component (e.g. include services with the entities)
  • Try not to couple between modules and where there are dependencies try and use acyclic graphs

Factories

  • Where possible construct objects inside of domain services and entities so the ubiquitous language is expressed
  • If you have all the parameters and the construction is simple it’s ok to new up an object in an app service

Acceptance Tests Structure [Automated Testing Series]

This is part of my ongoing Automated Testing blog series:

Acceptance Tests Structure

When writing high level acceptance tests (as opposed to unit tests) I will always try to use separate methods for the Given, When and Then since usually the Given and possibly When are more complex so there might need to be multiple methods for each. My favourite framework for writing tests with this structure by far is Bddfy (disclaimer: I am a core contributor of the TestStack organisation). When I say high-level acceptance tests I usually refer to automated UI / full system tests, but it’s possible they could also be complex subcutaneous tests (where subcutaneous tests fit in is part of my thinking I’m not quite sure on yet).

I’ve said before, and I still maintain, that consistency is the most important aspect when it comes to keeping a software application maintainable, so I think that within a particular set of tests, if you are writing some that are single-method Arrange, Act, Assert tests then you shouldn’t mix those tests with something like Bddfy since it’s wildly different. I feel that the techniques I described for structuring tests using test-per-class in the last post in the series are OK to mix with AAA tests though, as I discussed in that post.

The above two paragraphs have led me to the following thoughts:

  • I keep my high-level acceptance tests in a separate project from my unit/integration/etc. tests since they are:
    • Inconsistently specified as discussed above
    • As seen below the way I structure the tests into namespaces / folders and the way I name the test class is very different too
    • They have a different purpose and intent – one is to check your implementation is sound / help design the technical implementation and the other is to specify/check the business requirements of the system i.e. the concept of an executable specification
  • If possible use something like Bddfy and a Specification base class (see below) that allows you to specify the implementation of your scenario
    • Yes I know about SpecFlow, but I don’t think that the maintenance overhead is worth it unless you actually have your product owner writing the specifications, but by all accounts I’ve heard (and based on my experiences) it’s tricky for them to get it right and the developers end up writing the scenarios anyway – do yourself a favour and use a framework that is built for developers and get the devs to sit with the product owner to write the test – that’s what they are good at!
    • One of the many cool things about Bddfy is its reporting functionality – out of the box it generates a HTML report, but it’s also flexible enough to allow you to define your own output; I think this fits in nicely with the idea of an executable specification
  • I’ve used the base class shown below to make it really easy to define Bddfy tests (then you simply need to specify methods according to the Bddfy conventions and it will automatically pick them up)
    • If you want to reuse Given’s, When’s or Then’s then simply use inheritance (e.g. you might define an AuthenticationUserSpecification that extends Specification and provides a GivenUserIsAuthenticated method)
    • If you need to use a data-driven test then see below for an example of how to do it

Basic Specification base class

You can add any setup / teardown that is required to this for your purposes or wrap the run method code if needed (e.g. for automated UI tests I catch any exception from the this.BDDfy() call and take a screenshot).

public abstract class Specification
{
    [Test]
    public virtual void Run()
    {
        this.BDDfy();
    }
}

For a more advanced example of this base class including a generic version that identifies the SUT and provides auto-mocking (in this case the tests are unit tests rather than acceptance tests) check out the TestStack.Seleno tests.

Example of extracting common parts of scenarios

Here is an example of how you might pull out common logic into a customised specification base class. In this case it is demonstrating what it might look like to have a base class for when the scenario starts with the user not being logged in when using the TestStack.Seleno library.

public abstract class UnauthenticatedSpecification : Specification
{
    protected MyAppPage InitialPage;

    public void GivenNotLoggedIn()
    {
        InitialPage = Host.Instance
            .NavigateToInitialPage<MyAppPage>()
            .Logout();
    }
}

Example test class

Here is an example of what an actual test class might look like. In this case it is a test that ensures a user can register for an account on the site and extends the example UnauthenticatedSpecification above. There is some code missing here about how the page objects are implemented, but it’s fairly readable so you should be able to follow what’s happening:

public class ValidRegistration : UnauthenticatedSpecification
{
    private RegistrationViewModel _model;
    private MyAppPage _page;
    private Member _member;

    public void AndGivenValidSubmissionData()
    {
        _model = GetValidViewModel();
    }

    public void AndGivenUserHasntAlreadySignedUp()
    {
        // In this instance the Specification base class has helpfully exposed an NHibernate Session object,
        //  which is connected to the same database as the application being tested
        Session.Delete(Session.Query<Member>().SingleOrDefault(m => m.Email == _model.Email));
    }

    public void WhenVisitingTheJoinPageFillingInTheFormAndSubmitting()
    {
        _page = InitialPage
            .Register()
            .Complete(_model);
    }

    public void ThenTheMemberWasCreated()
    {
        _member = Session.Query<Member>()
            .Single(m => m.Email == _model.Email);
    }

    public void AndTheMembersNameIsCorrect()
    {
        _member.FirstName.ShouldBe(_model.FirstName);
        _member.LastName.ShouldBe(_model.LastName);
    }

    public void AndTheMemberCanLogIn()
    {
        _page.Login()
            .Successfully(_model.Email, _model.Password);
    }

    private RegistrationViewModel GetValidViewModel()
    {
        // This is using the NBuilder library, which automatically populates public properties with values
        // I'm overriding the email and password to make them valid for the form submission
        return Builder<RegistrationViewModel>.CreateNew()
            .With(x => x.Email = "email@domain.tld")
            .With(x => x.Password = "P@ssword!")
            .Build();
    }
}

Demonstration of data-driven test

I mentioned above that it was possible to have a data-driven test with the Specification base class. I can do this by using a hack that Jake Ginnivan showed me when we were creating acceptance tests for GitHubFlowVersion for XUnit and a technique I have found works for NUnit. I’ve put a full code example in a gist, but see below for an abridged sample.

I should note I’ve only tried these techniques with the ReSharper test runner so it’s possible they function differently with other runners. Another option would be to not inherit the base class for the situations where you need data-driven tests, but I typically find that I would then lose a bunch of useful setup/functionality when doing that; YMMV.

NUnit

You might have noticed I set the Run method in the Specification base class to virtual. If you do this then you can do the following in your test class and it will ignore the test method in the base class and use the data-driven one in this class. It does show the base test up as being ignored though; alternatively, you could assign a category to the base test and not run that category:

    [Test]
    [TestCase(1)]
    [TestCase(2)]
    [TestCase(5)]
    public void RunSpecification(int someValue)
    {
        _privatePropertyTheTestMethodsUse = someValue;
        base.Run();
    }

    [Ignore]
    public override void Run() { }

XUnit

You don’t have to have the base Run method virtual for this one (but it doesn’t hurt if it is either). Also, the test from the base class comes up as Inconclusive in the ReSharper test runner (just ignore it):

    [Theory]
    [InlineData(1)]
    [InlineData(2)]
    [InlineData(5)]
    public void RunSpecification(int someValue)
    {
        _privatePropertyTheTestMethodsUse = someValue;
        base.Run();
    }

    public new void Run() {}

Namespaces / folder structure

When structuring the acceptance test project I like to have a folder/namespace called Specifications and underneath that a folder for each feature being tested. Then there can be a file within each feature for each scenario being tested (I generally name the scenario with a short name to easily identify it).

So, for instance the valid registration test above might be found at Specifications.Registration.ValidRegistration.

General Test Structure [Automated Testing Series]

This is part of my ongoing Automated Testing blog series:

General Test Structure

Recently I found myself experimenting with demon-coder, fellow Readifarian, and all-round-nice-guy Ben Scott (o/ – that’s for Ben; he will understand) with trying to extract shared Given, When and/or Then sections of tests by using a pattern whereby the Given and possibly When of a test happens in the test setup (or constructor if using XUnit). This allows each assertion (or Then) that you want to make to be in a separate test method and avoids both the problem of multiple somewhat/completely unrelated asserts in a single test and massive duplication of test setup.

I realise there are other ways of dealing with the duplication such as abstracting common logic into private methods (this is the technique I have used in the past), but I’ve found the above solution to be much nicer / cleaner and clearer in intent. It’s also worth noting that there are frameworks that directly support testing with the Given, When, Then structure and I cover that further in a later post in this series (as well as why I don’t just use them all the time).

I’ve created a set of code samples to illustrate the different techniques at https://gist.github.com/robdmoore/8634204:

  • 01_Implementation.cs contains an example mapper class to be tested – it is mapping from some sort of measurement of a value that has been taken (say, from an API or a hardware device) with a confidence and some kind of identifier to an object that represents the identity (broken down into component parts), the measurement value and an adjusted value (based on the confidence). It’s a fairly contrived example, but hopefully it’s understandable enough that it illustrates my point.
  • 02_MultipleAsserts.cs shows a single Arrange Act Assert style test that contains multiple asserts for each of the mapped properties and has description strings for each assert so that you can tell which assert was the one that failed.
  • 03_RelatedMultipleAsserts.cs partially addresses the main problem in the first test (multiple somewhat unrelated asserts) by grouping the related asserts together – this introduces repetition in the tests though.
  • 04_AbstractedCommonCode reduces the duplication by abstracting out the common logic into private methods and provides a separate test for each assert – because of this the assertions don’t need a description (since there is one assert per test), but the test class is a lot more verbose and there is duplication both across test names and test content.
  • 05_GivenAndWhenInSetup.cs demonstrates the technique I describe above (see the sketch after this list); in this case the constructor of the class (it also could have been a [SetUp] method) contains both the Arrange/Given and Act/When part of the test and each test method is both very slim (containing just the assertion) and is named with just the Then.
    • In NUnit you can use [SetUp] and in XUnit the constructor, but it’s worth noting that this means the code will run for every test (which is usually fine). However, if the code you are testing takes a reasonable amount of time to run then you can instead use a constructor or [TestFixtureSetUp] in NUnit, or a static constructor or IUseFixture<TFixture> in XUnit (a static constructor if you also need to use the instance constructor – say, to perform the When or to allow for slightly different behaviour across inherited classes when using inheritance – since the fixtures get injected after the constructor is called).
    • It’s worth noting this class is still a lot bigger than the single AAA test, but for a situation like this I feel it’s a lot more understandable and better encapsulates the behaviour being tested; this technique also comes into its own when there is duplication of Given and/or When logic across multiple AAA tests (you can use inheritance to reuse or extend the Given/When code)
  • 06_ConcernForBaseClass.cs shows a variation of the above strategy that provides a common base class that can be used in tests to help make the Given and When code more explicit as well as identifying what the subject under test is.
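
As a rough sketch of the 05 technique (a simplified stand-in for the code in the gist, using XUnit; the Measurement types are illustrative):

public class GivenHighConfidenceMeasurement_WhenMapping
{
    private readonly MappedMeasurement _result;

    // The Given and When happen once per test here (a [SetUp] method in NUnit works similarly)
    public GivenHighConfidenceMeasurement_WhenMapping()
    {
        var measurement = new Measurement("sensor-1", value: 10m, confidence: 1m);

        _result = new MeasurementMapper().Map(measurement);
    }

    // Each Then is a slim test method containing a single assertion
    [Fact]
    public void ThenTheValueIsCopiedAcross()
    {
        Assert.Equal(10m, _result.Value);
    }

    [Fact]
    public void ThenTheAdjustedValueEqualsTheValueForFullConfidence()
    {
        Assert.Equal(10m, _result.AdjustedValue);
    }
}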

Another technique to reduce the setup of tests is to use AutoFixture’s auto data attributes (XUnit and NUnit are supported) to inject any variables that you need into the test. I’m not going to cover that in this post at this stage because a) I haven’t used it in anger and b) I suspect it’s too complex for teams that aren’t all experienced (i.e. it’s a bit too “magic”). It is very cool though so I highly encourage you to at least evaluate it as an option.

There are a number of things to note about what I’ve been doing with this technique so far:

  • We still managed to keep the convention of having a test described by Given (where appropriate; sometimes you don’t need the Given), When and Then by ensuring that the combination of the class name and the test name cater for them
    • e.g. a class called GivenX_WhenY or just WhenY with a bunch of test methods called ThenZ or class called GivenX with a bunch of test methods called WhenY_ThenZ
  • You can reuse and extend common parts of your test logic by using inheritance e.g. define a base class with common Given and/or When and extend it for different variations of the Given and either reuse Then’s from the base class and/or define new ones for each inheriting class
    • I’ve seen some people use this technique with 5 levels deep of nested classes with one line of code being reused in some of the hierarchy; while I’m a massive fan of DRY I think that there is a point where you can go overboard and in the process lose understandability – 5 level deep nested test classes is in my opinion completely overboard
  • An interesting side-effect of this approach is that rather than having one test class per thing being tested you end up with multiple classes
    • I have always followed a convention of putting a test file in the same namespace as the class being tested (except in the Test project and post-fixing with Tests) e.g. MyApp.Commands.CreateInvoiceCommand would be tested by MyApp.Tests.Commands.CreateInvoiceCommandTests, but that doesn’t work when you have multiple test files so what I’ve taken to doing is making a folder at MyApp.Tests\Commands\CreateInvoiceCommandTests, which contains all the test classes
    • This allows me to mix and match folders (where needed) and single test classes (sometimes a single class is enough so I don’t bother creating a folder) while keeping it clear and consistent how and where to find the tests
  • Not all tests need to be written in this way – sometimes a simple, single AAA test (or a bunch of them) is enough
    • As you can see from the above code samples the AAA tests tend to be terser and quicker to write so when there isn’t much duplicated logic across tests and you only need one assertion or the assertions are one logical assertion (i.e. belong in a single test) then there is no reason to stray from single AAA tests in my opinion
    • I don’t feel that a combination of AAA tests and the common setup tests cause a consistency issue because it’s fairly easy to trace what’s happening and the common setup logic falls within the normal bounds of what you might have for AAA tests anyway
    • I find it’s easier to create data-driven tests using single method tests because you have so much more flexibility to do that using NUnit and XUnit – if you find a good way to do it with test-per-class rather than test-per-method let me know!
      • NUnit’s TestFixture attribute goes some of the way, but it’s not as powerful as something like TestCase or TestCaseSource
  • There is a base class (ConcernFor) that we were using for some of the tests that I included in the last code sample above

Test Naming [Automated Testing Series]

For the last 6 months I’ve been thinking and reading a lot about how best to write automated tests for applications – including data generation, structure, naming, etc. This blog series is a foray into my current thinking (which will probably change over time; for instance I’m looking back at tests I wrote 6 months ago that I thought were awesome and now thinking that I hate them :P). I’m inviting you to join me on this journey by publishing this series and I encourage thoughtful comments to provoke further thinking and discussion.

Test Naming

I have always been a fan of the Arrange Act Assert structure within a test and started by naming my test classes like <ThingBeingTested>Should with each test method named like A_statement_indicating_what_the_thing_being_tested_should_do. You can see examples of this type of approach in some of my older repositories such as TestStack.FluentMVCTesting.

More recently I’ve started naming my tests with a Given, When, Then description (be it an acceptance test or a unit test). I like the Given, When, Then method naming because:

  • Generally the Given will map directly to the Arrange, the When will map directly to the Act and the Then will map directly to the Assert – this provides a way of quickly cross-checking that a test does what the name suggests it should by comparing the name to the implementation
    • I find this really useful when reviewing pull requests to quickly understand that the author of a test didn’t make a mistake and structured their test correctly
    • e.g. it’s probably wrong if there is a Given in the name, but no clear Arrange section
  • It requires that you put more thought into the test name, which forces you to start by thinking about what scenario you are trying to test rather than “phoning in” an arbitrary test name and diving into the implementation and focussing on how you are testing it
    • My gut feel along with anecdotal evidence from the last few projects I’ve worked on – with developers of varying skill levels – suggests this leads to better written, more understandable tests
    • Side note: I really like the idea discussed by Grzegorz Gałęzowski in his TDD e-book about the order in which you should write a TDD test (from the section “Start From The End”) – he suggests you start with the Then section and then work backwards from there
      • I can’t honestly say I usually do this though, but rather just that it sounds like a good idea; generally I know exactly what I want to write when writing a test (yes, it’s possible I’m just naive and/or arrogant ;))

To clarify, I still use Arrange, Act, Assert in my tests. I could write a whole blog post on how I structure the AAA sections in my tests, but luckily I don’t have to because Mark Seemann has already written a really good post that mimics my thoughts exactly.
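
A quick made-up example of how the Given/When/Then name lines up with the AAA sections (the Cart type is illustrative):

[Test]
public void GivenEmptyCart_WhenCheckingOut_ThenCheckoutIsRejected()
{
    // Arrange – the Given
    var cart = new Cart();

    // Act – the When
    var result = cart.Checkout();

    // Assert – the Then
    Assert.That(result.Succeeded, Is.False);
}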

Using Pull Requests for commercial/private/proprietary development

Developers who are familiar with open source will likely be aware of pull requests, which were created and popularised by GitHub as a way of providing some automation, visibility and social interaction around merging software changes. It replaces the old school notion of emailing patch files to each other as well as providing more visibility and interaction over pushing to the same branch (aka mainline development).

This is a post I’ve been meaning to write for a while. As a Senior Consultant for Readify I find myself spending a lot of time mentoring teams and as a part of that I’m constantly paying attention to trends I notice and picking up / experimenting with techniques and processes that I can introduce to teams to get positive outcomes (improved efficiency, quality, maintainability, collaboration etc.). In my experiences so far I’ve found that the single most effective change I have introduced to a software team that has improved the quality of software output is to introduce pull requests. It also has a positive effect on other things like improved collaboration and collective code ownership.

I can’t claim to have invented this idea – I got the idea from listening to the experiences that my fellow consultants Graeme Foster and Jake Ginnivan had and there are certainly examples of companies using it, not the least of which is GitHub itself.

What are pull requests?

Essentially, a pull request allows a developer to make some commits, push those commits to a branch that they have access to and create a request against a repository (that they don’t necessarily have commit access to, but also could) to have their commit(s) merged in. Depending on the software that you are using, a pull request will generally be represented as a diff, a list of the commits and a dialogue of comments both against the pull request itself and against individual lines of code in the diff.

Why are pull requests useful for open source?

Pull Requests are awesome for open source because they allow random third parties to submit code changes easily and effectively to projects that they don’t usually have commit access to and for project maintainers to have a conversation with the contributor to get the changes in a state where they can be merged.

This lowers the barrier of entry to both maintainers and contributors, thus making it easier for everyone involved :).

Why are pull requests useful for commercial development?

On first thought you might think that commercial development is significantly different from open source since the people submitting code changes are on the same team and there will be existing code review processes in place to deal with code quality.

Based on my experiences and other people I know there are definitely a range of advantages to using pull requests for commercial development (in no particular order):

  • If you have enough developers and a big enough project/product then you might actually have a similar setup to an open source project, in that the product might have a “core team” that looks after the product and maintains standards and consistency, and other developers in the company can then submit pull requests to be reviewed by the core team
  • Pull requests give a level playing field to the whole team – it encourages more junior or shy developers to “safely” review and comment on commits by more senior developers since it takes away the intimidation of doing it in person
  • It provides a platform for new developers to get up-skilled quicker by providing:
    • Easy to digest, focussed examples of how to implement certain features that they can browse easily without having to ask how to find them
    • Confidence to raise pull requests without the stress of needing to know if the code is OK
    • Line-by-line comments to help them identify what they need to change to get their code “up to scratch”
    • This has a positive effect on the new developer themselves as well as the team since the learning is founded in practical application (which in my view is always the most efficient way to learn) and the person potentially needs less time from the team to get up to speed (thus having a smaller “burden”)
  • In much the same way as it helps with up-skilling it helps with learning for more junior developers
  • It improves the safety of making changes to the codebase by ensuring that changes are looked at by at least one person before being merged into mainline
  • It improves the consistency/maintainability and quality of the codebase by ensuring all changes are reviewed by at least one person
  • It makes it more likely that minor issues – ones that would normally pass a code review because they weren’t serious enough to warrant rejecting the review or going back and fixing afterwards – get fixed before being merged in
  • There is tooling out there (e.g. TeamCity’s pull request integration) that can integrate with pull requests to provide assurances that the pull request is OK to merge (and potentially automatically merge it for you)
  • The ability to have threaded comments on individual lines of code makes it really easy to have a contextual conversation about parts of the code and arrive at better designs because of it
  • Despite whatever code review processes you might have in place, code is more likely to actually get reviewed via pull requests – in my experience code reviews don’t always happen regardless of what development processes exist, whereas a pull request is always reviewed before merging
  • Pull requests are pull-based rather than push-based i.e. the reviewer can review the code when they are not in the middle of something so there is less context-switching overhead (note: important/blocking pull requests might be a reason to interrupt someone still, but in my experience those pull requests are not the majority)
  • If you are working with developers across multiple timezones then the above point means that code reviews are easier to perform since they are pull-based
  • You can use pull requests to raise work-in-progress for early feedback (reducing feedback cycles :)) or for putting up the result of spikes for team comment
  • If approached in the right way then it helps teams with collective code ownership because there is a lot of back-and-forth collaboration over the code and everyone in the team is free to review any pull request they are interested in

As you can see – it’s a pretty convincing list of points and this summarises pretty well why pull requests are the most effective change I’ve made to a development team that has affected code quality.

So, nothing’s perfect; what are the downsides/gotchas?

There are only a few I’ve noticed so far:

  • If there is a lot of churn on the codebase from a lot of disparate teams/developers with potentially conflicting, high-priority tasks and there is no “core team” looking after the project then there is a tendency for developers to:
    • Work with other developers in their team to quickly merge the pull request without much inspection
    • Worse still – developers might just merge their own pull requests or bypass them completely and push directly to the branch
    • Acknowledge the review comments, but claim that their work is too high priority to not merge straight away and never get around to making the changes
  • If there is someone in the team that is particularly picky then it’s possible that team members might try and avoid that person reviewing their pull requests by asking someone else to review it
  • If there are a lot of changes to be made based on review comments it can sometimes get tricky to keep track of which ones have and haven’t been applied and/or the pull request could drag on for a long time while the developer completes the changes – there is a balance here between merging in the potentially important work in the pull request and spending time fixing potentially minor problems
  • If pull requests are too big they are harder to review and there will be less review comments as a result and things might get missed (this would be a problem without pull requests, but pull requests simply highlight the problem)
  • The diff UIs on GitHub and Bitbucket frankly suck compared to local diffing tools (Stash’s pull request diff UI is actually really good though); this makes it a bit more difficult to review bigger PRs (but then it’s still easy to pull and review the changes locally so you can get around this limitation)
  • Comments on controversial code changes (or sometimes not so controversial changes) can sometimes get a bit out of hand with more comments than lines of code changed in the PR – at this point it’s clear that the back-and-forth commenting has taken the place of what should be an in-person (or if you are remote video) conversation next to a whiteboard
  • You end up with a lot of local and remote branches that clutter things up, but there are techniques to deal with it

These behaviours are generally unhealthy and are likely to be a problem if you don’t have pull requests anyway (and in fact, pull requests simply serve to visualise the problems!) and should be addressed as part of the general continuous improvement that your team should be undergoing (e.g. raise the problems in retrospectives or to management if serious enough). Hopefully, you are in a situation with a self-organising team that will generally do the right things and get on board with the change :).

Pull requests and continuous delivery

I’m extremely passionate about Continuous Delivery and as such I have been in favour of a mainline development approach over feature branches for quite a while. Thus, on first hearing about using pull requests I was initially wary because I mistakenly thought it was a manifestation of feature branches. What I’ve since learnt is it definitely doesn’t have to be, and in my opinion shouldn’t be, about creating feature branches.

Instead, the same mindset that you bring to mainline development should be taken to pull requests and the same techniques to enable merging of partially complete features in a way that keeps the code both constantly production-ready and continuously integrated should be used. The way I see it developers need to have a mindset of looking at a pull request as being “a set of changes I want to merge to master” rather than “a complete feature ready to be integrated”. This helps keep pull requests small, easy to review and ensures all developers are regularly integrating all changes against the mainline.

Introducing pull requests to the deployment pipeline adds another step that increases the confidence in your production deployments (the whole point of a deployment pipeline) as well as helping make sure that code in the mainline is always production-ready, since the reviewer can (and should) review the changeset for its likely effect on production as well as the usual things you would review.

Side note: I love this git branching model description.

Tips

  • Keep PRs small (I think I’ve explained why above well enough)
  • Keep PRs focussed – a PR should contain one logical set of changes – it makes it easier to review and means if there is a problem with one set of changes that it doesn’t block the other set of changes
  • Delete a remote branch after it has been merged (if using Stash/Bitbucket then always tick the “close this branch after PR is merged” box)
  • Keep commits clean – having a PR isn’t an excuse to be lazy about your commits; use good commit messages, keep related logical changes together in a single commit, squash them together where relevant, use rebasing rather than merge commits and think about the history after the PR is merged – don’t make it harder for yourself or your team in the future
  • Don’t let PRs get stagnant – if you are finished with a task take a look at the PRs and merge a couple before moving on; if everyone does this then there will be a constant flow of merges/integration – if a PR is open for more than a day there is a problem
  • Where possible have one PR/branch per-person – you should be able to clean up your commits and force push to your pull request branch and it encourages the PR to be smaller
    • Be incredibly careful when doing a force push that you are not pushing to master
    • If you can turn off the ability to force push to master (e.g. Bitbucket allows this) then do it – better safe than sorry
  • For those situations where you do need to work closely with someone else and feel like you need a PR across multiple people there are two techniques you can use:
    • Identify and pair together on at least the blocking “integration” work, put up a PR, get it merged and then work in individual PRs after that
    • If necessary then put up a pull request and contribute to the branch of that pull request by individual pull requests (i.e. create a featurex branch with a pull request to master and then create separate featurex-subfeaturea, featurex-subfeatureb branches with pull requests to featurex) – try and avoid this because it increases the size and length of time of the integrated pull request – also, consider rebasing the integrated pull request just before it’s pulled in
    • If you plan on submitting a pull request against someone else’s pull request branch then make sure you tell them so they don’t rebase the commits on you (can you see why I try and avoid this? :P)
  • Make sure that PR merges are done with actual merge commits (the merge button in all of the PR tools I’ve used do this for you) that reference the PR number – it makes it easier to trace them later and provides some extra information about the commits that you can look up (this is the only thing you should use merge commits for in my opinion)
  • If your PR is not urgent then leave it there for someone to merge at their leisure and move on, if it’s urgent/blocking then ask the team if someone can review/merge it straight away (from what I’ve seen this is not the common case – e.g. one or two a day out of 10-15 PRs)
  • Tag your PR title with [WIP] to mark a work-in-progress PR that you are putting up for feedback to make sure you are on the right track and to reduce feedback cycles – [WIP] indicates that it’s open for people to review, but NOT to merge
  • In those rare circumstances when a PR is ready to be merged, but isn’t production-ready (maybe it’s a code clean-up waiting for a successful production release to occur first to render the code being deleted redundant for instance) then tag the PR title with [DO NOT MERGE]
  • Unless you know what you are doing with Git never work off master locally; when starting new work switch to master, pull from origin/master, and create and switch to a new branch to describe your impending PR – it makes things a lot easier
    • I almost never do this – I generally just work on master unless editing a previously submitted PR because I find it quicker, but it does require me to constantly be very careful about what branches I’m pushing and pulling from!
  • Call out the good things in the code when reviewing – reviews shouldn’t devolve into just pointing out things you think are wrong otherwise they develop negative connotations – if the PR creator did a good job on a bit of complex code then give them a pat on the back!
  • Similarly, avoid emotive phrases like “ugly”, “gross” etc. when describing code; when I’m reviewing code a technique I use to try and not sound condescending is to ask a question, e.g. “Do you think this should be named XX to make it clearer?” – make the pull request into a two-way conversation; work out the best solution collaboratively as a team
  • If you aren’t sure about something don’t be afraid to ask a question; quite often this will highlight something that the author missed or at the very least help you to learn something
  • Have fun! Use funny gifs, use funny branch names – just because we are developing serious software doesn’t mean we can’t enjoy our jobs!

Getting started

I’ve found that the following helps immensely when first getting a team started (but after a while the team is able to sustain itself easily):

  • Having at least one person experienced with Git is really helpful as inevitably people will get themselves confused when first using branches and in particular with rebasing
  • Have at least one person that knows what they are doing ensure that in the first week all PRs are reviewed and merged quickly – this will alleviate the team from needing to worry about keeping on top of merging PRs and instead they can concentrate on the semantics of creating pull requests, responding to review comments and updating PRs. Once they get the hang of it then stop reviewing/merging all but the most important PRs and the team will start to notice there is an increase in open PRs and will hopefully self-organise to get them merged and get into the habit of doing that
    • It’s possible this technique might not always work, but it has when I’ve tried it and it assists the team to slowly get the hang of using pull requests rather than dropping a huge change on them all at once
    • If the team doesn’t self-organise then bring up the fact PRs aren’t getting merged in retrospectives and suggest a possible solution of reviewing open PRs directly after standup and team members volunteer to review/merge each of the open ones
  • Developers will naturally approach pull requests like feature branches – even if told up front to treat them the same as what they would push to mainline with mainline development – because of the natural human desire to get things “right” / finished
    • This will lead to enormous PRs that are constantly conflicted and a real pain in the ass for the developer
    • Let this happen and when the PR is finally merged sit down with them and ask them how they think it could be avoided in the future
    • Hopefully they will come to the realisation the PR was too big themselves and because they have made that realisation based on experience they are more likely to keep PRs small

Tools

There are various tools that can help you use pull requests. The ones I have experience with or knowledge of are:

If you are interested in semver and you want to use pull requests (or you don’t and you are using mainline development) then I encourage you to check out the GitHubFlowVersion project by my friend Jake Ginnivan.

Presentation: Moving from Technical Agility to Strategic Agility

I recently gave a presentation with my colleague Jess Panni to the ACS WA Conference about Agile and where we see it heading in the next 5-10 years. When offered the speaking slot the requirements were that it involved data analytics in some way and that it wasn’t the same old Agile stuff that everyone has been talking about for years, but rather something a bit different. Both of those requirements suited me because it fit in perfectly with thoughts I’d been having recently about Agile and where it is heading.

Jess and I had a lot of fun preparing the talk – it’s not often we get time to sit down and chat about process, research what the industry leaders are saying and brainstorm our own thoughts and experiences in light of that research. I’m very proud of the content that we’ve managed to assemble and the way we’ve structured it.

We paid particular care to make the slide deck useful for people – there are comprehensive notes on each slide and there are a bunch of relevant references at the end for further reading.

I’ve put the slides up on GitHub if you are interested.

Test Data Generation the right way: Object Mother + Test Data Builders + NSubstitute + NBuilder

Generating test data for automated testing is an area that I have noticed can get very, very tedious when your application gets more and more complex. More importantly, it’s an area that I have noticed results in hard-to-maintain test logic. If you are a proponent of treating test code with the same importance of production code like I am, then you will want to refactor your test logic to make it more simple, maintainable and DRY.

In my last two projects I have been working with my team to come up with a good solution for this problem and I’m pretty stoked with what we’ve got so far. While I certainly had my hand in the end result, I can’t claim all of the credit – far from it! Big props to my team mates Matt Kocaj and Poya Manouchehri for significant contributions to this.

Newing up objects everywhere

If you organically grow your automated test suite without thinking about how you are generating your test data then it’s likely you will end up with a lot of tests that have arrange sections full of object instantiation, and these sections are likely to get more and more complex as your object model gets complex (what if you need to have a customer that has 3 orders, each with 3 products?).

If you are diligent with refactoring your tests then you will likely start pulling out common instantiations to private methods (or similar), but even then you are likely to notice the following patterns start to emerge:

  • If you need to change the constructor of one of your classes there will be many, many instantiations using that constructor across your test project and so that refactor becomes a pain (and you are likely to do hacks like add default values that you wouldn’t otherwise when adding new parameters to make it easier).
  • There is likely to be a lot of repetition across different test classes in the objects that are created.
  • Building lists is really verbose and will likely feature a lot of repetition / hard-to-read code.
    • You can bring in libraries like NBuilder to make this easier and more readable.
    • You can’t use the best parts of NBuilder if your objects don’t have public property setters (which they often won’t if you are enforcing domain invariants).
  • Your approach to generating data will be largely inconsistent across test classes – some will have private methods, some will do it inline, some might delegate to factories or helper classes etc. – this makes the code hard to understand and maintain.
  • It’s not immediately clear which of the parameters you are passing into the constructor are there to meet the requirements of the call and which are there because the test dictates it – this means the intent of your test is somewhat obscured

Object Mother

A good approach to start to address the problem is to implement the Object Mother pattern. This allows you to quickly and tersely create pre-canned objects and give them a descriptive name. Now instead of having complex chains of object instantiations you have one line of code (e.g. ObjectMother.CustomerWith3FilledOrders), which is a lot more descriptive and your tests become simpler as a result. This also helps ensure that your approach for data generation is consistent, and thus more maintainable.
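
A rough sketch of what that might look like (the domain types are illustrative):

public static class ObjectMother
{
    public static Customer CustomerWith3FilledOrders
    {
        get
        {
            var customer = new Customer("Test", "Customer");
            for (var i = 1; i <= 3; i++)
            {
                var order = new Order(customer);
                order.AddLine(new Product("Product " + i), quantity: 3);
                customer.Place(order);
            }
            return customer;
        }
    }
}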

If you just use Object Mother though, you will likely observe the following problems:

  • While your constructor calls are likely to be in one file now, there are still potentially a lot of them, so it’s still tricky to make constructor changes.
  • As soon as you have a small tweak needed for the data returned there is a tendency to simply create a new Object Mother property.
  • Because of this, the Object Mother class can quickly get unwieldy and a nightmare to maintain and understand – it becomes a god object of sorts.

Test Data Builder

An approach that seems to have stemmed from the Object Mother pattern is to apply the builder pattern to test data, resulting in the Test Data Builder pattern. This involves creating a builder class responsible for creating a particular type of object via a fluent interface; the builder keeps track of the desired state of the object being built until it is asked to actually build the object.
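
In use, a test then reads something like this (using the CustomerBuilder that is shown in full at the end of this post):

var customer = new CustomerBuilder()
    .WithFirstName("Rob")
    .WhoJoinedIn(2013)
    .Build();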

This provides a number of advantages:

  • Your constructor calls will appear in your test project once only, making them easy to change.
  • You have a readable, discoverable, fluent interface to generate your objects making your tests easy to read/write/maintain.
  • You are consistent in how you generate your test data.
  • It works with objects that you don’t have public property setters for since there is an explicit build action, which can invoke the constructor with the parameters you have set.
  • It’s flexible enough to give you the ability to express and perform actions on your object after it’s constructed – e.g. specify that an order is paid (which might be a Pay() method rather than a constructor parameter; you wouldn’t expect to be able to create an order in the paid state!).
  • Because of this your builder objects provide a blueprint / documentation for how to interact with your domain objects.
  • The test data builder can contain sensible defaults for your object instantiation, so your test only needs to specify the values that need to be arranged, making the intent of your test very clear.
  • We have added a method to our builders (.AsProxy()) that allows the object returned to be an NSubstitute substitute (a proxy / mock object if you aren’t familiar with that library) rather than a real object (with the public properties set to automatically return the values specified in the builder). See below for a code snippet.
    • This means that when we need substitutes to check whether certain methods are called or to ensure that certain method calls return predetermined values we can generate those substitutes consistently with how we generate the real objects.
    • This is very powerful and improves understandability and maintainability of the tests.
    • In general we try and avoid using substitutes because they make it possible to get objects that violate domain invariants; however, on occasion it’s very useful to be able to properly unit test a domain action when it interacts with other domain objects.

Here is a (somewhat contrived) example of the .AsProxy() method described above.

[Test]
public void Given10YearMembership_WhenCalculatingDiscount_ThenApply15PercentLongMembershipDiscount()
{
    // Arrange: build the member as an NSubstitute proxy so we can stub the years of membership
    var member = new MemberBuilder().AsProxy().Build();
    member.GetYearsOfMembership(Arg.Any<DateTime>()).Returns(10);

    // Act
    var discount = _discountCalculator.CalculateDiscountFor(member, new DateTimeProvider());

    // Assert
    Assert.That(discount, Is.EqualTo(0.15));
}

Combining NBuilder and the Test Data Builders

Another cool thing we have done is to combine our test data builders with NBuilder to allow for terse, expressive generation of lists of objects while still supporting objects that don’t have public property setters. We do this by generating a list of builders with NBuilder and then iterating that list to build each builder and get the actual object. Here is an example:

var members = Builder<MemberBuilder>.CreateListOfSize(4)
    .TheFirst(1).With(b => b.WithFirstName("Rob"))
    .TheNext(2).With(b => b.WithFirstName("Poya"))
    .TheNext(1).With(b => b.WithFirstName("Matt"))
    .Build()
    .Select(b => b.Build());

We have made the above code even terser by adding some extension methods that allow for this:

var members = MemberBuilder.CreateList(4)
    .TheFirst(1).With(b => b.WithFirstName("Rob"))
    .TheNext(2).With(b => b.WithFirstName("Poya"))
    .TheNext(1).With(b => b.WithFirstName("Matt"))
    .BuildList();
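
For illustration, here is a minimal sketch of the kind of extension method involved. Note the assumption in this sketch: BuildList is applied to the IList<MemberBuilder> that NBuilder’s own .Build() returns, whereas the version we actually use also folds the NBuilder .Build() and Builder<MemberBuilder>.CreateListOfSize calls in behind CreateList/BuildList.

public static class BuilderListExtensions
{
    // Builds each test data builder in a sequence and returns the resulting domain objects
    public static IList<Member> BuildList(this IEnumerable<MemberBuilder> builders)
    {
        return builders.Select(b => b.Build()).ToList();
    }
}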

Adding back Object Mother

One of the really nice things about the Object Mother pattern is that you reduce a lot of repetition in your tests by re-using pre-canned objects. It also means the tests are a lot terser, since you just specify the name of the pre-canned object that you want and, assuming that name describes the object well, your test is very readable.

What we have done is combined the maintainability and flexibility that is afforded by the Test Data Builder pattern with the terseness afforded by the Object Mother by using Object Mothers that return Test Data Builders. This overcomes all of the disadvantages noted above with the Object Mother pattern.

In order to make it easy to find objects we also use a static partial class for the Object Mother – this is one of the few really nice uses I’ve seen for partial classes. Basically, we have an ObjectMother folder / namespace in our test project that contains a file for every entity we are generating, each of which contributes a static nested class to the root ObjectMother class. This ensures that all test data is accessed by first typing “ObjectMother.” – this makes it really consistent and easily accessible, while avoiding an unmaintainable mess of a god object.

For instance we might have (simple example):

// Helpers\ObjectMother\Customers.cs
using MyProject.Tests.Helpers.Builders;

namespace MyProject.Tests.Helpers.ObjectMother
{
    public static partial class ObjectMother
    {
        public static class Customers
        {
            public static CustomerBuilder Robert
            {
                get { return new CustomerBuilder().WithFirstName("Robert"); }
            }
            public static CustomerBuilder Matt
            {
                get { return new CustomerBuilder().WithFirstName("Matt"); }
            }
        }
    }
}

If our test requires any small tweaks to the predefined object then we don’t necessarily need to add another property to the Object Mother – we can use the methods on the builder, e.g. ObjectMother.Customers.Robert.WithLastName("Moore").Build();

We generally stick to using properties rather than methods in the Object Mothers, since the builders mean it’s not necessary to pass any parameters to get a pre-canned object out. Occasionally, though, we do use methods when there is something meaningful to configure that makes sense in the Object Mother, e.g. ObjectMother.Members.MemberWithXProducts(int numProducts).
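
For example, such a method might look like this (WithProducts and ProductBuilder are hypothetical, but follow the same builder pattern as above):

// Helpers\ObjectMother\Members.cs
public static partial class ObjectMother
{
    public static class Members
    {
        // A method rather than a property, because the number of products is
        // something the test meaningfully needs to configure
        public static MemberBuilder MemberWithXProducts(int numProducts)
        {
            return new MemberBuilder()
                .WithProducts(ProductBuilder.CreateList(numProducts).BuildList());
        }
    }
}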

Making the Test Data Builders terse

One argument against this approach, particularly when you first start your test project (or indeed if you decide to refactor it to use this technique after the fact), is that you are writing a lot of code to get it all up and running. We decided to mitigate that using a number of techniques:

  • We set any reasonable defaults in our builder constructor, so new CustomerBuilder().Build() will give a reasonable object (unless, for that particular type of object, there are properties that should always be specified explicitly, in which case we don’t add a default for them).
  • We have created a base class that allows us to set/get property values in a dictionary, using a lambda expression to identify the property being set/retrieved – this eliminates the need for most of the private variables in the builder (sometimes there is still a need to keep state in the builder when you are storing something not expressed by one of the properties). A simplified sketch of how this can work appears at the end of this post.
  • We only add fluent methods for the properties we are actually using in our tests at that point in time – this means we don’t have dead methods lying around in the builders and initially they are very terse.
  • We have a base class that defines a lot of the common infrastructure behind defining a builder (including the ability to return a proxy object and the ability to create a list of builders using NBuilder – I’ve open sourced the code as NTestDataBuilder).

This generally means that we can be up and running with an Object Mother and builder for a new entity within a minute, and it’s well worth doing it from the start with all our objects to keep the codebase consistent and ensure all tests are maintainable, terse and readable.

Here is an example of what I mean from the NTestDataBuilder library I have released:

class CustomerBuilder : TestDataBuilder<Customer, CustomerBuilder>
{
    public CustomerBuilder()
    {
        WithFirstName("Rob");
        WithLastName("Moore");
        WhoJoinedIn(2013);
    }

    public CustomerBuilder WithFirstName(string firstName)
    {
        Set(x => x.FirstName, firstName);
        return this;
    }

    public CustomerBuilder WithLastName(string lastName)
    {
        Set(x => x.LastName, lastName);
        return this;
    }

    public CustomerBuilder WhoJoinedIn(int yearJoined)
    {
        Set(x => x.YearJoined, yearJoined);
        return this;
    }

    protected override Customer BuildObject()
    {
        return new Customer(
            Get(x => x.FirstName),
            Get(x => x.LastName),
            Get(x => x.YearJoined)
        );
    }
}
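
To give an idea of how the Set and Get calls above can work, here is a simplified sketch of a dictionary-backed base class along the lines described earlier. This is illustrative only and is not the actual NTestDataBuilder implementation – among other things it omits the proxy and list-generation support.

using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public abstract class TestDataBuilder<TObject, TBuilder>
    where TObject : class
    where TBuilder : TestDataBuilder<TObject, TBuilder>
{
    // Property values are stored in a dictionary keyed by property name rather than
    // in individual private fields on each concrete builder
    private readonly IDictionary<string, object> _properties = new Dictionary<string, object>();

    public TObject Build()
    {
        return BuildObject();
    }

    protected abstract TObject BuildObject();

    protected void Set<TValue>(Expression<Func<TObject, TValue>> property, TValue value)
    {
        _properties[GetPropertyName(property)] = value;
    }

    protected TValue Get<TValue>(Expression<Func<TObject, TValue>> property)
    {
        object value;
        return _properties.TryGetValue(GetPropertyName(property), out value)
            ? (TValue)value
            : default(TValue);
    }

    private static string GetPropertyName<TValue>(Expression<Func<TObject, TValue>> property)
    {
        // Assumes a simple property access lambda like x => x.FirstName
        return ((MemberExpression)property.Body).Member.Name;
    }
}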