Azure Resource Manager intro presentation and workshop

I attended the Azure Saturday event here in Perth last weekend. Matt and I did a basic intro presentation on Azure Resource Manager and ran an associated workshop, which we have published to our GitHub organisation.

Azure Resource Manager is one of the most important things to understand about Azure if you plan on using it since it’s the platform that underpins the provisioning and management of all resources in Azure going forward.

Azure Saturday Perth 2015 presentation

Automating Azure Resource Manager

I’ve recently been (finally) getting up to speed with Azure Resource Manager (ARM). It’s the management layer that drives the new Azure Portal and also features like Resource Groups and Role-Based Access Control.

You can interact with ARM in a number of ways:

To authenticate to the ARM API you need to use an Azure AD credential. This is all well and good if you are logged into the Portal, or running a script on your computer (where a web browser login prompt to Azure AD will pop up), but when automating your API calls that’s not available.

Luckily there is a post by David Ebbo that describes how to generate a Service Principal (equivalent of the concept of an Active Directory Service Account) attached to an Azure AD application.

The only problem with this post is that there are a few manual steps and it’s quite fiddly to do (by David’s own admission). I’ve developed a PowerShell module that you can use to idempotently create a Service Principal against either an entire Azure subscription or against a specific Resource Group that you can then use to automate your ARM code.

I’ve published the code to GitHub.

In order to use it you need to:

  1. Ensure you have the Windows Azure PowerShell commandlets installed
  2. Download the Set-ARMServicePrincipalCredential.psm1 file from my GitHub repository
  3. Download the Azure Key Vault PowerShell commandlets and put the AADGraph.ps1 file next to the file from GitHub
  4. Execute the Set-ARMServicePrincipalCredential command as per the examples on GitHub

This will pop up a web browser prompt to authenticate (this will happen twice since I’m using two disjointed libraries – hopefully this will get resolved if Azure AD commandlets end up becoming integrated with the Azure commandlets) and give you the following information:

  • Tenant ID
  • Client ID
  • Password

From there you have all the information you need to authenticate your automated script with ARM.

If using PowerShell then this will look like:

    $securePassword = ConvertTo-SecureString $Password -AsPlainText -Force
    $servicePrincipalCredentials = New-Object System.Management.Automation.PSCredential ($ClientId, $securePassword)
    Add-AzureAccount -ServicePrincipal -Tenant $TenantId -Credential $servicePrincipalCredentials | Out-Null

If using ARMClient then this will look like:

    armclient spn $TenantId $ClientId $Password | Out-Null

One last note: make sure you store the password securely when automating the script, e.g. TeamCity password, Bamboo password or Octopus sensitive variable.

Testing AngularJS directives using Approval Tests

I recently had an application I was developing using AngularJS that contained a fair number of directives that were somewhat complex in that the logic that backed them was contained in services that called HTTP APIs. The intent was to provide a single JavaScript file that designers at the company I was working at could include and then build product pages using just HTML (via the directives). I needed to provide some confidence when making changes to the directives and pin down the behaviour.

As explained below, I ended up doing this via approval tests and I’ve published how I did it on GitHub.

Why I wanted to use Approval Tests

In order to test these directives I didn’t want to have to write tedious DOM inspection code to determine if the directives did what I wanted. Most AngularJS directive testing examples you will find on the Internet tell you to do this though, including the official documentation.

Side note: in my research I stumbled across the ng-directive-testing library, which I feel is an improvement over most example code out there and if you do want to inspect the DOM as part of your testing I recommend you check it out.

This style of testing works fine for small, simple directives, but I felt it would be tedious to write and fragile for my use case. Instead, I had an idea that I wanted to apply the approval tests technique.

I use this technique when I have a blob of JSON, XML, HTML, text etc. that I want to verify is what I expect and pin it down without having to write tedious assertions against every aspect of it – hence this technique fitted in perfectly with what I wanted to achieve with testing the directives.

How I did it

Given that directives need the DOM it was necessary to run the tests in a web browser. In this case I decided to do it via Karma since I was already using NodeJS to uglify the JavaScript.

ApprovalTests requires access to the filesystem in order to write the approval files and then access to open processes on the computer to pop open a diff viewer if there is a difference in the output. This is not possible from the web browser. Thus, even though there is a JavaScript port of ApprovalTests (for NodeJS) I wasn’t able to use it directly in my tests.

While contemplating my options, it occurred to me I could spin up a NodeJS server to run the approvals code and simply call it from the browser – it’s not much different to how Karma gets test results. After that realisation I stumbled across approvals-server – someone had already implemented it! Brilliant!

From there it was simply a matter of stitching up the code to all work together – in my case using Grunt as the Task Runner.

Example code

To that end, I have published a repository with a contrived example that demonstrates how to test a directive using Approval Tests.

The main bits to look at are:

  • gruntfile.js – contains the grunt configuration I used including my Grunt tasks for the approval server, which probably should be split into a separate file or published to npm (feel free to send me a PR)
  • app/spec/displayproducts.directive.spec.js – contains the example test in all its glory
  • app/test-helpers/approvals/myapp-display-products-should-output-product-information.approved.txt – the approval file for the example test
  • app/test-helpers/approvals.js – the code to get the name of the currently executing Jasmine 2 test and the code to send an approval to the approval server
  • app/test-helpers/heredoc.js – a heredoc function to allow for easy specification of multi-line markup
  • app/test-helpers/directives.js – the test code that compiles the directive, cleans it up for a nice diff and passes it to be verified

Notable bits

Style guide

If you are curious about why I wrote my Angular code the way I have it’s because I’m following John Papa’s AngularJS style guide, which I think is very good and greatly improves maintainability of the resulting code.

Taming karma

I managed to get the following working for Karma:

  • Watch build that runs tests whenever a file changes – see the karma:watch and dev tasks
  • Default build including tests – see the karma:myApp and default tasks
  • A build that pops up a Chrome window to allow for debugging – see the karma:debug and debugtests tasks

Simultaneous approval server runs

I managed to allow for the dev task to be running while running default by including the isPortTaken code to determine if the approvals server port is already taken.

Side note: if you are using this code across multiple projects consecutively then be careful because the approval server might be running from the other project. A way to avoid this would be to change the port per project (in both gruntfile.js and approvals.js).
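
A port check of this kind can be sketched in NodeJS roughly as follows. This is a minimal illustration rather than the exact code in gruntfile.js: it assumes the check works by trying to listen on the port and treating an EADDRINUSE error as "taken", and the callback signature shown here is my own choice.

```javascript
// Sketch only: reports whether a TCP port is already in use by
// attempting to listen on it with a throwaway server.
var net = require('net');

function isPortTaken(port, callback) {
    var tester = net.createServer();

    tester.once('error', function (err) {
        if (err.code === 'EADDRINUSE') {
            callback(null, true);   // something else is already listening
        } else {
            callback(err);          // unexpected failure, pass it on
        }
    });

    tester.once('listening', function () {
        // The port was free; release it again before reporting back.
        tester.close(function () {
            callback(null, false);
        });
    });

    tester.listen(port);
}
```

A task that starts the approval server could call this first and skip starting a second instance when the callback reports true.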

Improved approval performance on Windows

I found that the performance of the approvals library was very slow on Windows, but with some assistance from the maintainers I worked out what the cause was and submitted a pull request. The version in npm has been updated, but there are currently no updates to approvals-server to use it. To overcome this I have used the npm-shrinkwrap.json file to override the version of the approvals library.

Get currently running test name in Jasmine 2

I wanted the approval test output file to be automatically derived from the currently-running test name (similar to what happens on .NET). It turns out that this is a lot harder to achieve in Jasmine 2, but with some Googling/StackOverflowing I managed to get it working as per the code in the approvals.js file.
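
One way to do this, sketched here under some assumptions, is to register a Jasmine 2 reporter whose specStarted callback records the fullName of the running spec. The helper names (createSpecNameTracker, toApprovalFileName) are hypothetical and not taken from approvals.js, and the file-naming rule is a guess based on the approved file name listed earlier.

```javascript
// Hypothetical helper: remembers which spec Jasmine 2 is currently running.
function createSpecNameTracker() {
    var current = null;
    return {
        // Jasmine 2 calls specStarted(result) on each registered reporter;
        // result.fullName is the describe text(s) followed by the it text.
        reporter: {
            specStarted: function (result) { current = result.fullName; },
            specDone: function () { current = null; }
        },
        currentSpecName: function () { return current; }
    };
}

// Assumed naming rule: lower-case, with runs of other characters
// collapsed into single hyphens.
function toApprovalFileName(fullName) {
    return fullName.toLowerCase()
        .replace(/[^a-z0-9]+/g, '-')
        .replace(/^-+|-+$/g, '');
}

var tracker = createSpecNameTracker();
// Only register when running under Jasmine (e.g. inside Karma).
if (typeof jasmine !== 'undefined') {
    jasmine.getEnv().addReporter(tracker.reporter);
}
```

Inside a test, tracker.currentSpecName() then gives the name to pass along with the approval request.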

Cleaning up the output markup for a good diff

AngularJS leaves a bunch of stuff in the resulting markup such as HTML comments, superfluous attributes and class names, etc. To remove all of this so the approved file is clean, and to ensure the whitespace in the output is both easy to read and the same no matter what browser is being used, I apply some modifications to the markup, as seen in directives.js.
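
To give a flavour of the kind of clean-up involved, here is a simplified, string-based sketch. It is not the code from directives.js: the function name cleanMarkup is hypothetical, the list of class names to strip is an assumption, and it flattens indentation rather than re-indenting the output.

```javascript
// Sketch only: normalises compiled directive markup so it diffs cleanly.
function cleanMarkup(html) {
    return html
        // Drop HTML comments such as <!-- ngRepeat: ... -->
        .replace(/<!--[\s\S]*?-->/g, '')
        // Drop class names AngularJS adds; remove the attribute if none remain
        .replace(/\sclass="([^"]*)"/g, function (match, classNames) {
            var kept = classNames.split(/\s+/).filter(function (name) {
                return name && !/^ng-(scope|binding|isolate-scope)$/.test(name);
            });
            return kept.length ? ' class="' + kept.join(' ') + '"' : '';
        })
        // Normalise whitespace: trim each line and drop empty lines
        .split(/\r?\n/)
        .map(function (line) { return line.trim(); })
        .filter(function (line) { return line.length > 0; })
        .join('\n');
}
```

Because the result no longer depends on how a particular browser serialises whitespace, the same approved file can be used regardless of which browser Karma launches.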

Easily specifying multi-line test markup

I pulled in a heredoc function I found on StackOverflow as seen in heredoc.js and used in the example test, e.g.:

DirectiveFixture.verify(heredoc(function () {/*    
    <myapp-display-products category="car" product="car">
        <div>{{car.name}}</div>
    </myapp-display-products>
*/}));

This is much nicer than having to concatenate one string per line or append a \ character at the end of each line, both of which aren’t handled nicely by the IDE I’m using.
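
For reference, a heredoc function of this kind is usually only a few lines long. The sketch below is my own minimal version rather than the exact contents of heredoc.js; it relies on Function.prototype.toString returning the function source with its comment intact, which will not hold if the test code is run through a minifier that strips comments.

```javascript
// Sketch only: returns the text of the first block comment inside fn.
function heredoc(fn) {
    var match = fn.toString().match(/\/\*([\s\S]*?)\*\//);
    return match ? match[1] : '';
}

var markup = heredoc(function () {/*
    <myapp-display-products category="car" product="car">
        <div>{{car.name}}</div>
    </myapp-display-products>
*/});
```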

Announcing TestStack.Dossier library

I’m pleased to announce the addition of a (somewhat) new library to the TestStack family called TestStack.Dossier. I say somewhat new because it’s a version 2 of an existing library that I published called NTestDataBuilder. If you hadn’t already heard about that library here is the one liner (which has only changed slightly with the rename):

TestStack.Dossier provides you with the code infrastructure to easily and quickly generate test fixture data for your automated tests in a terse, readable and maintainable way using the Test Data Builder, anonymous value and equivalence class patterns.

The release of TestStack.Dossier culminates a few months of (off and on) work by myself and fellow TestStacker Michael Whelan to bring a range of enhancements. The library itself is very similar to NTestDataBuilder, but there are some minor breaking changes. I decided to reduce confusion by keeping the version number consistent between libraries so TestStack.Dossier starts at version 2.0.

So why should I upgrade to v2 anyway?

There is more to TestStack.Dossier v2 than just a name change, a lot more. I’ve taken my learnings (and frustrations) from a couple of years of usage of the library into account to add in a bunch of improvements and new features that I’m really excited about!

Side note: my original post on combining the test data builder pattern with the object mother pattern and follow-up presentation still holds very true – this combination of patterns has been invaluable and has led to terser, more readable tests that are easier to maintain. I still highly recommend this approach (I use TestStack.Dossier, formerly NTestDataBuilder, for the test data builder part).

Anonymous value support

As explained in my anonymous variables post (TBW(ritten) – future proofing this post, or setting myself up for disappointment :P) in my automated testing series, the anonymous variable pattern is a good pattern to use when you want to use values in your tests whose exact value isn’t significant. By including a specific value you are making it look like that value is important in some way – stealing cognitive load from the test reader while they figure out the value in fact doesn’t matter.

This is relevant when defining a test data builder because of the initial values that you set the different parameters to by default. For instance, the example code for NTestDataBuilder on the readme had something like this:

class CustomerBuilder : TestDataBuilder<Customer, CustomerBuilder>
{
    public CustomerBuilder()
    {
        WithFirstName("Rob");
        WithLastName("Moore");
        WhoJoinedIn(2013);
    }

    public CustomerBuilder WithFirstName(string firstName)
    {
        Set(x => x.FirstName, firstName);
        return this;
    }

    ...
}

In that case the values "Rob", "Moore" and 2013 look significant on initial inspection. In reality it doesn’t matter what they are; any test where those values matter should specify them to make the intent clear.

One of the changes we have made for v2 is to automatically generate an anonymous value for each requested value (using Get) if none has been specified for it (using Set). This not only allows you to get rid of those insignificant values, but it allows you to trim down the constructor of your builder – making the builders terser and quicker to write.

Given we aren’t talking about variables but rather values I have thus named the pattern anonymous values rather than anonymous variables.

There are a number of default conventions that are followed to determine what value to use via the new Anonymous Value Fixture class. This works through the application of anonymous value suppliers – which are processed in order to determine if a value can be provided and if so a value is retrieved. At the time of writing the default suppliers are the following (applied in this order):

  • DefaultEmailValueSupplier – Supplies an email address for all string properties with a property name containing email
  • DefaultFirstNameValueSupplier – Supplies a first name for all string properties with a property name containing firstname (case insensitive)
  • DefaultLastNameValueSupplier – Supplies a last name for all string properties with a property name containing lastname or surname (case insensitive)
  • DefaultStringValueSupplier – Supplies the property name followed by a random GUID for all string properties
  • DefaultValueTypeValueSupplier – Supplies an AutoFixture generated value for any value types (e.g. int, double, etc.)
  • DefaultValueSupplier – Supplies default(T)

This gets you started for the most basic of cases, but from there you have a lot of flexibility to apply your own suppliers on both a global basis (via AnonymousValueFixture.GlobalValueSuppliers) and a local basis for each fixture instance (via fixture.LocalValueSuppliers) – you just need to implement IAnonymousValueSupplier. See the tests for examples.

Equivalence classes support

As explained in my equivalence classes and constrained non-determinism post (TBW) in my automated testing series, the principle of constrained non-determinism frees you from having to worry about the fact that anonymous values can be random, as long as they fall within the equivalence class of the value that is required for your test.

I think the same concept can and should be applied to test data builders. More than that, I think it enhances the ability for the test data builders to act as documentation. Having a constructor that reads like this for instance tells you something interesting about the Year property:

class CustomerBuilder : TestDataBuilder<Customer, CustomerBuilder>
{
    public CustomerBuilder()
    {
        WhoJoinedIn(Any.YearAfter2001());
    }

    ...
}

You may well use value objects that protect and describe the integrity of the data (which is great), but you can still create an equivalence class for the creation of the value object so I still think it’s relevant beyond primitives.

We have some built-in equivalence classes that you can use to get started quickly for common scenarios. At the time of writing the following are available (as extension methods of the AnonymousValueFixture class that is defined in a property called Any on the test data builder base class):

  • Any.String()
  • Any.StringMatching(string regexPattern)
  • Any.StringStartingWith(string prefix)
  • Any.StringEndingWith(string suffix)
  • Any.StringOfLength(int length)
  • Any.PositiveInteger()
  • Any.NegativeInteger()
  • Any.IntegerExcept(int[] exceptFor)
  • Any.Of<TEnum>()
  • Any.Except<TEnum>(TEnum[] except)
  • Any.EmailAddress()
  • Any.UniqueEmailAddress()
  • Any.Language()
  • Any.FemaleFirstName()
  • Any.MaleFirstName()
  • Any.FirstName()
  • Any.LastName()
  • Any.Suffix()
  • Any.Title()
  • Any.Continent()
  • Any.Country()
  • Any.CountryCode()
  • Any.Latitude()
  • Any.Longitude()

There is nothing stopping you using the anonymous value fixture outside of the test data builders – you can create a property called Any that is an instance of the AnonymousValueFixture class in any test class.

Also, you can easily create your own extension methods for the values and data that makes sense for your application. See the source code for examples to copy. A couple of notes: you have the ability to stash information in the fixture by using the dynamic Bag property and you also have an AutoFixture instance available to use via Fixture.

Side note: I feel that Dossier does some things that are not easy to do in AutoFixture, hence why I don’t “just use AutoFixture” – I see Dossier as complementary to AutoFixture because they are trying to achieve different (albeit related) things.

A final note: I got the idea for the Any.Whatever() syntax from the TDD Toolkit by Grzegorz Gałęzowski. I really like it and I highly recommend his TDD e-book.

Return Set rather than this

This is a small, but important optimisation that allows test data builders to be that little bit terser and easier to read/write. The Set method now returns the builder instance so you can change your basic builder modification methods like in this example:

// Before
public CustomerBuilder WithLastName(string lastName)
{
    Set(x => x.LastName, lastName);
    return this;
}

// After
public CustomerBuilder WithLastName(string lastName)
{
    return Set(x => x.LastName, lastName);
}

Amazingly terse list of object generation

This is by far the part that I am most proud of. I’ve long been frustrated (relatively speaking, I thought what I had in the first version was very cool and useful) with the need for writing the lambda expressions when building a list of objects, e.g.:

var customers = CustomerBuilder.CreateListOfSize(3)
    .TheFirst(1).With(b => b.WithFirstName("Robert").WithLastName("Moore"))
    .TheLast(1).With(b => b.WithEmail("matt@domain.tld"))
    .BuildList();

I always found that the need to have the With made it a bit more verbose than I wanted (since it was basically noise) and I found that needing to write the lambda expression slowed me down. I dreamed of having a syntax that looked like this:

var customers = CustomerBuilder.CreateListOfSize(3)
    .TheFirst(1).WithFirstName("Robert").WithLastName("Moore")
    .TheLast(1).WithEmail("matt@domain.tld")
    .BuildList();

Well, one day I had a brainwave on how that may be possible and I went and implemented it. I won’t go into the details apart from saying that I used Castle Dynamic Proxy to do the magic (and let’s be honest it’s magic) and you can check out the code if interested. I’m hoping this won’t come back to bite me, because I’ll freely admit that this adds complexity to the code for creating lists; you can have an instance of a builder that isn’t an instance of a real builder, but rather a proxy object that will apply the call to part of a list of builders (see what I mean about complex?). My hope is that the simplicity and niceness of using the API outweighs the confusion / complexity and that you don’t really have to understand what’s going on under the hood if it “just works”™.

If you don’t want to risk it that’s fine, there is still a With method that takes a lambda expression so you can freely avoid the magic.

The nice thing about this is I was able to remove NBuilder as a dependency and you no longer need to create an extension method for each builder to have a BuildList method that doesn’t require you to specify the generic types.

Why did you move to TestStack and why is it now called Dossier?

I moved the library to TestStack because it’s a logical fit – the goal that we have at TestStack is to make it easier to perform automated testing in the .NET ecosystem – that’s through and through what this library is all about.

As to why I changed the name to Dossier – most of the libraries that we have in TestStack have cool/quirky names that are relevant to what they do (e.g. Seleno, BDDfy). NTestDataBuilder is really boring so with a bit of a push from my colleagues I set about finding a better name. I found Dossier by Googling for synonyms of data and out of all the words dossier stood out as the most interesting. I then asked Google what the definition was to see if it made sense and lo and behold, the definition is strangely appropriate (person, event, subject being examples of the sorts of objects I tend to build with the library):

a collection of documents about a particular person, event, or subject

Mundane stuff

The GitHub repository has been moved to https://github.com/TestStack/TestStack.Dossier/ and the previous URL will automatically redirect to that address. I have released an empty v2.0 NTestDataBuilder release to NuGet that simply includes TestStack.Dossier as a dependency so you can do an Update-Package on it if you want (but will then need to address the breaking changes).

If you have an existing project that you don’t want to have to change for the breaking changes then feel free to continue using NTestDataBuilder v1 – for the featureset that was in it I consider that library to be complete and there weren’t any known bugs in it. I will not be adding any changes to that library going forward though.

As usual you can grab this library from NuGet.

Whitepaper: Managing Database Schemas in a Continuous Delivery World

A whitepaper I wrote for my employer, Readify, just got published. Feel free to check it out. I’ve included the abstract below.

One of the trickier technical problems to address when moving to a continuous delivery development model is managing database schema changes. It’s much harder to roll back or roll forward database changes compared to software changes since a database by definition has state. Typically, organisations address this problem by having database administrators (DBAs) manually apply changes so they can manually correct any problems, but this has the downside of creating a bottleneck to deploying changes to production and also introduces human error as a factor.

A large part of continuous delivery involves the setup of a largely automated deployment pipeline that increases confidence and reduces risk by ensuring that software changes are deployed consistently to each environment (e.g. dev, test, prod).

To fit in with that model it’s important to automate database changes so that they are applied automatically and consistently to each environment thus increasing the likelihood of problems being found early and reducing the risk associated with database changes.

This report outlines an approach to managing database schema changes that is compatible with a continuous delivery development model, the reasons why the approach is important and some of the considerations that need to be made when taking this approach.

The approaches discussed in this document aren’t specific to continuous delivery and in fact should be considered regardless of your development model.

Review of: Jimmy Bogard – Holistic Testing

This post discusses the talk “Holistic Testing” by Jimmy Bogard, which was given in June 2013. See my introduction post to get the context behind this post and the other posts I have written in this series.

I really resonate with the points raised by Jimmy since I’ve been using a lot of similar techniques recently. In this article I outline how I’ve been using the techniques talked about by Jimmy (including code snippets for context).

Overview

In this insightful presentation Jimmy outlines the testing strategy that he tends to use for the projects he works on. He covers the level that he tests from, the proportion of the different types of tests he writes and the intimate technical detail of how he implements the tests. Like Ian Cooper, Jimmy likes writing his unit tests from a relatively high level in his application; specifically, he said that he likes the definition of unit test to be:

“Units of behaviour, isolated from other units of behaviour”

Code coverage and shipping code

“The ultimate goal here is to ship code it’s not to write tests; tests are just a means to the end of shipping code.”

“I can have 100% code coverage and have no one use my product and I can have 0% code coverage and it’s a huge success; there is no correlation between the two things.”

Enough said.

Types of tests

Jimmy breathes a breath of fresh air when throwing away the testing pyramid (and all the conflicting definitions of unit, integration, etc. tests) in favour of a pyramid that has a small number of “slow as hell tests”, a slightly larger number of “slow” tests and a lot of “fast” tests.

This takes away the need to classify if a test is unit, integration or otherwise and focuses on the important part – how fast can you get feedback from that test. This is something that I’ve often said – there is no point in distinguishing between unit and integration tests in your project until the moment that you need to separate out tests because your feedback cycle is too slow (which will take a while in a greenfield project).

It’s worth looking at the ideas expressed by Sebastien Lambla on Vertical Slice Testing (VEST), which provides another interesting perspective in this area by turning your traditionally slow “integration” tests into fast in-memory tests. Unfortunately, the idea seems to be fairly immature and there isn’t a lot of support for this type of approach.

Mocks

Similar to the ideas expressed by Ian Cooper, Jimmy tells us not to mock internal implementation details (e.g. collaborators passed into the constructor) and indicates that he rarely uses mocks. In fact he admitted that he would rather make the process of using mocks more painful by hand rolling them, to discourage their use unless it’s necessary.

Jimmy says that he creates “seams” for the things he can’t control or doesn’t own (e.g. webservices, databases, etc.) and then mocks those seams when writing his test.

The cool thing about hand-rolled mocks that I’ve found is that you can codify real-like behaviour (e.g. interactions between calls and real-looking responses) and contain that behaviour in one place (helping to form documentation about how the thing being mocked works). These days I tend to use a combination of hand-rolled mocks for some things and NSubstitute for others. I’ll generally use hand-rolled mocks when I want to codify behaviour or if I want to provide a separate API surface area to interact with the mock e.g.:

// Interface
public interface IDateTimeProvider
{
    DateTimeOffset Now();
}

// Production code
public class DateTimeProvider : IDateTimeProvider
{
    public DateTimeOffset Now()
    {
        return DateTimeOffset.UtcNow;
    }
}

// Hand-rolled mock
public class StaticDateTimeProvider : IDateTimeProvider
{
    private DateTimeOffset _now;

    public StaticDateTimeProvider()
    {
        _now = DateTimeOffset.UtcNow;
    }

    public StaticDateTimeProvider(DateTimeOffset now)
    {
        _now = now;
    }

    // This one is good for data-driven tests that take a string representation of the date
    public StaticDateTimeProvider(string now)
    {
        _now = DateTimeOffset.Parse(now);
    }

    public DateTimeOffset Now()
    {
        return _now;
    }

    public StaticDateTimeProvider SetNow(string now)
    {
        _now = DateTimeOffset.Parse(now);
        return this;
    }

    public StaticDateTimeProvider MoveTimeForward(TimeSpan amount)
    {
        _now = _now.Add(amount);
        return this;
    }
}

Container-driven unit tests

One of the most important points that Jimmy raises in his talk is that he uses his DI container to resolve dependencies in his “fast” tests. This makes a lot of sense because it allows you to:

  • Prevent implementation detail leaking into your test by resolving the component under test and all of its real dependencies without needing to know the dependencies
  • Mimic what happens in production
  • Easily provide mocks for those things that do need mocks without needing to know what uses those mocks

Container initialisation can be (relatively) slow so in order to ensure this cost is incurred once you can simply set up a global fixture or static instance of the initialised container.

The other consideration is how to isolate the container across test runs – if you modify a mock for instance then you don’t want that mock to be returned in the next test. Jimmy overcomes this by using child containers, which he has separately blogged about.

The other interesting thing that Jimmy does is use an extension of AutoFixture’s AutoDataAttribute attribute to resolve parameters to his test method from the container. It’s pretty nifty and explained in more detail by Sebastian Weber.

I’ve recently used a variation of the following test fixture class (in my case using Autofac):

public static class ContainerFixture
{
    private static readonly IContainer Container;

    static ContainerFixture()
    {
        Container = ContainerConfig.CreateContainer(); // This is what my production App_Start calls
        AppDomain.CurrentDomain.DomainUnload += (sender, args) => Container.Dispose();
    }

    public static ILifetimeScope GetTestLifetimeScope(Action<ContainerBuilder> modifier = null)
    {
        return Container.BeginLifetimeScope(MatchingScopeLifetimeTags.RequestLifetimeScopeTag, cb => {
            ExternalMocks(cb);
            if (modifier != null)
                modifier(cb);
        });
    }

    private static void ExternalMocks(ContainerBuilder cb)
    {
        cb.Register(_ => new StaticDateTimeProvider(DateTimeOffset.UtcNow.AddMinutes(1)))
            .AsImplementedInterfaces()
            .AsSelf()
            .InstancePerTestRun();
        // Other overrides of externals to the application ...
    }
}

public static class RegistrationExtensions
{
    // This extension method makes the registrations in the ExternalMocks method clearer in intent - I create a HTTP request lifetime around each test since I'm using my container in a web app
    public static IRegistrationBuilder<TLimit, TActivatorData, TStyle> InstancePerTestRun
        <TLimit, TActivatorData, TStyle>(this IRegistrationBuilder<TLimit, TActivatorData, TStyle> registration,
            params object[] lifetimeScopeTags)
    {
        return registration.InstancePerRequest(lifetimeScopeTags);
    }
}

Isolating the database

Most applications that I come across will have a database of some sort. Including a database connection usually means out of process communication and this likely turns your test from “fast” to “slow” in Jimmy’s terminology. It also makes it harder to write a good test since databases are stateful and thus we need to isolate tests from each other. It’s often difficult to run tests in parallel against the same database as well.

There are a number of ways of dealing with this, which Jimmy outlined in his talk and also on his blog:

  1. Use a transaction and roll back at the end of the test. The tricky thing here is making sure that you simulate multiple requests – you need to make sure that your seeding, work and verification all happen separately, otherwise your ORM caching might give you a false positive. I find this to be quite an effective strategy and it’s what I’ve used for years now in various forms.
    • One option is to use TransactionScope to transparently initiate a transaction and roll it back at the end; this allows multiple database connections to connect to the database, and you can have real, committed transactions that will then get rolled back. The main downsides are that you need MSDTC enabled on all dev machines and your CI server agents and you can’t run tests in parallel against the same database.
    • Another option is to initiate a single connection with a transaction and then to reuse that connection across your ORM contexts – this allows you to avoid MSDTC and run tests in parallel, but it also means you can’t use explicit transactions in your code (or you need to make them no-ops for your test code) and it’s not possible with all ORMs. I can’t claim credit for this idea – I was introduced to it by Jess Panni and Matt Davies.
    • If your ORM doesn’t support attaching multiple contexts to a single connection with an open transaction (hi NHibernate!) then another option would be to clear the cache after seeding and after work. This has the same advantages and disadvantages as the previous point.
  2. Drop/recreate the database each test run.
    • The most practical way to do this is to use some sort of in-memory variation e.g. sqlite in-memory, Raven in-memory, Effort for Entity Framework and the upcoming Entity Framework 7 in-memory provider
      • This has the advantage of working in-process and thus you might be able to make the test a “fast” test
      • This allows you to run tests in parallel and isolated from each other by wiping the database every test run
      • The downside is the database is very different from your production database and in fact might not have some features your code needs
      • Also, it might be difficult to migrate the database to the correct schema (e.g. sqlite doesn’t support ALTER statements) so you are stuck with getting your ORM to automatically generate the schema for you rather than testing your migrations
      • Additionally, it can actually be quite slow to regenerate the schema every test run as the complexity of your schema grows
  3. Delete all of the non-seed data in the database every test run – this can be quite tricky to get right without violating foreign keys, but Jimmy has some clever SQL scripts for it (in the above-linked article) and finds that it’s quite a fast option.
  4. Ensure that the data being entered by each test will necessarily be isolated from other test runs (e.g. random GUID ids etc.) – the danger here is that it can get quite complex to keep on top of this and it’s likely your tests will be fragile – I generally wouldn’t recommend this option.
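As a rough sketch of the TransactionScope variation from option 1, a test base class can open an ambient transaction before each test and dispose it without ever calling Complete(), which rolls back whatever the test did. TransactionalTest is a hypothetical name, and the sketch assumes a test framework (such as xUnit) that constructs and disposes the test class around each test.

```csharp
using System;
using System.Transactions;

// Hypothetical base class: wraps each test in an ambient transaction that is never committed.
public abstract class TransactionalTest : IDisposable
{
    private readonly TransactionScope _scope;

    protected TransactionalTest()
    {
        // TransactionScopeAsyncFlowOption.Enabled lets the ambient transaction flow across awaits
        _scope = new TransactionScope(
            TransactionScopeOption.Required,
            new TransactionOptions { IsolationLevel = IsolationLevel.ReadCommitted },
            TransactionScopeAsyncFlowOption.Enabled);
    }

    public void Dispose()
    {
        // Complete() is deliberately never called, so disposing the scope rolls everything back
        _scope.Dispose();
    }
}
```

Database connections opened while the scope is active will typically enlist in it automatically; once more than one connection is involved the transaction may be promoted to a distributed one, which is where the MSDTC requirement comes from.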

I generally find that database integration tests are reasonably fast (after the initial spin-up time for EntityFramework or NHibernate). For instance, in a recent project the first database test would take 26s and subsequent tests took ~15ms for a test with an empty database query, ~30-50ms for a fairly basic query test with populated data and 100-200ms for a more complex test with a lot more database interaction.

In some cases I will write all of my behavioural tests touching the database because testing against a production-like database, with the real SQL being issued against the real migrations, is incredibly valuable in terms of confidence. If you are using a DI container in your tests I’m sure that it would be possible to run the test suite in two different modes – one with an in-memory variant and parallelisation to get fast feedback and one with full database integration for full confidence. If you had a project that was big enough that the feedback time was getting too long, investigating this type of approach would be worth it – I personally haven’t found a need yet.

I’ve recently been using variations on this fixture class to set up the database integration for my tests using Entity Framework:

public class DatabaseFixture : IDisposable
{
    private readonly MyAppContext _parentContext;
    private readonly DbTransaction _transaction;

    static DatabaseFixture()
    {
        var testPath = Path.GetDirectoryName(typeof (DatabaseFixture).Assembly.CodeBase.Replace("file:///", ""));
        AppDomain.CurrentDomain.SetData("DataDirectory", testPath); // For localdb connection string that uses |DataDirectory|
        using (var migrationsContext = new MyAppContext())
        {
            migrationsContext.Database.Initialize(false); // Performs EF migrations
        }
    }

    public DatabaseFixture()
    {
        _parentContext = new MyAppContext();
        _parentContext.Database.Connection.Open(); // This could be a simple SqlConnection if using sql express, but if using localdb you need a context so that EF creates the database if it doesn't exist (thanks EF!)
        _transaction = _parentContext.Database.Connection.BeginTransaction();

        SeedDbContext = GetNewDbContext();
        WorkDbContext = GetNewDbContext();
        VerifyDbContext = GetNewDbContext();
    }

    public MyAppContext SeedDbContext { get; private set; }
    public MyAppContext WorkDbContext { get; private set; }
    public MyAppContext VerifyDbContext { get; private set; }

    private MyAppContext GetNewDbContext()
    {
        var context = new MyAppContext(_parentContext.Database.Connection);
        context.Database.UseTransaction(_transaction);
        return context;
    }

    public void Dispose()
    {
        SeedDbContext.Dispose();
        WorkDbContext.Dispose();
        VerifyDbContext.Dispose();
        _transaction.Dispose(); // Discard any inserts/updates since we didn't commit
        _parentContext.Dispose();
    }
}

Subcutaneous testing

It’s pretty well known/documented that UI tests are slow and, unless you get them right, brittle. Most people recommend that you only test happy paths. Jimmy classifies UI tests in his “slow as hell” category and also recommends only testing (important) happy paths. I like to recommend that UI tests are used for high value scenarios (such as a user performing the primary action that makes you money), functionality that continually breaks and can’t be adequately covered with other tests, or complex UIs.

Subcutaneous tests allow you to get a lot of the value from UI tests in that you are testing the full stack of your application with its real dependencies (apart from those external to the application like web services), but without the cost of talking to a fragile and slow UI layer. These kinds of tests are what Jimmy classifies as “slow”, and will include integration with the database as outlined in the previous sub-section.

In his presentation, Jimmy suggests that he writes subcutaneous tests against the command/query layer (if you are using CQS). I’ve recently used subcutaneous tests from the MVC controller using a base class like this:

public abstract class SubcutaneousMvcTest<TController> : IDisposable
    where TController : Controller
{
    private DatabaseFixture _databaseFixture;
    private readonly HttpSimulator _httpRequest;
    private readonly ILifetimeScope _lifetimeScope;

    protected TController Controller { get; private set; }
    protected ControllerResultTest<TController> ActionResult { get; set; }
    protected MyAppContext SeedDbContext { get { return _databaseFixture.SeedDbContext; } }
    protected MyAppContext VerifyDbContext { get { return _databaseFixture.VerifyDbContext; } }

    protected SubcutaneousMvcTest()
    {
        _databaseFixture = new DatabaseFixture();
        _lifetimeScope = ContainerFixture.GetTestLifetimeScope(cb =>
            cb.Register(_ => _databaseFixture.WorkDbContext).AsSelf().AsImplementedInterfaces().InstancePerTestRun());
        var routes = new RouteCollection();
        RouteConfig.RegisterRoutes(routes); // This is what App_Start calls in production
        _httpRequest = new HttpSimulator().SimulateRequest(); // Simulates HttpContext.Current so I don't have to mock it
        Controller = _lifetimeScope.Resolve<TController>(); // Resolve the controller with real dependencies via ContainerFixture
        Controller.ControllerContext = new ControllerContext(new HttpContextWrapper(HttpContext.Current), new RouteData(), Controller);
        Controller.Url = new UrlHelper(Controller.Request.RequestContext, routes);
    }

    // These methods make use of my TestStack.FluentMVCTesting library so I can make nice assertions against the action result, which fits in with the BDD style
    protected void ExecuteControllerAction(Expression<Func<TController, Task<ActionResult>>> action)
    {
        ActionResult = Controller.WithCallTo(action);
    }

    protected void ExecuteControllerAction(Expression<Func<TController, ActionResult>> action)
    {
        ActionResult = Controller.WithCallTo(action);
    }

    [Fact]
    public virtual void ExecuteScenario()
    {
        this.BDDfy(); // I'm using Bddfy
    }

    protected TDependency Resolve<TDependency>()
    {
        return _lifetimeScope.Resolve<TDependency>();
    }

    public void Dispose()
    {
        _databaseFixture.Dispose();
        _httpRequest.Dispose();
        _lifetimeScope.Dispose();
    }
}

Here is an example test:

public class SuccessfulTeamResetPasswordScenario : SubcutaneousMvcTest<TeamResetPasswordController>
{
    private ResetPasswordViewModel _viewModel;
    private const string ExistingPassword = "correct_password";
    private const string NewPassword = "new_password";

    public async Task GivenATeamHasRegisteredAndIsLoggedIn()
    {
        var registeredTeam = await SeedDbContext.SaveAsync(
            ObjectMother.Teams.Default.WithPassword(ExistingPassword));
        LoginTeam(registeredTeam);
    }

    public void AndGivenTeamSubmitsPasswordResetDetailsWithCorrectExistingPassword()
    {
        _viewModel = new ResetPasswordViewModel
        {
            ExistingPassword = ExistingPassword,
            NewPassword = NewPassword
        };
    }

    public void WhenTeamConfirmsThePasswordReset()
    {
        ExecuteControllerAction(c => c.Index(_viewModel));
    }

    public async Task ThenResetThePassword()
    {
        var team = await VerifyDbContext.Teams.SingleAsync();
        team.Password.Matches(NewPassword).ShouldBe(true); // Matches method is from BCrypt
        team.Password.Matches(ExistingPassword).ShouldNotBe(true);
    }

    public void AndTakeUserToASuccessPage()
    {
        ActionResult.ShouldRedirectTo(c => c.Success);
    }
}

Note:

  • The Object Mother with builder syntax is as per my existing article on the matter
  • I defined LoginTeam in the SubcutaneousMvcTest base class and it sets the Controller.User object to a ClaimsPrincipal object for the given team (what ASP.NET MVC does for me when a team is actually logged in)
  • The SaveAsync method on SeedDbContext is an extension method in my test project that I defined that takes a builder object, calls .Build and persists the object (and returns it for terseness):
    public static class MyAppContextExtensions
    {
        public static async Task<Team> SaveAsync(this MyAppContext context, TeamBuilder builder)
        {
            var team = builder.Build();
            context.Teams.Add(team);
            await context.SaveChangesAsync();
            return team;
        }
    }
    

When to use subcutaneous tests

In my experience over the last few projects (line of business applications) I’ve found that I can write subcutaneous tests against MVC controllers and that replaces the need for most other tests (as discussed in the previous post). A distinction over the previous post is that I’m writing these tests from the MVC Controller rather than the port (the command object). By doing this I’m able to provide that extra bit of confidence that the binding from the view model through to the command layer is correct without writing extra tests. I was able to do this because I was confident that there was definitely only going to be a single UI/client and the application wasn’t likely to grow a lot in complexity. If I was sure the command layer would get reused across multiple clients then I would test from that layer and only test the controller with a mock of the port if I felt it was needed.

One thing that should be noted with this approach is that, given I’m using real database connections, the tests aren’t lightning fast, but for the applications I’ve worked on I’ve been happy with the speed of feedback, low cost and high confidence this approach has gotten me. This differs slightly from the premise in Jimmy’s talk where he favours more fast as hell tests. As I talked about above though, if speed becomes a problem you can simply adjust your approach.

I should note that when testing a Web API I have found that writing full-stack tests against an in-memory HTTP server (passed into the constructor of HttpClient) is similarly effective, and it tests from something that the user/client cares about (the issuance of a HTTP request).
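A minimal sketch of that in-memory approach, assuming ASP.NET Web API 2 (the Microsoft.AspNet.WebApi.Core package): HttpServer is a message handler, so it can be handed straight to HttpClient and the request never leaves the process. PingController, the route template and InMemoryApiExample are hypothetical names for illustration only.

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using System.Web.Http;

// Hypothetical controller standing in for a real API endpoint
public class PingController : ApiController
{
    public string Get()
    {
        return "pong";
    }
}

public static class InMemoryApiExample
{
    public static async Task<string> GetPingAsync()
    {
        var config = new HttpConfiguration();
        config.Routes.MapHttpRoute("Default", "api/{controller}");

        // HttpServer runs the full Web API pipeline in memory; no port is opened
        using (var server = new HttpServer(config))
        using (var client = new HttpClient(server))
        {
            // The host name is arbitrary because nothing goes over the network
            var response = await client.GetAsync("http://localhost/api/ping");
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

In a real test the HttpConfiguration would come from the same registration code the application uses in production, so that routing, formatters and filters are exercised too.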

Review of: Ian Cooper – TDD, where did it all go wrong

This post discusses the talk “TDD, where did it all go wrong” by Ian Cooper, which was given in June 2013. See my introduction post to get the context behind this post and the other posts I have written in this series.

It’s taken me quite a few views of this video to really get my head around it. It’s fair to say that this video in combination with discussions with various colleagues such as Graeme Foster, Jess Panni, Bobby Lat and Matt Davies has changed the way I perform automated testing.

I’ll freely admit that I used to write a unit test across each branch of most methods in the applications I wrote and rely heavily on mocking to segregate out the external dependencies of the test (all other classes). Most of the applications I worked on were reasonably small and didn’t span multiple years so I didn’t realise the full potential pain of this approach. In saying that, I did still find myself spending a lot of time writing tests and at times it felt tedious and time consuming in a way that didn’t feel productive. Furthermore, refactoring would sometimes result in tests breaking that really shouldn’t have. As I said – the applications I worked on were relatively small so the pain was also small enough that I put up with it assuming that was the way it needed to be.

Overview

The tl;dr of Ian’s talk is that TDD has been interpreted by a lot of people to mean that you should write unit tests for every method and class that you introduce in an application. This will necessarily result in you baking implementation details into your tests, which makes them fragile when refactoring, fills them with mocking, results in a high proportion of test code to implementation code and ultimately slows you down from delivering and making changes to the codebase.

Testing behaviours rather than implementations

Ian suggests that the trigger for adding a new test to the system should be adding a new behaviour rather than adding a method or class. By doing this your tests can focus on expressing and verifying behaviours that users care about rather than implementation details that developers care about.

In my eyes this naturally fits in to BDD and ATDD by allowing you to write the bulk of your tests in that style. I feel this necessarily aligns your tests and thus implementation to things that your product owner and users care about. If you buy into the notion of tests forming an important part of a system’s documentation like I do then having tests that are behaviour focussed rather than implementation focussed is even more of an advantage since they are the tests that make sense in terms of documenting a system.

TDD and refactoring

Ian suggests that the original TDD Flow outlined by Kent Beck has been lost in translation by most people. This is summed up nicely by Steve Fenton in his summary of Ian’s talk (highlight mine):

Red. Green. Refactor. We have all heard this. I certainly had. But I didn’t really get it. I thought it meant… “write a test, make sure it fails. Write some code to pass the test. Tidy up a bit”. Not a million miles away from the truth, but certainly not the complete picture. Let’s run it again.

Red. You write a test that represents the behaviour that is needed from the system. You make it compile, but ensure the test fails. You now have a requirement for the program.

Green. You write minimal code to make the test green. This is sometimes interpreted as “return a hard-coded value” – but this is simplistic. What it really means is write code with no design, no patterns, no structure. We do it the naughty way. We just chuck lines into a method; lines that shouldn’t be in the method or maybe even in the class. Yes – we should avoid adding more implementation than the test forces, but the real trick is to do it sinfully.

Refactor. This is the only time you should add design. This is when you might extract a method, add elements of a design pattern, create additional classes or whatever needs to be done to pay penance to the sinful way you achieved green.

When you do this right, you end up with several classes that are all tested by a single test-class. This is how things should be. The tests document the requirements of the system with minimal knowledge of the implementation. The implementation could be One Massive Function or it could be a bunch of classes.

Ian points out that you cannot refactor if you have implementation details in your tests because by definition, refactoring is where you change implementation details and not the public interface or the tests.

Ports and adapters

Ian suggests that one way to test behaviours rather than implementation details is to use a ports and adapters architecture and test via the ports.

There is another video where he provides some more concrete examples of what he means. He suggests using a command dispatcher or command processor pattern as the port.

That way your adapter (e.g. MVC or API controller) can create a command object and ask for it to be executed and all of the domain logic can be wrapped up and taken care of from there. This leaves the adapter very simple and declarative and it could be easily unit tested. Ian recommends not bothering to unit test the adapter because it should be really simple and I wholeheartedly agree with this. If you use this type of pattern then your controller action will be a few lines of code.

Here is an example from a recent project I worked on that illustrates the sort of pattern:

public class TeamEditContactDetailsController : AuthenticatedTeamController
{
    private readonly IQueryExecutor _queryExecutor;
    private readonly ICommandExecutor _commandExecutor;

    public TeamEditContactDetailsController(IQueryExecutor queryExecutor, ICommandExecutor commandExecutor)
    {
        _queryExecutor = queryExecutor;
        _commandExecutor = commandExecutor;
    }

    public async Task<ActionResult> Index()
    {
        var team = await _queryExecutor.QueryAsync(new GetTeam(TeamId));
        return View(new EditContactDetailsViewModel(team));
    }

    [HttpPost]
    public async Task<ActionResult> Index(EditContactDetailsViewModel vm)
    {
        if (!await ModelValidAndSuccess(() => _commandExecutor.ExecuteAsync(vm.ToCommand(TeamId))))
            return View(vm);

        return RedirectToAction("Success");
    }
}

This is a pretty common pattern that I end up using in a lot of my applications. ModelValidAndSuccess is a method that checks the ModelState is valid, executes the command, and if there are exceptions from the domain due to invariants being violated it will propagate them into ModelState and return false. vm.ToCommand() is a method that news up the command object (in this case EditTeamContactDetails) from the various properties bound onto the view model. Side note: some people seem to take issue with the ToCommand method; personally I’m comfortable with the purpose of the view model being to bind data and translate that data to a command object – either way, it’s by no means essential to the overall pattern.
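To make the description of ModelValidAndSuccess more concrete, here is one way such a helper could look. This is only a sketch based on the description above, not the actual implementation from the project: DomainException and CommandControllerBase are hypothetical names, and it assumes the domain signals violated invariants by throwing a dedicated exception type.

```csharp
using System;
using System.Threading.Tasks;
using System.Web.Mvc;

// Hypothetical exception the domain throws when an invariant is violated
public class DomainException : Exception
{
    public DomainException(string message) : base(message) { }
}

// Hypothetical base controller hosting the helper
public abstract class CommandControllerBase : Controller
{
    protected async Task<bool> ModelValidAndSuccess(Func<Task> executeCommand)
    {
        // Don't execute the command if model binding/validation already failed
        if (!ModelState.IsValid)
            return false;

        try
        {
            await executeCommand();
            return true;
        }
        catch (DomainException e)
        {
            // Surface the domain error alongside the other validation errors
            ModelState.AddModelError(string.Empty, e.Message);
            return false;
        }
    }
}
```

With a helper shaped like this the action can simply re-render the view when the result is false, and the view's validation summary shows both binding errors and domain errors.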

Both the query (GetTeam) and the command (EditTeamContactDetails) can be considered ports into the domain and can be tested independently from this controller using BDD tests. At that point there is probably little value in testing this controller because it’s very declarative. Jimmy Bogard sums this up nicely in one of his posts.

Ian does say that if you feel the need to test the connection between the port and adapter then you can write some integration tests, but should be careful not to test things that are outside of your control and have already been tested (e.g. you don’t need to test ASP.NET MVC or NHibernate).

Mocking

One side effect of having unit tests for every method/class is that you are then trying to mock out every collaborator of every object and that necessarily means that you are trying to mock implementation details – the fact I used a TeamRepository or a TeamService shouldn’t matter if you are testing the ability for a team to be retrieved and viewed. I should be able to change what classes I use to get the Team without breaking tests.

Using mocks of implementation details significantly increases the fragility of tests reducing their effectiveness. Ian says in his talk that you shouldn’t mock internals, privates or adapters.

Mocks still have their place – if you want to test a port and would like to isolate it from another port (e.g. an API call to an external system) then it makes sense to mock that out. This was covered further in the previous article in the “Contract and collaboration tests” section.
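As an illustration of that distinction, the sketch below tests a behaviour through a handler, replaces only the port to an external system with a hand-rolled test double and leaves the internal collaborator real. All of the type names (IEmailGateway, RecordingEmailGateway, TeamNameNormaliser, RegisterTeamHandler) are hypothetical and exist purely for this example.

```csharp
using System.Collections.Generic;

// Hypothetical port to an external system: a reasonable thing to replace in tests
public interface IEmailGateway
{
    void Send(string to, string subject);
}

// Hand-rolled test double that records calls instead of sending email
public class RecordingEmailGateway : IEmailGateway
{
    public readonly List<string> SentTo = new List<string>();

    public void Send(string to, string subject)
    {
        SentTo.Add(to);
    }
}

// Internal collaborator: an implementation detail, so it is used for real and never mocked
public class TeamNameNormaliser
{
    public string Normalise(string name)
    {
        return name.Trim();
    }
}

public class RegisterTeamHandler
{
    private readonly IEmailGateway _emailGateway;
    private readonly TeamNameNormaliser _normaliser = new TeamNameNormaliser();

    public RegisterTeamHandler(IEmailGateway emailGateway)
    {
        _emailGateway = emailGateway;
    }

    public string Handle(string teamName, string contactEmail)
    {
        var name = _normaliser.Normalise(teamName);
        _emailGateway.Send(contactEmail, "Welcome " + name);
        return name;
    }
}

public static class RegisterTeamScenario
{
    // Returns true when the behaviour holds; only the external port is a test double
    public static bool WelcomeEmailIsSentToTheContact()
    {
        var gateway = new RecordingEmailGateway();
        var handler = new RegisterTeamHandler(gateway);

        var name = handler.Handle("  Rockets ", "captain@example.com");

        return name == "Rockets"
            && gateway.SentTo.Count == 1
            && gateway.SentTo[0] == "captain@example.com";
    }
}
```

If TeamNameNormaliser were later inlined or split into several classes, this check would keep passing because it only knows about the handler and the external port.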

Problems with higher level unit tests

I’m not advocating that this style of testing is a silver bullet – far from it. Like everything in software development it’s about trade-offs and I’m sure that there are scenarios that it won’t be suitable for. Ian covered some of the problems in his talk, I’ve already talked about the combinatorial problem, and Martyn Frank covers some more in his post about Ian’s talk. I’ve listed out all of the problems I know of below.

Complex implementation

One of the questions that was raised and answered in Ian’s presentation was about what to do when the code you are implementing to make a high-level unit test pass is really complex and you find yourself needing more guidance. In that instance you can do what Ian calls “shifting down a gear” and guide your implementation by writing lower-level, implementation-focussed unit tests. Once you have finished your implementation you can then decide whether to:

  • Throw away the tests because they aren’t needed anymore – the code is covered by your higher-level behaviour test
  • Keep the tests because you think they will be useful for the developers that have to support the application to understand the code in question
  • Keep the tests because the fact you needed them in the first place tells you that they will be useful when making changes to that code in the future

The first point is important and not something that is often considered – throwing away the tests. If you decide to keep these tests the trade-off is you have some tests tied to your implementation that will be more brittle than your other tests. The main thing to keep in mind is that you don’t have to have all of your tests at that level; just the ones that it makes sense for.

In a lot of ways this hybrid approach also helps with the combinatorial explosion problem; if the code you are testing is incredibly complex and it’s too hard to feasibly provide enough coverage with a higher level test then dive down and do those low level unit tests. I’ve found this hybrid pattern very useful for recent projects and I’ve found that only 5-10% of the code is complex enough to warrant the lower level tests.

Combinatorial explosion

I’ve covered this comprehensively in the previous article. This can be a serious problem, but as per the previous section in those instances just write the lower-level tests.

Complex tests

The other point that Ian raised is that because you are interacting with more objects there might be more you need to set up in your tests, which makes the arrange section of the tests harder to understand and maintain and reduces the advantage of writing the tests in the first place. Ian indicates that because you are rarely setting up mocks for complex interactions he usually sees simpler arrange sections, but he mentions that the test data builder and object mother patterns can be helpful to reduce complexity too. I have covered these patterns in the past and can confirm that they have helped me significantly in reducing complexity and improving maintainability of the arrange section of the tests.
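For readers who haven’t seen the two patterns combined, here is a deliberately simplified sketch. The Team class, its properties and the default values are assumptions made up for this example (for instance, the password is a plain string here); only the general shape of an object mother that hands out builders is the point.

```csharp
// Simplified, hypothetical domain object for the example
public class Team
{
    public Team(string name, string password)
    {
        Name = name;
        Password = password;
    }

    public string Name { get; private set; }
    public string Password { get; private set; }
}

// Test data builder: sensible defaults plus fluent overrides for what a test cares about
public class TeamBuilder
{
    private string _name = "Default team";
    private string _password = "default_password";

    public TeamBuilder WithName(string name)
    {
        _name = name;
        return this;
    }

    public TeamBuilder WithPassword(string password)
    {
        _password = password;
        return this;
    }

    public Team Build()
    {
        return new Team(_name, _password);
    }
}

// Object mother: a well-known place to get pre-configured builders from
public static class ObjectMother
{
    public static class Teams
    {
        // A new builder each time so tests can't affect each other
        public static TeamBuilder Default
        {
            get { return new TeamBuilder(); }
        }
    }
}
```

A test then only states what matters to it, e.g. ObjectMother.Teams.Default.WithPassword("correct_password").Build(), and every other value falls back to the defaults.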

I also make use of the excellent Bddfy and Shouldly libraries and they both make a big positive difference to the terseness and understandability of tests.

Another technique that I find to be incredibly useful is Approval Tests. If you are generating a complex artifact such as a CSV, some JSON or HTML or a complex object graph then it’s really quick and handy to approve the payload rather than have to create tedious assertions about every aspect.
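As a small illustration of approving a payload, the sketch below uses the ApprovalTests library with xUnit. TeamCsvExporter and its CSV layout are hypothetical; the parts that come from the library are Approvals.Verify and the UseReporter attribute.

```csharp
using System.Collections.Generic;
using System.Text;
using ApprovalTests;
using ApprovalTests.Reporters;
using Xunit;

// Hypothetical code under test that produces a CSV payload
public static class TeamCsvExporter
{
    public static string ToCsv(IEnumerable<KeyValuePair<string, int>> teamScores)
    {
        var csv = new StringBuilder("Team,Score\n");
        foreach (var row in teamScores)
            csv.Append(row.Key).Append(',').Append(row.Value).Append('\n');
        return csv.ToString();
    }
}

[UseReporter(typeof(DiffReporter))]
public class TeamCsvExportTests
{
    [Fact]
    public void ExportsTeamScoresAsCsv()
    {
        var csv = TeamCsvExporter.ToCsv(new[]
        {
            new KeyValuePair<string, int>("Rockets", 3),
            new KeyValuePair<string, int>("Comets", 1)
        });

        // Compares the whole payload against the approved file stored next to the test
        Approvals.Verify(csv);
    }
}
```

The first run fails because nothing has been approved yet; once you have inspected the received output and approved it, the test only fails again when the payload changes.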

In my experience, with a bit of work and by diligently refactoring your test code (this is a key point!!!) as you would your production code you can get very tidy, terse tests. You will typically have a smaller number of tests (one per user scenario rather than one for every class) and they will be organised and named around a particular user scenario (e.g. Features.TeamRegistration.SuccessfulTeamRegistrationScenario) so it should be easy to find the right test to inspect and modify when maintaining code.

Multiple test failures

It’s definitely possible that you can cause multiple tests to fail by changing one thing. I’ll be honest, I don’t really see this as a huge disadvantage and I haven’t experienced this too much in the wild. When it happens it’s generally pretty obvious what the cause is.

Shared code gets tested twice

Yep, but that’s fine because that shared code is an implementation detail – the behaviours that currently use the shared code may diverge in the future and that code may no longer be shared. The fact there is shared code is probably good since it’s a probable sign that you have been diligently refactoring your codebase and removing duplication.

Review of: J.B. Rainsberger – Integrated Tests Are A Scam

This post discusses the talk “Integrated Tests Are A Scam” by J.B. Rainsberger, which was given in November 2013. See my introduction post to get the context behind this post and the other posts I have written in this series.

Overview

There are a couple of good points in this talk and also quite a few points I disagree with.

The tl;dr of this presentation is that J.B. Rainsberger often sees people respond to the problem of “100% of tests pass, but there is still bugs” by adding integrated tests (note: this isn’t misspelt, he classifies integrated tests differently from integration tests). He likens this to using “aspirin that gives you a bigger headache” and says you should instead write isolated tests that test a single object at a time and have matching collaboration and contract tests on each side of the interactions that object has with its peers. He then uses mathematical induction to prove that this will completely test each layer of the application with fast, isolated tests across all logic branches with O(n) tests rather than O(n!).

He says that integrated tests are a scam because they result in you:

  • designing code more sloppily
  • making more mistakes
  • writing fewer tests because it’s harder to write the tests due to the emerging design issues
  • thus being more likely to have a situation where “100% of tests pass, but there is still bugs”

Integrated test definition

He defines an integrated test as a test that touches “a cluster of objects” and a unit test as a test that just touches a single object and is “isolated”. By this definition and via the premise of his talk every time you write a unit test that touches multiple objects then you aren’t writing a unit test, but you are writing a “self-replicating virus that invades your projects, that threatens to destroy your codebase, that threatens your sanity and your life”. I suspect that some of that sentiment is sensationalised for the purposes of making a good talk (he even has admitted to happily writing tests at a higher level), but the talk is a very popular one and it presents a very one-sided view so I feel it’s important to point that out.

I disagree with that definition of a unit test and I think that strict approach will lead to not only writing more tests than is needed, but tying a lot of your tests to implementation details that make the tests much more fragile and less useful.

Side note: I’ll be using J.B. Rainsberger’s definitions of integrated and unit tests for this post to provide less confusing discussion in the context of his presentation.

Integrated tests and design feedback

The hypothesis that you get less feedback about the design of your software from integrated tests and thus they will result in your application becoming a mess is pretty sketchy in my opinion. Call me naive, but if you are in a team that lets the code get like that with integrated tests then I think that same team will have the same result with more fine-grained tests. If you aren’t serious about refactoring your code (regardless of whether or not you use TDD and have an explicit refactor step in your process) then yeah, it’s going to get bad. In my experience, you still get design feedback when writing your test at a higher level of your application stack with real dependencies underneath (apart from mocking dependencies external to your application) and implementing the code to make it pass.

There is a tieback here to the role of TDD in software design and I think that TDD still helps with design when writing tests that encapsulate more than a single object; it influences the design of your public interface from the level you are testing against and everything underneath is just an implementation detail of that public interface (more on this in other posts in the series).

It’s worth noting that I’m coming from the perspective of a statically typed language where I can safely create implementation details without needing fine-grained tests to cover every little detail. I can imagine that in situations where you are writing with a dynamic language you might feel the need to make up for a lack of compiler by writing more fine-grained tests.

This is one of the reasons why I have a preference for statically typed languages – the compiler obviates the need for writing mundane tests to check things like property names are correct (although some people still like to write these kinds of tests).

If you are using a dynamic language and your application structure isn’t overly complex (i.e. it’s not structured with layer upon layer upon layer) then you can probably still test from a higher level with a dynamic language without too much pain. For instance, I’ve written Angular JS applications where I’ve tested services from the controller level (with real dependencies) successfully.

It’s also relevant to consider the points that @dhh makes in his post about test-induced design damage and the subsequent TDD is dead video series.

Integrated tests and identifying problems

J.B. Rainsberger says a big problem with integrated tests is that when they fail you have no idea where the problem is. I agree that by glancing at the name of the test that’s broken you might not immediately know which line of code is at fault.

If you structure your tests and code well then usually when there is a test failure in a higher level test you can look at the exception message to get a pretty good idea unless it’s a generic exception like a NullReferenceException. In those scenarios you can spend a little bit more time and look at the stack trace to nail down the offending line of code. This is slower, but I personally think that the trade-off that you get (as discussed throughout this series) is worth this small increase.

Motivation to write integrated tests

J.B. Rainsberger puts forward that the motivation people usually have to write integrated tests is to find bugs that unit tests can’t uncover by testing the interaction between objects. While it is a nice benefit to test the code closer to how it’s executed in production, with real collaborators, it’s not the main reason I write tests that cover multiple objects. I write this style of tests because they allow us to divorce the tests from knowing about implementation details and write them at a much more useful level of abstraction, closer to what the end user cares about. This gives full flexibility to refactor code without breaking tests since the tests can describe user scenarios rather than developer-focussed implementation concerns. It’s appropriate to name drop BDD and ATDD here.

Integrated tests necessitate a combinatorial explosion in the number of tests

Lack of design feedback aside, the main criticism that J.B. Rainsberger has against integrated tests is that to test all pathways through a codebase you need to have a combinatorial explosion of tests (O(n!)). While this is technically true I question the practicality of this argument for a well designed system:

  • He seems to suggest that you are going to build software that contains a lot of layers and within each layer you will have a lot of branches. While I’m sure there are examples out there like that, most of the applications I’ve been involved with can be architected to be relatively flat.
  • It’s possible that I just haven’t written code for the right industries and thus haven’t come across the kinds of scenarios he is envisaging, but at best it demonstrates that his sweeping statements don’t always apply and you should take a pragmatic approach based on your codebase.
  • Consider the scenario where you add a test for each piece of new functionality, written against the behaviour of the system from the user’s perspective (e.g. a BDD style test for each acceptance criterion in your user story). In that scenario, being naive for a moment, all code that you add could be tested by these higher level tests.
  • Naivety aside, you will add code that doesn’t directly relate to the acceptance criteria; this might be infrastructure code, defensive programming, logging etc. In those cases I think you just need to evaluate how important it is to test that code:
    • Sometimes the code is very declarative and obviously wrong or right – in those instances, where there are unlikely to be complex interactions with other parts of the code (null checks being a great example), I generally don’t think it needs to be tested
    • Sometimes it’s common code that will necessarily be tested by any integrated (or UI) tests you do have anyway
    • Sometimes it’s code that is complex or important enough to warrant specific, “implementation focussed”, unit tests – add them!
    • If such code didn’t warrant a test and later turns out to introduce a bug then that gives you feedback that it wasn’t that obvious after all and at that point you can introduce a failing test so it never happens again (before fixing it – that’s the first step you take when you have a bug, right?)
    • If the above point makes you feel uncomfortable then you should go and look up continuous delivery and work on ensuring your team works towards the capability to deliver code to production at any time so rolling forward with fixes is fast and efficient
    • It’s important to keep in mind that the point of testing generally isn’t to get 100% coverage, it’s to give you confidence that your application is going to work – I talk about this more later in the series – I can think of industries where this is probably different (e.g. healthcare, aerospace) so as usual be pragmatic!
  • There will always be exceptions to the rule. If you find a part of the codebase that is more complex and does require a lower level test to feasibly cover all of the combinations then do it – that doesn’t mean you should write those tests for the whole system though.
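
To put rough numbers on the combinatorial argument, here is a small sketch in Python (my choice purely for illustration). It uses a simplified model that is my own assumption, not something from the presentation: each layer has an independent number of branches, so covering every end-to-end pathway takes the product of the branch counts, while covering each layer in isolation takes the sum. The function names and branch counts are made up.

```python
from math import prod

def integrated_paths(branches_per_layer):
    """End-to-end pathways if every combination of branches is reachable."""
    return prod(branches_per_layer)

def isolated_tests(branches_per_layer):
    """Tests needed if each layer's branches are covered on their own."""
    return sum(branches_per_layer)

deep = [4, 4, 4, 4, 4]  # hypothetical: five layers, four branches each
flat = [4, 3]           # hypothetical: a relatively flat application

print(integrated_paths(deep), isolated_tests(deep))  # 1024 20
print(integrated_paths(flat), isolated_tests(flat))  # 12 7
```

Under that model the gap is dramatic for the deep, branchy system and fairly modest for the flat one, which is the crux of my objection: how much the explosion matters depends heavily on the shape of your codebase.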

Contract and collaboration tests

The part of J.B. Rainsberger’s presentation that I did like was his solution to the “problem”. While I think it’s fairly basic, common-sense advice that a lot of people probably follow, I still think it’s good advice.

He describes that, where you have two objects collaborating with each other, you might consider one object to be the “server” and the other the “client” and the server can be tested completely independently of the client since it will simply expose a public API that can be called. In that scenario, he suggests that the following set of tests should be written:

  • The client should have a “collaboration” test to check that it asks the right questions of the next layer by using an expectation on a mock of the server
  • The client should also have a set of collaboration tests to check it responds correctly to the possible responses that the server can return (e.g. 0 results, 1 result, a few results, lots of results, throws an exception)
  • The server should have a “contract” test to check that it tries to answer the question from the client that matches the expectation in the client’s collaboration test
  • The server should also have a set of contract tests to check that it can reply correctly with the same set of responses tested in the client’s collaboration tests
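
The tests described above might look something like the following rough sketch, written in Python with the standard library’s unittest.mock. The SearchPage “client” and InMemoryCatalogue “server” are hypothetical classes invented for this illustration, and only two of the possible responses (no results, some results) are covered to keep it short.

```python
from unittest.mock import Mock

class InMemoryCatalogue:
    """Hypothetical 'server': answers searches over a list of titles."""
    def __init__(self, titles):
        self._titles = list(titles)

    def search(self, term):
        return [t for t in self._titles if term.lower() in t.lower()]

class SearchPage:
    """Hypothetical 'client': asks the catalogue and formats the answer."""
    def __init__(self, catalogue):
        self._catalogue = catalogue

    def render(self, term):
        results = self._catalogue.search(term)
        return ", ".join(results) if results else "No results"

# Collaboration test: the client asks the right question of the server.
def test_client_asks_the_right_question():
    catalogue = Mock()
    catalogue.search.return_value = []
    SearchPage(catalogue).render("tdd")
    catalogue.search.assert_called_once_with("tdd")

# Collaboration tests: the client responds correctly to each possible answer.
def test_client_handles_no_results():
    catalogue = Mock()
    catalogue.search.return_value = []
    assert SearchPage(catalogue).render("tdd") == "No results"

def test_client_handles_some_results():
    catalogue = Mock()
    catalogue.search.return_value = ["TDD by Example", "Growing OO Software"]
    assert SearchPage(catalogue).render("tdd") == "TDD by Example, Growing OO Software"

# Contract tests: the real server can give the answers the client tests assumed.
def test_server_returns_empty_list_when_nothing_matches():
    assert InMemoryCatalogue(["Refactoring"]).search("tdd") == []

def test_server_returns_matching_titles():
    catalogue = InMemoryCatalogue(["TDD by Example", "Refactoring"])
    assert catalogue.search("tdd") == ["TDD by Example"]

for test in (test_client_asks_the_right_question,
             test_client_handles_no_results,
             test_client_handles_some_results,
             test_server_returns_empty_list_when_nothing_matches,
             test_server_returns_matching_titles):
    test()
```

The thing to notice is the pairing: every canned response used in a client collaboration test has a matching contract test proving the real server can produce it, so the mock can’t quietly drift away from reality.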

While I disagree with applying this class-by-class through every layer of your application I think that you can and should still apply this at any point that you do need to make a break between two parts of your code that you want to test independently. This type of approach also works well when testing across separate systems/applications too. When testing across systems it’s worth looking at consumer-driven contracts and in particular at Pact (.NET version).

Unit, integration, subcutaneous, UI, fast, slow, mocks, TDD, isolation and scams… What is this? I don’t even!

As outlined in the first post of my Automated Testing blog series I’ve been on a journey of self reflection and discovery about how best to write, structure and maintain automated tests.

The most confusing and profound realisations that I’ve had relate to how best to cover a codebase in tests and what type and speed those tests should be. The sorts of questions and considerations that come to mind about this are:

  • Should I be writing unit, subcutaneous, integration, etc. tests to cover a particular piece of code?
  • What is a unit test anyway? Everyone seems to have a different definition!
  • How do I get feedback as fast as possible – reducing feedback loops is incredibly important.
  • How much time/effort are we prepared to spend testing our software and what level of coverage do we need in return?
  • How do I keep my tests maintainable and how do I reduce the number of tests that break when I need to make a change to the codebase?
  • How do I make sure that my tests give me the maximum confidence that when the code is shipped to production it will work?
  • When should I be mocking the database, filesystem etc.?
  • How do I ensure that my application is tested consistently?

In order to answer these questions and more I’ve watched a range of videos and read a number of blog posts from prominent people, spent time experimenting and reflecting on the techniques via the projects I work on (both professionally and with my Open Source Software work) and tried to draw my own conclusions.

There are some notable videos that I’ve come across that, in particular, have helped me with my learning and realisations so I’ve created a series of posts around them (and might add to it over time if I find other posts). I’ve tried to summarise the main points I found interesting from the material as well as injecting my own thoughts and experience where relevant.

There is a great talk by Gary Bernhardt called Boundaries. For completeness, it is worth looking at in relation to the topics discussed in the above articles. I don’t have much to say about this yet (I’m still getting my head around where it fits in) apart from the fact that code that maps input(s) to output(s) without side effects is obviously very easy to test and I’ve found that where I have used immutable value objects in my domain model it has made testing easier.
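
As a small illustration of why that kind of code is easy to test, here is a sketch in Python. The Money value object and apply_discount function are hypothetical examples of mine, not taken from the talk.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Money:
    """Hypothetical immutable value object; equality is by value."""
    amount_cents: int
    currency: str

def apply_discount(price: Money, percent: int) -> Money:
    """Pure function: returns a new Money and changes nothing else."""
    discounted = price.amount_cents * (100 - percent) // 100
    return Money(discounted, price.currency)

original = Money(2000, "AUD")

# No mocks and no setup: pass a value in, compare the value that comes out.
assert apply_discount(original, 25) == Money(1500, "AUD")
# The input is untouched, so tests can't interfere with each other.
assert original == Money(2000, "AUD")
```

Because the value object is frozen and compared by value, the whole test is a single equality check with nothing to arrange or tear down.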

Summary

I will summarise my current thoughts (this might change over time) by revisiting the questions I posed above:

  • Should I be writing unit, subcutaneous, integration, etc. tests to cover a particular piece of code?
    • Typical consultant answer: it depends. In general I’d say write the fastest possible test you can that gives you the minimum required confidence and bakes in the minimum amount of implementation details.
    • I’ve had a lot of luck covering line-of-business web apps with mostly subcutaneous tests against the MVC controllers, with a smattering of unit tests to check conventions and test really complex logic. I typically see how far I can get without writing UI tests, but when I do write them I test high-value scenarios or complex UIs.
  • What is a unit test anyway? Everyone seems to have a different definition!
  • How do I get feedback as fast as possible – reducing feedback loops is incredibly important.
    • Follow Jimmy’s advice and focus on writing as many tests that are as fast as possible rather than worrying about whether a test is a unit test or integration test.
    • Be pragmatic though: you might get adequate speed, but a higher level of confidence, by integrating your tests with the database for instance (this has worked well for me)
  • How much time/effort are we prepared to spend testing our software and what level of coverage do we need in return?
    • I think it depends on the application – the product owner, users and business in general will all have different tolerances for risk of something going wrong. Do the minimum amount that’s needed to get the amount of confidence that is required.
    • In general I try to follow the mantra of “challenge yourself to start simple then inspect and adapt” (thanks Jess for helping refine that). Start off with the simplest testing approach that will work and if you find you are spending too long writing tests or the tests don’t give you the right confidence then adjust from there.
  • How do I keep my tests maintainable and how do I reduce the number of tests that break when I need to make a change to the codebase?
    • Focus on removing implementation details from tests. Be comfortable testing multiple classes in a single test (use your production DI container!).
    • Structure the tests according to user behaviour – they are less likely to have implementation details and they form better documentation of the system.
  • How do I make sure that my tests give me the maximum confidence that when the code is shipped to production it will work?
    • Reduce the amount of mocking you use to the bare minimum – hopefully just things external to your application so that you are testing production-like code paths.
    • Subcutaneous tests are a very good middle ground between low-level implementation-focused unit tests and slow and brittle UI tests.
  • When should I be mocking the database, filesystem etc.?
    • When you need the speed and are happy to accept the lower confidence.
    • Also, if they are external to your application or not completely under your application’s control, e.g. a database that is touched by multiple apps where your app doesn’t run migrations on it or control the schema.
  • How do I ensure that my application is tested consistently?
    • Come up with a testing strategy and stick with it. Adjust it over time as you learn new things though.
    • Don’t be afraid to use different styles of test as appropriate – e.g. the bulk of tests might be subcutaneous, but you might decide to write lower level unit tests for complex logic.
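
To make the subcutaneous style concrete, here is a minimal sketch in Python. My real projects use MVC controllers and a production DI container, so treat this only as an analogy: every class name here is hypothetical, an in-memory SQLite database stands in for the real one, and the email gateway is the only thing faked because it is external to the application.

```python
import sqlite3

class FakeEmailGateway:
    """Stand-in for something external to the application."""
    def __init__(self):
        self.sent = []

    def send(self, to, subject):
        self.sent.append((to, subject))

class UserRepository:
    """Real data access code, running against a real (in-memory) database."""
    def __init__(self, connection):
        self._db = connection
        self._db.execute("CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY)")

    def add(self, email):
        self._db.execute("INSERT INTO users (email) VALUES (?)", (email,))

    def exists(self, email):
        row = self._db.execute("SELECT 1 FROM users WHERE email = ?", (email,)).fetchone()
        return row is not None

class RegistrationController:
    """Entry point just below the UI; this is where the tests call in."""
    def __init__(self, users, emails):
        self._users = users
        self._emails = emails

    def register(self, email):
        if self._users.exists(email):
            return {"status": 409, "error": "already registered"}
        self._users.add(email)
        self._emails.send(email, "Welcome")
        return {"status": 201}

def build_controller():
    emails = FakeEmailGateway()
    users = UserRepository(sqlite3.connect(":memory:"))
    return RegistrationController(users, emails), emails

# Tests are named after user behaviour, not after classes or methods.
def test_new_user_can_register_and_is_welcomed():
    controller, emails = build_controller()
    assert controller.register("a@example.com") == {"status": 201}
    assert emails.sent == [("a@example.com", "Welcome")]

def test_registering_twice_is_rejected():
    controller, emails = build_controller()
    controller.register("a@example.com")
    assert controller.register("a@example.com")["status"] == 409
    assert len(emails.sent) == 1

test_new_user_can_register_and_is_welcomed()
test_registering_twice_is_rejected()
```

Neither test knows that a UserRepository exists, so the internals can be restructured freely as long as registration still behaves the same from the outside.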

In closing, I wanted to show a couple of quotes that I think are relevant:

Fellow Readifarian, Kahne Raja recently said this on an internal Yammer discussion and I really identify with it:

“We should think about our test projects like we think about our solution projects. They involve complex design patterns and regular refactoring.”

Another Readifarian, Pawel Pabich, made the important point that:

“[The tests you write] depend[s] on the app you are writing. [A] CRUD Web app might require different tests than a calculator.”

I also like this quote from Kent Beck:

“I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence.”
