@robdmoore

Blog about software engineering, web development, agile/lean/Continuous Delivery, C#, ASP.NET and Microsoft Azure.

Rob is the Chief Technology Officer of MakerX where he is building out a vision 6 years in the making, including the principles, culture, people, approaches, intellectual property and successful deliveries that form the basis of an amazing company. #makersmake

He is an Open Source Software contributor via his personal GitHub as well as TestStack and MRCollective.

He lives in Perth, Australia and has a BEng (Hons 1) (CompSysEng) and BSc (CompSc). You can find him on Twitter and GitHub as @robdmoore.

Robert is interested in agile methodologies, lean, continuous delivery, software engineering, web development, aircraft and floorball.

All source code on this blog and Robert’s Gists are Public Domain unless otherwise stated.

Posts - Page 3 of 22

Whitepaper: Managing Database Schemas in a Continuous Delivery World

  • 1 min read

A whitepaper I wrote for my employer, Readify, just got published. Feel free to check it out. I’ve included the abstract below.

One of the trickier technical problems to address when moving to a continuous delivery development model is managing database schema changes. It’s much harder to roll back or roll forward database changes compared to software changes since, by definition, a database has state. Typically, organisations address this problem by having database administrators (DBAs) manually apply changes so they can manually correct any problems, but this has the downside of creating a bottleneck for deploying changes to production and also introduces human error as a factor.

A large part of continuous delivery involves the setup of a largely automated deployment pipeline that increases confidence and reduces risk by ensuring that software changes are deployed consistently to each environment (e.g. dev, test, prod).

To fit in with that model it’s important to automate database changes so that they are applied automatically and consistently to each environment thus increasing the likelihood of problems being found early and reducing the risk associated with database changes.

This report outlines an approach to managing database schema changes that is compatible with a continuous delivery development model, the reasons why the approach is important and some of the considerations that need to be made when taking this approach.

The approaches discussed in this document aren’t specific to continuous delivery and in fact should be considered regardless of your development model.

Read More

Review of: Jimmy Bogard - Holistic Testing

  • 14 min read

This post discusses the talk “Holistic Testing” by Jimmy Bogard, which was given in June 2013. See my introduction post to get the context behind this post and the other posts I have written in this series.

I really resonate with the points raised by Jimmy since I’ve been using a lot of similar techniques recently. In this article I outline how I’ve been using the techniques talked about by Jimmy (including code snippets for context).

Overview

In this insightful presentation Jimmy outlines the testing strategy that he tends to use for the projects he works on. He covers the level that he tests from, the proportion of the different types of tests he writes and the intimate technical detail of how he implements the tests. Like Ian Cooper, Jimmy likes writing his unit tests from a relatively high level in his application; specifically, he said that he likes the definition of a unit test to be:

“Units of behaviour, isolated from other units of behaviour”

Code coverage and shipping code

“The ultimate goal here is to ship code it’s not to write tests; tests are just a means to the end of shipping code.”

“I can have 100% code coverage and have no one use my product and I can have 0% code coverage and it’s a huge success; there is no correlation between the two things.”

Enough said.

Types of tests

Jimmy provides a breath of fresh air by throwing away the testing pyramid (and all the conflicting definitions of unit, integration, etc. tests) in favour of a pyramid that has a small number of “slow as hell” tests, a slightly larger number of “slow” tests and a lot of “fast” tests.

This takes away the need to classify if a test is unit, integration or otherwise and focuses on the important part - how fast can you get feedback from that test. This is something that I’ve often said - there is no point in distinguishing between unit and integration tests in your project until the moment that you need to separate out tests because your feedback cycle is too slow (which will take a while in a greenfield project).

It’s worth looking at the ideas expressed by Sebastien Lambla on Vertical Slice Testing (VEST), which provides another interesting perspective in this area by turning your traditionally slow “integration” tests into fast in-memory tests. Unfortunately, the idea seems to be fairly immature and there isn’t a lot of support for this type of approach.

Mocks

Similar to the ideas expressed by Ian Cooper, Jimmy tells us not to mock internal implementation details (e.g. collaborators passed into the constructor) and indicates that he rarely uses mocks. In fact, he admitted that he would rather make the process of using mocks more painful by hand-rolling them, to discourage their use unless it’s necessary.

Jimmy says that he creates “seams” for the things he can’t control or doesn’t own (e.g. web services, databases, etc.) and then mocks those seams when writing his tests.

The cool thing I’ve found about hand-rolled mocks is that you can codify realistic behaviour (e.g. interactions between calls and real-looking responses) and contain that behaviour in one place (helping to form documentation about how the thing being mocked works). These days I tend to use a combination of hand-rolled mocks for some things and NSubstitute for others. I’ll generally use hand-rolled mocks when I want to codify behaviour or when I want to provide a separate API surface area to interact with the mock, e.g.:

// Interface
public interface IDateTimeProvider
{
    DateTimeOffset Now();
}
// Production code
public class DateTimeProvider : IDateTimeProvider
{
    public DateTimeOffset Now()
    {
        return DateTimeOffset.UtcNow;
    }
}
// Hand-rolled mock
public class StaticDateTimeProvider : IDateTimeProvider
{
    private DateTimeOffset _now;
    public StaticDateTimeProvider()
    {
        _now = DateTimeOffset.UtcNow;
    }
    public StaticDateTimeProvider(DateTimeOffset now)
    {
        _now = now;
    }
    // This one is good for data-driven tests that take a string representation of the date
    public StaticDateTimeProvider(string now)
    {
        _now = DateTimeOffset.Parse(now);
    }
    public DateTimeOffset Now()
    {
        return _now;
    }
    public StaticDateTimeProvider SetNow(string now)
    {
        _now = DateTimeOffset.Parse(now);
        return this;
    }
    public StaticDateTimeProvider MoveTimeForward(TimeSpan amount)
    {
        _now = _now.Add(amount);
        return this;
    }
}

Container-driven unit tests

One of the most important points that Jimmy raises in his talk is that he uses his DI container to resolve dependencies in his “fast” tests. This makes a lot of sense because it allows you to:

  • Prevent implementation detail leaking into your test by resolving the component under test and all of its real dependencies without needing to know the dependencies
  • Mimic what happens in production
  • Easily provide mocks for those things that do need mocks without needing to know what uses those mocks

Container initialisation can be (relatively) slow, so to ensure this cost is only incurred once you can simply set up a global fixture or static instance of the initialised container.

The other consideration is how to isolate the container across test runs - if you modify a mock for instance then you don’t want that mock to be returned in the next test. Jimmy overcomes this by using child containers, which he has separately blogged about.

The other interesting thing that Jimmy does is use an extension of AutoFixture’s AutoDataAttribute to resolve his test method parameters from the container. It’s pretty nifty and explained in more detail by Sebastian Weber.

I’ve recently used a variation of the following test fixture class (in my case using Autofac):

public static class ContainerFixture
{
    private static readonly IContainer Container;
    static ContainerFixture()
    {
        Container = ContainerConfig.CreateContainer(); // This is what my production App_Start calls
        AppDomain.CurrentDomain.DomainUnload += (sender, args) => Container.Dispose();
    }
    public static ILifetimeScope GetTestLifetimeScope(Action<ContainerBuilder> modifier = null)
    {
        return Container.BeginLifetimeScope(MatchingScopeLifetimeTags.RequestLifetimeScopeTag, cb => {
            ExternalMocks(cb);
            if (modifier != null)
                modifier(cb);
        });
    }
    private static void ExternalMocks(ContainerBuilder cb)
    {
        cb.Register(_ => new StaticDateTimeProvider(DateTimeOffset.UtcNow.AddMinutes(1)))
            .AsImplementedInterfaces()
            .AsSelf()
            .InstancePerTestRun();
        // Other overrides of externals to the application ...
    }
}
public static class RegistrationExtensions
{
    // This extension method makes the registrations in the ExternalMocks method clearer in intent - I create a HTTP request lifetime around each test since I'm using my container in a web app
    public static IRegistrationBuilder<TLimit, TActivatorData, TStyle> InstancePerTestRun
        <TLimit, TActivatorData, TStyle>(this IRegistrationBuilder<TLimit, TActivatorData, TStyle> registration,
            params object[] lifetimeScopeTags)
    {
        return registration.InstancePerRequest(lifetimeScopeTags);
    }
}
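
To illustrate the AutoData idea mentioned above, here is a rough, hedged sketch of how an AutoData-style attribute could resolve test method parameters from the container fixture above (ContainerDataAttribute, ContainerCustomization and ContainerSpecimenBuilder are names made up for illustration, and the AutoDataAttribute(IFixture) constructor is assumed from AutoFixture’s xUnit integration; see Sebastian Weber’s post for a fuller treatment):

public class ContainerDataAttribute : AutoDataAttribute
{
    public ContainerDataAttribute()
        : base(new Fixture().Customize(new ContainerCustomization(ContainerFixture.GetTestLifetimeScope())))
    {
    }
}
public class ContainerCustomization : ICustomization
{
    private readonly ILifetimeScope _scope;
    public ContainerCustomization(ILifetimeScope scope)
    {
        _scope = scope;
    }
    public void Customize(IFixture fixture)
    {
        // Try the container first; fall back to AutoFixture's normal behaviour for anything it doesn't know about
        fixture.Customizations.Add(new ContainerSpecimenBuilder(_scope));
    }
}
public class ContainerSpecimenBuilder : ISpecimenBuilder
{
    private readonly ILifetimeScope _scope;
    public ContainerSpecimenBuilder(ILifetimeScope scope)
    {
        _scope = scope;
    }
    public object Create(object request, ISpecimenContext context)
    {
        var type = request as Type;
        if (type == null || !_scope.IsRegistered(type))
            return new NoSpecimen(); // Not a request the container can satisfy
        return _scope.Resolve(type);
    }
}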

Isolating the database

Most applications that I come across will have a database of some sort. Including a database connection usually means out of process communication and this likely turns your test from “fast” to “slow” in Jimmy’s terminology. It also makes it harder to write a good test since databases are stateful and thus we need to isolate tests against each other. It’s often difficult to run tests in parallel against the same database as well.

There are a number of ways of dealing with this, which Jimmy outlined in his talk and also on his blog:

  1. Use a transaction and roll back at the end of the test. The tricky thing here is making sure that you simulate multiple requests - you need to make sure that your seeding, work and verification all happen separately, otherwise your ORM caching might give you a false positive. I find this to be quite an effective strategy and it’s what I’ve used for years now in various forms.
    • One option is to use TransactionScope to transparently initiate a transaction and roll it back at the end of the test; this allows multiple database connections to enlist in the same transaction, so you can have real, committed transactions in your code that then get rolled back by the ambient scope (a minimal sketch of this option follows the list). The main downsides are that you need MSDTC enabled on all dev machines and CI build agents, and you can’t run tests in parallel against the same database.
    • Another option is to initiate a single connection with a transaction and then reuse that connection across your ORM contexts - this allows you to avoid MSDTC and run tests in parallel, but it also means you can’t use explicit transactions in your code (or you have to make them no-ops for your test code) and it’s not possible with all ORMs. I can’t claim credit for this idea - I was introduced to it by Jess Panni and Matt Davies.
    • If your ORM doesn’t support attaching multiple contexts to a single connection with an open transaction (hi NHibernate!) then another option would be to clear the cache after seeding and after work. This has the same advantages and disadvantages as the previous point.
  2. Drop/recreate the database each test run.
    • This has the advantage of working in-process and thus you might be able to make the test a “fast” test
    • This allows you to run tests in parallel and isolated from each other by wiping the database every test run
    • The downside is the database is very different from your production database and in fact might not have some features your code needs
    • Also, it might be difficult to migrate the database to the correct schema (e.g. SQLite only supports a limited subset of ALTER TABLE statements) so you are stuck with getting your ORM to automatically generate the schema for you rather than testing your migrations
    • Additionally, it can actually be quite slow to regenerate the schema every test run as the complexity of your schema grows
  3. Delete all of the non-seed data in the database every test run - this can be quite tricky to get right without violating foreign keys, but Jimmy has some clever SQL scripts for it (in the above-linked article) and finds that it’s quite a fast option.
  4. Ensure that the data being entered by each test will necessarily be isolated from other test runs (e.g. random GUID ids etc.) - the danger here is that it can get quite complex to keep on top of this and it’s likely your tests will be fragile - I generally wouldn’t recommend this option.
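
To make the first option concrete, here is a minimal sketch (assuming xUnit-style per-test construction and System.Transactions) of a base class that wraps each test in an ambient transaction that is never completed and therefore always rolls back:

public abstract class RollbackTestBase : IDisposable
{
    private readonly TransactionScope _scope;
    protected RollbackTestBase()
    {
        // Any connection opened during the test (including by your ORM) will enlist in this ambient
        // transaction; multiple connections will cause escalation to MSDTC as noted above
        _scope = new TransactionScope(
            TransactionScopeOption.Required,
            new TransactionOptions { IsolationLevel = IsolationLevel.ReadCommitted });
    }
    public void Dispose()
    {
        // Complete() is never called, so everything the test did is rolled back
        _scope.Dispose();
    }
}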

I generally find that database integration tests are reasonably fast (after the initial spin-up time for EntityFramework or NHibernate). For instance, in a recent project the first database test would take 26s and subsequent tests took ~15ms for a test with an empty database query, ~30-50ms for a fairly basic query test with populated data and 100-200ms for a more complex test with a lot more database interaction.

In some cases I will write all of my behavioural tests touching the database because testing against a production-like database, with the real SQL being issued against the real migrations, provides an incredible amount of confidence. If you are using a DI container in your tests I’m sure it would be possible to run the test suite in two different modes - one with an in-memory variant and parallelisation to get fast feedback, and one with full database integration for full confidence. If you had a project big enough that the feedback time was getting too large, investigating this type of approach would be worth it - I personally haven’t found a need yet.

I’ve recently been using variations on this fixture class to set up the database integration for my tests using Entity Framework:

public class DatabaseFixture : IDisposable
{
    private readonly MyAppContext _parentContext;
    private readonly DbTransaction _transaction;
    static DatabaseFixture()
    {
        var testPath = Path.GetDirectoryName(typeof (DatabaseFixture).Assembly.CodeBase.Replace("file:///", ""));
        AppDomain.CurrentDomain.SetData("DataDirectory", testPath); // For localdb connection string that uses |DataDirectory|
        using (var migrationsContext = new MyAppContext())
        {
            migrationsContext.Database.Initialize(false); // Performs EF migrations
        }
    }
    public DatabaseFixture()
    {
        _parentContext = new MyAppContext();
        _parentContext.Database.Connection.Open(); // This could be a simple SqlConnection if using sql express, but if using localdb you need a context so that EF creates the database if it doesn't exist (thanks EF!)
        _transaction = _parentContext.Database.Connection.BeginTransaction();
        SeedDbContext = GetNewDbContext();
        WorkDbContext = GetNewDbContext();
        VerifyDbContext = GetNewDbContext();
    }
    public MyAppContext SeedDbContext { get; private set; }
    public MyAppContext WorkDbContext { get; private set; }
    public MyAppContext VerifyDbContext { get; private set; }
    private MyAppContext GetNewDbContext()
    {
        var context = new MyAppContext(_parentContext.Database.Connection);
        context.Database.UseTransaction(_transaction);
        return context;
    }
    public void Dispose()
    {
        SeedDbContext.Dispose();
        WorkDbContext.Dispose();
        VerifyDbContext.Dispose();
        _transaction.Dispose(); // Discard any inserts/updates since we didn't commit
        _parentContext.Dispose();
    }
}

Subcutaneous testing

It’s pretty well known/documented that UI tests are slow and, unless you get them right, brittle. Most people recommend that you only test happy paths with them. Jimmy classifies UI tests in his “slow as hell” category and also recommends only testing (important) happy paths. I like to recommend that UI tests are used for high-value scenarios (such as a user performing the primary action that makes you money), functionality that continually breaks and can’t be adequately covered with other tests, or complex UIs.

Subcutaneous tests allow you to get a lot of the value of UI tests in that you are testing the full stack of your application with its real dependencies (apart from those external to the application like web services), but without the brittleness and slowness that come from talking to the UI layer. These kinds of tests are what Jimmy classifies as “slow”, and they will include integration with the database as outlined in the previous sub-section.

In his presentation, Jimmy suggests that he writes subcutaneous tests against the command/query layer (if you are using CQS). I’ve recently used subcutaneous tests from the MVC controller using a base class like this:

public abstract class SubcutaneousMvcTest<TController> : IDisposable
    where TController : Controller
{
    private DatabaseFixture _databaseFixture;
    private readonly HttpSimulator _httpRequest;
    private readonly ILifetimeScope _lifetimeScope;
    protected TController Controller { get; private set; }
    protected ControllerResultTest<TController> ActionResult { get; set; }
    protected MyAppContext SeedDbContext { get { return _databaseFixture.SeedDbContext; } }
    protected MyAppContext VerifyDbContext { get { return _databaseFixture.VerifyDbContext; } }
    protected SubcutaneousMvcTest()
    {
        _databaseFixture = new DatabaseFixture();
        _lifetimeScope = ContainerFixture.GetTestLifetimeScope(cb =>
            cb.Register(_ => _databaseFixture.WorkDbContext).AsSelf().AsImplementedInterfaces().InstancePerTestRun());
        var routes = new RouteCollection();
        RouteConfig.RegisterRoutes(routes); // This is what App_Start calls in production
        _httpRequest = new HttpSimulator().SimulateRequest(); // Simulates HttpContext.Current so I don't have to mock it
        Controller = _lifetimeScope.Resolve<TController>(); // Resolve the controller with real dependencies via ContainerFixture
        Controller.ControllerContext = new ControllerContext(new HttpContextWrapper(HttpContext.Current), new RouteData(), Controller);
        Controller.Url = new UrlHelper(Controller.Request.RequestContext, routes);
    }
    // These methods make use of my TestStack.FluentMVCTesting library so I can make nice assertions against the action result, which fits in with the BDD style
    protected void ExecuteControllerAction(Expression<Func<TController, Task<ActionResult>>> action)
    {
        ActionResult = Controller.WithCallTo(action);
    }
    protected void ExecuteControllerAction(Expression<Func<TController, ActionResult>> action)
    {
        ActionResult = Controller.WithCallTo(action);
    }
    [Fact]
    public virtual void ExecuteScenario()
    {
        this.BDDfy(); // I'm using Bddfy
    }
    protected TDependency Resolve<TDependency>()
    {
        return _lifetimeScope.Resolve<TDependency>();
    }
    public void Dispose()
    {
        _databaseFixture.Dispose();
        _httpRequest.Dispose();
        _lifetimeScope.Dispose();
    }
}

Here is an example test:

public class SuccessfulTeamResetPasswordScenario : SubcutaneousMvcTest<TeamResetPasswordController>
{
    private ResetPasswordViewModel _viewModel;
    private const string ExistingPassword = "correct_password";
    private const string NewPassword = "new_password";
    public async Task GivenATeamHasRegisteredAndIsLoggedIn()
    {
        var registeredTeam = await SeedDbContext.SaveAsync(
            ObjectMother.Teams.Default.WithPassword(ExistingPassword));
        LoginTeam(registeredTeam);
    }
    public void AndGivenTeamSubmitsPasswordResetDetailsWithCorrectExistingPassword()
    {
        _viewModel = new ResetPasswordViewModel
        {
            ExistingPassword = ExistingPassword,
            NewPassword = NewPassword
        };
    }
    public void WhenTeamConfirmsThePasswordReset()
    {
        ExecuteControllerAction(c => c.Index(_viewModel));
    }
    public async Task ThenResetThePassword()
    {
        var team = await VerifyDbContext.Teams.SingleAsync();
        team.Password.Matches(NewPassword).ShouldBe(true); // Matches method is from BCrypt
        team.Password.Matches(ExistingPassword).ShouldNotBe(true);
    }
    public void AndTakeUserToASuccessPage()
    {
        ActionResult.ShouldRedirectTo(c => c.Success);
    }
}

Note:

  • The Object Mother with builder syntax is as per my existing article on the matter
  • I defined LoginTeam in the SubcutaneousMvcTest base class and it sets the Controller.User object to a ClaimsPrincipal object for the given team (what ASP.NET MVC does for me when a team is actually logged in) - a rough sketch of this helper follows these notes
  • The SaveAsync method on SeedDbContext is an extension method I defined in my test project that takes a builder object, calls .Build() and persists the object (and returns it for terseness):
      public static class MyAppContextExtensions
      {
          public static async Task<Team> SaveAsync(this MyAppContext context, TeamBuilder builder)
          {
              var team = builder.Build();
              context.Teams.Add(team);
              await context.SaveChangesAsync();
              return team;
          }
      }
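
For reference, here is a rough sketch of what the LoginTeam helper inside the SubcutaneousMvcTest base class could look like (the claim types and Team properties used here are assumptions for illustration rather than the exact code):

protected void LoginTeam(Team team)
{
    var identity = new ClaimsIdentity(new[]
    {
        new Claim(ClaimTypes.NameIdentifier, team.Id.ToString()),
        new Claim(ClaimTypes.Name, team.Name)
    }, "ApplicationCookie");
    // ControllerContext was already created in the constructor, so swap in the authenticated principal
    Controller.ControllerContext.HttpContext.User = new ClaimsPrincipal(identity);
}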
    

When to use subcutaneous tests

In my experience over the last few projects (line of business applications) I’ve found that I can write subcutaneous tests against MVC controllers and that replaces the need for most other tests (as discussed in the previous post). A distinction over the previous post is that I’m writing these tests from the MVC Controller rather than the port (the command object). By doing this I’m able to provide that extra bit of confidence that the binding from the view model through to the command layer is correct without writing extra tests. I was able to do this because I was confident that there was definitely only going to be a single UI/client and the application wasn’t likely to grow a lot in complexity. If I was sure the command layer would get reused across multiple clients then I would test from that layer and only test the controller with a mock of the port if I felt it was needed.

One thing that should be noted with this approach is that, given I’m using real database connections, the tests aren’t lightning fast, but for the applications I’ve worked on I’ve been happy with the speed of feedback, low cost and high confidence this approach has given me. This differs slightly from the premise in Jimmy’s talk, where he favours a larger proportion of the really fast tests. As I talked about above though, if speed becomes a problem you can simply adjust your approach.

I should note that when testing a web API I have found that writing full-stack tests against an in-memory HTTP server (passed into the constructor of HttpClient) is similarly effective, and it tests from something that the user/client cares about (the issuing of an HTTP request).
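
Here is a rough sketch of what that can look like with ASP.NET Web API (WebApiConfig and the route used here are assumptions based on the default project template; HttpServer is an HttpMessageHandler so the request never leaves the process):

public class GetTeamApiTests : IDisposable
{
    private readonly HttpServer _server;
    private readonly HttpClient _client;
    public GetTeamApiTests()
    {
        var config = new HttpConfiguration();
        WebApiConfig.Register(config); // Same registration that production App_Start calls
        _server = new HttpServer(config);
        _client = new HttpClient(_server) { BaseAddress = new Uri("http://localhost/") };
    }
    [Fact]
    public async Task GetTeam_ReturnsSuccess()
    {
        var response = await _client.GetAsync("api/teams/1");
        response.IsSuccessStatusCode.ShouldBe(true);
    }
    public void Dispose()
    {
        _client.Dispose();
        _server.Dispose();
    }
}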

Read More

Review of: Ian Cooper - TDD, where did it all go wrong

  • 11 min read

This post discusses the talk “TDD, where did it all go wrong” by Ian Cooper, which was given in June 2013. See my introduction post to get the context behind this post and the other posts I have written in this series.

It’s taken me quite a few views of this video to really get my head around it. It’s fair to say that this video, in combination with discussions with various colleagues such as Graeme Foster, Jess Panni, Bobby Lat and Matt Davies, has changed the way I perform automated testing.

I’ll freely admit that I used to write a unit test across each branch of most methods in the applications I wrote and rely heavily on mocking to segregate out the external dependencies of the test (all other classes). Most of the applications I worked on were reasonably small and didn’t span multiple years, so I didn’t realise the full potential pain of this approach. In saying that, I still found myself spending a lot of time writing tests and at times it felt tedious and time-consuming in a way that didn’t feel productive. Furthermore, refactoring would sometimes result in tests breaking that really shouldn’t have. As I said - the applications I worked on were relatively small so the pain was also small enough that I put up with it, assuming that was the way it needed to be.

Overview

The tl;dr of Ian’s talk is that TDD has been interpreted by a lot of people to mean that you should write unit tests for every method and class that you introduce in an application, but this will necessarily result in you baking implementation details into your tests, causing them to be fragile when refactoring, to contain a lot of mocking and to have a high proportion of test code to implementation code, ultimately slowing you down from delivering and making changes to the codebase.

Testing behaviours rather than implementations

Ian suggests that the trigger for adding a new test to the system should be adding a new behaviour rather than adding a method or class. By doing this your tests can focus on expressing and verifying behaviours that users care about rather than implementation details that developers care about.

In my eyes this naturally fits in to BDD and ATDD by allowing you to write the bulk of your tests in that style. I feel this necessarily aligns your tests and thus implementation to things that your product owner and users care about. If you buy into the notion of tests forming an important part of a system’s documentation like I do then having tests that are behaviour focussed rather than implementation focussed is even more of an advantage since they are the tests that make sense in terms of documenting a system.

TDD and refactoring

Ian suggests that the original TDD Flow outlined by Kent Beck has been lost in translation by most people. This is summed up nicely by Steve Fenton in his summary of Ian’s talk (highlight mine):

Red. Green. Refactor. We have all heard this. I certainly had. But I didn’t really get it. I thought it meant… “write a test, make sure it fails. Write some code to pass the test. Tidy up a bit”. Not a million miles away from the truth, but certainly not the complete picture. Let’s run it again.

Red. You write a test that represents the behaviour that is needed from the system. You make it compile, but ensure the test fails. You now have a requirement for the program.

Green. You write minimal code to make the test green. This is sometimes interpreted as “return a hard-coded value” - but this is simplistic. What it really means is write code with no design, no patterns, no structure. We do it the naughty way. We just chuck lines into a method; lines that shouldn’t be in the method or maybe even in the class. Yes - we should avoid adding more implementation than the test forces, but the real trick is to do it sinfully.

Refactor. This is the only time you should add design. This is when you might extract a method, add elements of a design pattern, create additional classes or whatever needs to be done to pay penance to the sinful way you achieved green.

When you do this right, you end up with several classes that are all tested by a single test-class. This is how things should be. The tests document the requirements of the system with minimal knowledge of the implementation. The implementation could be One Massive Function or it could be a bunch of classes.

Ian points out that you cannot refactor if you have implementation details in your tests because by definition, refactoring is where you change implementation details and not the public interface or the tests.

Ports and adapters

Ian suggests that one way to test behaviours rather than implementation details is to use a ports and adapters architecture and test via the ports.

There is another video where he provides some more concrete examples of what he means. He suggests using a command dispatcher or command processor pattern as the port.

That way your adapter (e.g. MVC or API controller) can create a command object and ask for it to be executed and all of the domain logic can be wrapped up and taken care of from there. This leaves the adapter very simple and declarative and it could be easily unit tested. Ian recommends not bothering to unit test the adapter because it should be really simple and I wholeheartedly agree with this. If you use this type of pattern then your controller action will be a few lines of code.

Here is an example from a recent project I worked on that illustrates the sort of pattern:

public class TeamEditContactDetailsController : AuthenticatedTeamController
{
    private readonly IQueryExecutor _queryExecutor;
    private readonly ICommandExecutor _commandExecutor;
    public TeamEditContactDetailsController(IQueryExecutor queryExecutor, ICommandExecutor commandExecutor)
    {
        _queryExecutor = queryExecutor;
        _commandExecutor = commandExecutor;
    }
    public async Task<ActionResult> Index()
    {
        var team = await _queryExecutor.QueryAsync(new GetTeam(TeamId));
        return View(new EditContactDetailsViewModel(team));
    }
    [HttpPost]
    public async Task<ActionResult> Index(EditContactDetailsViewModel vm)
    {
        if (!await ModelValidAndSuccess(() => _commandExecutor.ExecuteAsync(vm.ToCommand(TeamId))))
            return View(vm);
        return RedirectToAction("Success");
    }
}

This is a pretty common pattern that I end up using in a lot of my applications. ModelValidAndSuccess is a method that checks the ModelState is valid, executes the command, and if there are exceptions from the domain due to invariants being violated it will propagate them into ModelState and return false. vm.ToCommand() is a method that news up the command object (in this case EditTeamContactDetails) from the various properties bound onto the view model. Side note: some people seem to take issue with the ToCommand method; personally I’m comfortable with the purpose of the view model being to bind data and translate that data to a command object - either way, it’s by no means essential to the overall pattern.
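
For context, a rough sketch of what ModelValidAndSuccess could look like on the controller base class (DomainValidationException and its PropertyName property are hypothetical placeholders for whatever your domain throws when an invariant is violated):

protected async Task<bool> ModelValidAndSuccess(Func<Task> executeCommand)
{
    if (!ModelState.IsValid)
        return false;
    try
    {
        await executeCommand();
        return true;
    }
    catch (DomainValidationException ex)
    {
        // Surface domain invariant violations to the view via ModelState
        ModelState.AddModelError(ex.PropertyName ?? string.Empty, ex.Message);
        return false;
    }
}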

Both the query (GetTeam) and the command (EditTeamContactDetails) can be considered ports into the domain and can be tested independently from this controller using BDD tests. At that point there is probably little value in testing this controller because it’s very declarative. Jimmy Bogard sums this up nicely in one of his posts.

Ian does say that if you feel the need to test the connection between the port and adapter then you can write some integration tests, but should be careful not to test things that are outside of your control and have already been tested (e.g. you don’t need to test ASP.NET MVC or NHibernate).

Mocking

One side effect of having unit tests for every method/class is that you are then trying to mock out every collaborator of every object and that necessarily means that you are trying to mock implementation details - the fact I used a TeamRepository or a TeamService shouldn’t matter if you are testing the ability for a team to be retrieved and viewed. I should be able to change what classes I use to get the Team without breaking tests.

Using mocks of implementation details significantly increases the fragility of tests reducing their effectiveness. Ian says in his talk that you shouldn’t mock internals, privates or adapters.

Mocks still have their place - if you want to test a port and would like to isolate it from another port (e.g. an API call to an external system) then it makes sense to mock that out. This was covered further in the previous article in the “Contract and collaboration tests” section.

Problems with higher level unit tests

I’m not advocating that this style of testing is a silver bullet - far from it. Like everything in software development it’s about trade-offs and I’m sure that there are scenarios that it won’t be suitable for. Ian covered some of the problems in his talk, I’ve already talked about the combinatorial problem and Martyn Frank covers some more in his post about Ian’s talk. I’ve listed out all of the problems I know of below.

Complex implementation

One of the questions that was raised and answered in Ian’s presentation was about what to do when the code you are implementing to make a high-level unit test pass is really complex and you find yourself needing more guidance. In that instance you can do what Ian calls “shifting down a gear” and guide your implementation by writing lower-level, implementation-focussed unit tests. Once you have finished your implementation you can then decide whether to:

  • Throw away the tests because they aren’t needed anymore - the code is covered by your higher-level behaviour test
  • Keep the tests because you think they will be useful for the developers that have to support the application to understand the code in question
  • Keep the tests because the fact you needed them in the first place tells you that they will be useful when making changes to that code in the future

The first point is important and not something that is often considered - throwing away the tests. If you decide to keep these tests the trade-off is you have some tests tied to your implementation that will be more brittle than your other tests. The main thing to keep in mind is that you don’t have to have all of your tests at that level; just the ones that it makes sense for.

In a lot of ways this hybrid approach also helps with the combinatorial explosion problem; if the code you are testing is incredibly complex and it’s too hard to feasibly provide enough coverage with a higher level test then dive down and do those low level unit tests. I’ve found this hybrid pattern very useful for recent projects and I’ve found that only 5-10% of the code is complex enough to warrant the lower level tests.

Combinatorial explosion

I’ve covered this comprehensively in the previous article. This can be a serious problem, but as per the previous section in those instances just write the lower-level tests.

Complex tests

The other point that Ian raised is that because you are interacting with more objects there might be more you need to set up in your tests, which makes the arrange section of the tests harder to understand and maintain, reducing the advantage of writing the tests in the first place. Ian indicates that because you are rarely setting up mocks for complex interactions he usually sees simpler arrange sections, but he mentions that the test data builder and object mother patterns can be helpful to reduce complexity too. I have covered these patterns in the past and can confirm that they have helped me significantly in reducing complexity and improving maintainability of the arrange section of my tests.
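
To give a feel for those patterns, here is a minimal sketch of a builder behind the ObjectMother syntax used in the earlier example (the Team constructor and the BCrypt call are assumptions for illustration):

public class TeamBuilder
{
    private string _name = "Test Team";
    private string _password = "default_password";
    public TeamBuilder WithName(string name) { _name = name; return this; }
    public TeamBuilder WithPassword(string password) { _password = password; return this; }
    public Team Build()
    {
        // Sensible defaults keep the arrange section terse; tests only override what they care about
        return new Team(_name, BCrypt.Net.BCrypt.HashPassword(_password));
    }
}
public static class ObjectMother
{
    public static class Teams
    {
        public static TeamBuilder Default { get { return new TeamBuilder(); } }
    }
}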

I also make use of the excellent Bddfy and Shouldly libraries and they both make a big positive difference to the terseness and understandability of tests.

Another technique that I find to be incredibly useful is Approval Tests. If you are generating a complex artifact such as a CSV, some JSON or HTML or a complex object graph then it’s really quick and handy to approve the payload rather than have to create tedious assertions about every aspect.
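
For instance, a minimal sketch of an approval test (assuming the ApprovalTests.Net library; CsvExporter is a hypothetical class under test):

public class CsvExportTests
{
    [Fact]
    [UseReporter(typeof(DiffReporter))]
    public void ExportedCsv_MatchesApprovedFile()
    {
        var csv = new CsvExporter().Export(ObjectMother.Teams.Default.Build());
        // Compares against CsvExportTests.ExportedCsv_MatchesApprovedFile.approved.txt in source control
        Approvals.Verify(csv);
    }
}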

In my experience, with a bit of work and by diligently refactoring your test code (this is a key point!!!) as you would your production code, you can get very tidy, terse tests. You will typically have a smaller number of tests (one per user scenario rather than one for every class) and they will be organised and named around a particular user scenario (e.g. Features.TeamRegistration.SuccessfulTeamRegistrationScenario) so it should be easy to find the right test to inspect and modify when maintaining code.

Multiple test failures

It’s definitely possible that you can cause multiple tests to fail by changing one thing. I’ll be honest, I don’t really see this as a huge disadvantage and I haven’t experienced this too much in the wild. When it happens it’s generally pretty obvious what the cause is.

Shared code gets tested twice

Yep, but that’s fine because that shared code is an implementation detail - the behaviours that currently use the shared code may diverge in the future and that code may no longer be shared. The existence of shared code is probably a good sign that you have been diligently refactoring your codebase and removing duplication.

Read More

Review of: J.B. Rainsberger - Integrated Tests Are A Scam

  • 9 min read

This post discusses the talk “Integrated Tests Are A Scam” by J.B. Rainsberger, which was given in November 2013. See my introduction post to get the context behind this post and the other posts I have written in this series.

Overview

There are a couple of good points in this talk and also quite a few points I disagree with.

The tl;dr of this presentation is that J.B. Rainsberger often sees people try to solve the problem of having “100% of tests pass, but there are still bugs” by adding integrated tests (note: this isn’t misspelt; he classifies integrated tests differently from integration tests). He likens this to using “aspirin that gives you a bigger headache” and says you should instead write isolated tests that test a single object at a time and have matching collaboration and contract tests on each side of the interactions that object has with its peers. He then uses mathematical induction to prove that this will completely test each layer of the application with fast, isolated tests across all logic branches with O(n) tests rather than O(n!).

He says that integrated tests are a scam because they result in you:

  • designing code more sloppily
  • making more mistakes
  • writing fewer tests because it’s harder to write the tests due to the emerging design issues
  • thus being more likely to have a situation where “100% of tests pass, but there are still bugs”

Integrated test definition

He defines an integrated test as a test that touches “a cluster of objects” and a unit test as a test that just touches a single object and is “isolated”. By this definition, and via the premise of his talk, every time you write a test that touches multiple objects you aren’t writing a unit test; you are writing a “self-replicating virus that invades your projects, that threatens to destroy your codebase, that threatens your sanity and your life”. I suspect that some of that sentiment is sensationalised for the purposes of making a good talk (he has even admitted to happily writing tests at a higher level), but the talk is a very popular one and it presents a very one-sided view, so I feel it’s important to point that out.

I disagree with that definition of a unit test and I think that strict approach will lead to not only writing more tests than is needed, but tying a lot of your tests to implementation details that make the tests much more fragile and less useful.

Side note: I’ll be using J.B. Rainsberger’s definitions of integrated and unit tests for this post to provide less confusing discussion in the context of his presentation.

Integrated tests and design feedback

The hypothesis that you get less feedback about the design of your software from integrated tests and thus they will result in your application becoming a mess is pretty sketchy in my opinion. Call me naive, but if you are in a team that lets the code get like that with integrated tests then I think that same team will have the same result with more fine-grained tests. If you aren’t serious about refactoring your code (regardless of whether or not you use TDD and have an explicit refactor step in your process) then yeah, it’s going to get bad. In my experience, you still get design feedback when writing your test at a higher level of your application stack with real dependencies underneath (apart from mocking dependencies external to your application) and implementing the code to make it pass.

There is a tieback here to the role of TDD in software design and I think that TDD still helps with design when writing tests that encapsulate more than a single object; it influences the design of your public interface from the level you are testing against and everything underneath is just an implementation detail of that public interface (more on this in other posts in the series).

It’s worth noting that I’m coming from the perspective of a statically typed language where I can safely create implementation details without needing fine-grained tests to cover every little detail. I can imagine that in situations where you are writing in a dynamic language you might feel the need to make up for the lack of a compiler by writing more fine-grained tests.

This is one of the reasons why I have a preference for statically typed languages - the compiler obviates the need for writing mundane tests to check things like property names are correct (although some people still like to write these kinds of tests).

If you are using a dynamic language and your application structure isn’t overly complex (i.e. it’s not structured with layer upon layer upon layer) then you can probably still test from a higher level with a dynamic language without too much pain. For instance, I’ve written Angular JS applications where I’ve tested services from the controller level (with real dependencies) successfully.

It’s also relevant to consider the points that @dhh makes in his post about test-induced design damage and the subsequent TDD is dead video series.

Integrated tests and identifying problems

J.B. Rainsberger says a big problem with integrated tests is that when they fail you have no idea where the problem is. I agree that by glancing at the name of the test that’s broken you might not immediately know which line of code is at fault.

If you structure your tests and code well then usually when there is a test failure in a higher-level test you can look at the exception message to get a pretty good idea of the cause, unless it’s a generic exception like a NullReferenceException. In those scenarios you can spend a little bit more time and look at the stack trace to nail down the offending line of code. This is slower, but I personally think the trade-off you get (as discussed throughout this series) is worth this small increase in diagnosis time.

Motivation to write integrated tests

J.B. Rainsberger puts forward that the motivation people usually have to write integrated tests is to find bugs that unit tests can’t uncover by testing the interaction between objects. While it is a nice benefit to test the code closer to how it’s executed in production, with real collaborators, it’s not the main reason I write tests that cover multiple objects. I write this style of tests because they allow us to divorce the tests from knowing about implementation details and write them at a much more useful level of abstraction, closer to what the end user cares about. This gives full flexibility to refactor code without breaking tests since the tests can describe user scenarios rather than developer-focussed implementation concerns. It’s appropriate to name drop BDD and ATDD here.

Integrated tests necessitate a combinatorial explosion in the number of tests

Lack of design feedback aside, the main criticism that J.B. Rainsberger has against integrated tests is that to test all pathways through a codebase you need to have a combinatorial explosion of tests (O(n!)). While this is technically true I question the practicality of this argument for a well designed system:

  • He seems to suggest that you are going to build software that contains a lot of layers and within each layer you will have a lot of branches. While I’m sure there are examples out there like that, most of the applications I’ve been involved with can be architected to be relatively flat.
  • It’s possible that I just haven’t written code for the right industries and thus haven’t come across the kinds of scenarios he is envisaging, but at best that demonstrates that his sweeping statements don’t always apply and you should take a pragmatic approach based on your codebase.
  • Consider the scenario where you add a test for new functionality against the behaviour of the system from the user’s perspective (e.g. a BDD-style test for each acceptance criterion in your user story). In that scenario, being naive for a moment, all code that you add could be tested by these higher-level tests.
  • Naivety aside, you will add code that doesn’t directly relate to the acceptance criteria - this might be infrastructure code, defensive programming, logging etc. - and in those cases I think you just need to evaluate how important it is to test that code:
    • Sometimes the code is very declarative and obviously wrong or right - in those instances, where there is unlikely to be complex interactions with other parts of the code (null checks being a great example) then I generally don’t think it needs to be tested
    • Sometimes it’s common code that will necessarily be tested by any integrated (or UI) tests you do have anyway
    • Sometimes it’s code that is complex or important enough to warrant specific, “implementation focussed”, unit tests - add them!
    • If such code didn’t warrant a test and later turns out to introduce a bug then that gives you feedback that it wasn’t that obvious afterall and at that point you can introduce a breaking test so it never happens again (before fixing it - that’s the first step you take when you have a bug right?)
    • If the above point makes you feel uncomfortable then you should go and look up continuous delivery and work on ensuring your team works towards the capability to deliver code to production at any time so rolling forward with fixes is fast and efficient
    • It’s important to keep in mind that the point of testing generally isn’t to get 100% coverage, it’s to give you confidence that your application is going to work - I talk about this more later in the series - I can think of industries where this is probably different (e.g. healthcare, aerospace) so as usual be pragmatic!
  • There will always be exceptions to the rule; if you find a part of the codebase that is more complex and does require a lower-level test to feasibly cover all of the combinations then do it - that doesn’t mean you should write those tests for the whole system though.

Contract and collaboration tests

The part of J.B. Rainsberger’s presentation that I did like was his solution to the “problem”. While it’s fairly basic, common-sense advice that a lot of people probably follow, I still think it’s good advice.

He describes that, where you have two objects collaborating with each other, you might consider one object to be the “server” and the other the “client” and the server can be tested completely independently of the client since it will simply expose a public API that can be called. In that scenario, he suggests that the following set of tests should be written:

  • The client should have a “collaboration” test to check that it asks the right questions of the next layer by using an expectation on a mock of the server
  • The client should also have a set of collaboration tests to check it responds correctly to the possible responses that the server can return (e.g. 0 results, 1 result, a few results, lots of results, throws an exception)
  • The server should have a “contract” test to check that it tries to answer the question from the client that matches the expectation in the client’s collaboration test
  • The server should also have a set of contract tests to check that it can reply correctly with the same set of responses tested in the client’s collaboration tests
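
To make that pairing concrete, here is a minimal sketch of the client-side half using NSubstitute and xUnit (ITeamApiClient and TeamLookup are hypothetical types invented for illustration):

public interface ITeamApiClient
{
    Task<Team> GetTeamByIdAsync(int id);
}
public class TeamLookup
{
    private readonly ITeamApiClient _client;
    public TeamLookup(ITeamApiClient client)
    {
        _client = client;
    }
    public Task<Team> GetTeamAsync(int id)
    {
        return _client.GetTeamByIdAsync(id);
    }
}
public class TeamLookupCollaborationTests
{
    [Fact]
    public async Task AsksTheServerForTheRequestedTeam()
    {
        var client = Substitute.For<ITeamApiClient>();
        var lookup = new TeamLookup(client);
        await lookup.GetTeamAsync(42);
        // Collaboration test: the client asks the right question of its "server"
        await client.Received().GetTeamByIdAsync(42);
    }
}

The matching contract tests would then exercise the real ITeamApiClient implementation to check it can actually answer GetTeamByIdAsync with the kinds of responses the collaboration tests assume.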

While I disagree with applying this class-by-class through every layer of your application I think that you can and should still apply this at any point that you do need to make a break between two parts of your code that you want to test independently. This type of approach also works well when testing across separate systems/applications too. When testing across systems it’s worth looking at consumer-driven contracts and in particular at Pact (.NET version).

Read More

Unit, integration, subcutaneous, UI, fast, slow, mocks, TDD, isolation and scams… What is this? I don’t even!

  • 5 min read

As outlined in the first post of my Automated Testing blog series I’ve been on a journey of self reflection and discovery about how best to write, structure and maintain automated tests.

The most confusing and profound realisations that I’ve had relate to how best to cover a codebase in tests and what type and speed those tests should be. The sorts of questions and considerations that come to mind about this are:

  • Should I be writing unit, subcutaneous, integration, etc. tests to cover a particular piece of code?
  • What is a unit test anyway? Everyone seems to have a different definition!
  • How do I get feedback as fast as possible - reducing feedback loops is incredibly important.
  • How much time/effort are we prepared to spend testing our software and what level of coverage do we need in return?
  • How do I keep my tests maintainable and how do I reduce the number of tests that break when I need to make a change to the codebase?
  • How do I make sure that my tests give me the maximum confidence that when the code is shipped to production it will work?
  • When should I be mocking the database, filesystem etc.?
  • How do I ensure that my application is tested consistently?

In order to answer these questions and more I’ve watched a range of videos and read a number of blog posts from prominent people, spent time experimenting and reflecting on the techniques via the projects I work on (both professionally and with my Open Source Software work) and tried to draw my own conclusions.

There are some notable videos that I’ve come across that, in particular, have helped me with my learning and realisations so I’ve created a series of posts around them (and might add to it over time if I find other posts). I’ve tried to summarise the main points I found interesting from the material as well as injecting my own thoughts and experience where relevant.

There is a great talk by Gary Bernhardt called Boundaries. For completeness, it is worth looking at in relation to the topics discussed in the above articles. I don’t have much to say about this yet (I’m still getting my head around where it fits in) apart from the fact that code that maps input(s) to output(s) without side effects is obviously very easy to test, and I’ve found that where I have used immutable value objects in my domain model it has made testing easier.
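
As a trivial illustration of that last point, an immutable value object (a hypothetical Money type here) maps inputs to outputs with no side effects and so needs no mocks or setup to test:

public sealed class Money
{
    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }
    public decimal Amount { get; private set; }
    public string Currency { get; private set; }
    public Money Add(Money other)
    {
        if (Currency != other.Currency)
            throw new InvalidOperationException("Cannot add amounts in different currencies");
        // Returns a new value rather than mutating this one
        return new Money(Amount + other.Amount, Currency);
    }
}
public class MoneyTests
{
    [Fact]
    public void AddingTwoAmountsReturnsANewValueWithTheSum()
    {
        var result = new Money(10m, "AUD").Add(new Money(5m, "AUD"));
        result.Amount.ShouldBe(15m);
    }
}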

Summary

I will summarise my current thoughts (this might change over time) by revisiting the questions I posed above:

  • Should I be writing unit, subcutaneous, integration, etc. tests to cover a particular piece of code?
    • Typical consultant answer: it depends. In general I’d say write the fastest possible test you can that gives you the minimum required confidence and bakes in the minimum amount of implementation details.
    • I’ve had a lot of luck covering line-of-business web apps with mostly subcutaneous tests against the MVC controllers, with a smattering of unit tests to check conventions and test really complex logic and I typically see how far I can get without writing UI tests, but when I do I test high-value scenarios or complex UIs.
  • What is a unit test anyway? Everyone seems to have a different definition!
  • How do I get feedback as fast as possible - reducing feedback loops is incredibly important.
    • Follow Jimmy’s advice and focus on making as many of your tests as fast as possible rather than worrying about whether a test is a unit test or integration test.
    • Be pragmatic though - you might get adequate speed but a higher level of confidence by integrating your tests with the database, for instance (this has worked well for me)
  • How much time/effort are we prepared to spend testing our software and what level of coverage do we need in return?
    • I think it depends on the application - the product owner, users and business in general will all have different tolerances for risk of something going wrong. Do the minimum amount that’s needed to get the amount of confidence that is required.
    • In general I try to follow the mantra of “challenge yourself to start simple then inspect and adapt” (thanks Jess for helping refine that). Start off with the simplest testing approach that will work and if you find you are spending too long writing tests or the tests don’t give you the right confidence then adjust from there.
  • How do I keep my tests maintainable and how do I reduce the number of tests that break when I need to make a change to the codebase?
    • Focus on removing implementation details from tests. Be comfortable testing multiple classes in a single test (use your production DI container!).
    • Structure the tests according to user behaviour - they are less likely to have implementation details and they form better documentation of the system.
  • How do I make sure that my tests give me the maximum confidence that when the code is shipped to production it will work?
    • Reduce the amount of mocking you use to the bare minimum - hopefully just things external to your application so that you are testing production-like code paths.
    • Subcutaneous tests are a very good middle ground between low-level implementation-focused unit tests and slow and brittle UI tests.
  • When should I be mocking the database, filesystem etc.?
    • When you need the speed and are happy to accept the lower confidence.
    • Also, if they are external to your application or not completely under your application’s control e.g. a database that is touched by multiple apps and your app doesn’t run migrations on it and control the schema.
  • How do I ensure that my application is tested consistently?
    • Come up with a testing strategy and stick with it. Adjust it over time as you learn new things though.
    • Don’t be afraid to use different styles of test as appropriate - e.g. the bulk of tests might be subcutaneous, but you might decide to write lower level unit tests for complex logic.

In closing, I wanted to show a couple of quotes that I think are relevant:

Fellow Readifarian Kahne Raja recently said this on an internal Yammer discussion and I really identify with it:

“We should think about our test projects like we think about our solution projects. They involve complex design patterns and regular refactoring.”

Another Readifarian, Pawel Pabich, made the important point that:

“[The tests you write] depend[s] on the app you are writing. [A] CRUD Web app might require different tests than a calculator.”

I also like this quote from Kent Beck:

“I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence.”

Read More