MsDeploy to Azure Web App with Application Insights extension enabled when deleting additional destination files

When performing an MsDeploy to an Azure Web App with the App Insights extension enabled, you may find something interesting happens if you use the option to delete additional files on the destination that don’t appear in the source. If you look at the deployment log you may see something like this:

2017-01-30T07:29:27.5515545Z Info: Deleting file ({sitename}\ApplicationInsights.config).
2017-01-30T07:29:27.5515545Z Info: Deleting file ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.2.2.0\Microsoft.ApplicationInsights.2.2.0.nupkg).
2017-01-30T07:29:27.5515545Z Info: Deleting directory ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.2.2.0).
2017-01-30T07:29:27.5515545Z Info: Deleting file ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.Agent.Intercept.2.0.6\Microsoft.ApplicationInsights.Agent.Intercept.2.0.6.nupkg).
2017-01-30T07:29:27.5515545Z Info: Deleting directory ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.Agent.Intercept.2.0.6).
2017-01-30T07:29:27.5515545Z Info: Deleting file ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.Azure.WebSites.2.2.0\Microsoft.ApplicationInsights.Azure.WebSites.2.2.0.nupkg).
2017-01-30T07:29:27.5515545Z Info: Deleting directory ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.Azure.WebSites.2.2.0).
2017-01-30T07:29:27.5525645Z Info: Deleting file ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.DependencyCollector.2.2.0\Microsoft.ApplicationInsights.DependencyCollector.2.2.0.nupkg).
2017-01-30T07:29:27.5525645Z Info: Deleting directory ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.DependencyCollector.2.2.0).
2017-01-30T07:29:27.5525645Z Info: Deleting file ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.PerfCounterCollector.2.2.0\Microsoft.ApplicationInsights.PerfCounterCollector.2.2.0.nupkg).
2017-01-30T07:29:27.5525645Z Info: Deleting directory ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.PerfCounterCollector.2.2.0).
2017-01-30T07:29:27.5525645Z Info: Deleting file ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.Web.2.2.0\Microsoft.ApplicationInsights.Web.2.2.0.nupkg).
2017-01-30T07:29:27.5525645Z Info: Deleting directory ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.Web.2.2.0).
2017-01-30T07:29:27.5525645Z Info: Deleting file ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.WindowsServer.2.2.0\Microsoft.ApplicationInsights.WindowsServer.2.2.0.nupkg).
2017-01-30T07:29:27.5525645Z Info: Deleting directory ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.WindowsServer.2.2.0).
2017-01-30T07:29:27.5525645Z Info: Deleting file ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel.2.2.0\Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel.2.2.0.nupkg).
2017-01-30T07:29:27.5525645Z Info: Deleting directory ({sitename}\App_Data\packages\Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel.2.2.0).
2017-01-30T07:29:27.5525645Z Info: Deleting file ({sitename}\App_Data\packages\Microsoft.Web.Infrastructure.1.0.0.0\Microsoft.Web.Infrastructure.1.0.0.0.nupkg).
2017-01-30T07:29:27.5525645Z Info: Deleting directory ({sitename}\App_Data\packages\Microsoft.Web.Infrastructure.1.0.0.0).
2017-01-30T07:29:27.5535680Z Info: Deleting directory ({sitename}\App_Data\packages).
2017-01-30T07:29:27.5535680Z Info: Deleting file ({sitename}\bin\Microsoft.AI.Agent.Intercept.dll).
2017-01-30T07:29:27.5535680Z Info: Deleting file ({sitename}\bin\Microsoft.AI.DependencyCollector.dll).
2017-01-30T07:29:27.5535680Z Info: Deleting file ({sitename}\bin\Microsoft.AI.HttpModule.dll).
2017-01-30T07:29:27.5535680Z Info: Deleting file ({sitename}\bin\Microsoft.AI.PerfCounterCollector.dll).
2017-01-30T07:29:27.5535680Z Info: Deleting file ({sitename}\bin\Microsoft.AI.ServerTelemetryChannel.dll).
2017-01-30T07:29:27.5535680Z Info: Deleting file ({sitename}\bin\Microsoft.AI.Web.dll).
2017-01-30T07:29:27.5535680Z Info: Deleting file ({sitename}\bin\Microsoft.AI.WindowsServer.dll).
2017-01-30T07:29:27.5535680Z Info: Deleting file ({sitename}\bin\Microsoft.ApplicationInsights.AzureWebSites.dll).
2017-01-30T07:29:27.5535680Z Info: Deleting file ({sitename}\bin\Microsoft.ApplicationInsights.dll).

The cool thing about this is that it gives you an indication of what the extension actually does. The fact that there is an App_Data\packages folder containing what are clearly unpacked NuGet packages tells us that the extension is installing NuGet packages into your site for you. That makes a lot of sense given you don’t need to install the extension if you’ve installed the NuGet package yourself (I generally don’t bother because I don’t need App Insights locally and see it as a deployment concern, so I like App Service adding it for me :)).

The MsDeploy option to delete extraneous files is very useful, so it’s not something I want to simply turn off. However, some knowledge of MsDeploy points to a possible solution: we can make use of the skip option to tell MsDeploy to ignore the affected files above.

Putting it all together, if you specify the following rules in your msdeploy.exe call then you should have success:

-skip:objectname='filePath',absolutepath='ApplicationInsights.config' -skip:objectname='dirPath',absolutepath='App_Data\\packages\\*.*' -skip:objectname='filePath',absolutepath='bin\\Microsoft.AI.*.dll'  -skip:objectname='filePath',absolutepath='bin\\Microsoft.ApplicationInsights.*.dll'
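
For context, here’s a hedged sketch of where those rules sit in a full msdeploy.exe call against an Azure Web App – it’s a single command wrapped over multiple lines for readability, and the package name, site name and credentials are placeholders you’d normally lift from your publish profile or existing deployment script:

msdeploy.exe -verb:sync
    -source:package="MySite.zip"
    -dest:auto,computerName="https://{sitename}.scm.azurewebsites.net:443/msdeploy.axd?site={sitename}",userName="{publishUserName}",password="{publishPassword}",authType="Basic"
    -skip:objectname='filePath',absolutepath='ApplicationInsights.config'
    -skip:objectname='dirPath',absolutepath='App_Data\\packages\\*.*'
    -skip:objectname='filePath',absolutepath='bin\\Microsoft.AI.*.dll'
    -skip:objectname='filePath',absolutepath='bin\\Microsoft.ApplicationInsights.*.dll'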

After doing that your deployment log should look something like this:

2017-01-30T08:10:26.7172425Z Info: Object filePath ({sitename}\ApplicationInsights.config) skipped due to skip directive 'CommandLineSkipDirective 1'.
2017-01-30T08:10:26.7182428Z Info: Object dirPath ({sitename}\App_Data\packages) skipped due to skip directive 'CommandLineSkipDirective 2'.
2017-01-30T08:10:26.7192429Z Info: Object filePath ({sitename}\bin\Microsoft.AI.Agent.Intercept.dll) skipped due to skip directive 'CommandLineSkipDirective 3'.
2017-01-30T08:10:26.7192429Z Info: Object filePath ({sitename}\bin\Microsoft.AI.DependencyCollector.dll) skipped due to skip directive 'CommandLineSkipDirective 3'.
2017-01-30T08:10:26.7192429Z Info: Object filePath ({sitename}\bin\Microsoft.AI.HttpModule.dll) skipped due to skip directive 'CommandLineSkipDirective 3'.
2017-01-30T08:10:26.7192429Z Info: Object filePath ({sitename}\bin\Microsoft.AI.PerfCounterCollector.dll) skipped due to skip directive 'CommandLineSkipDirective 3'.
2017-01-30T08:10:26.7192429Z Info: Object filePath ({sitename}\bin\Microsoft.AI.ServerTelemetryChannel.dll) skipped due to skip directive 'CommandLineSkipDirective 3'.
2017-01-30T08:10:26.7202428Z Info: Object filePath ({sitename}\bin\Microsoft.AI.Web.dll) skipped due to skip directive 'CommandLineSkipDirective 3'.
2017-01-30T08:10:26.7202428Z Info: Object filePath ({sitename}\bin\Microsoft.AI.WindowsServer.dll) skipped due to skip directive 'CommandLineSkipDirective 3'.
2017-01-30T08:10:26.7202428Z Info: Object filePath ({sitename}\bin\Microsoft.ApplicationInsights.AzureWebSites.dll) skipped due to skip directive 'CommandLineSkipDirective 4'.
2017-01-30T08:10:26.7202428Z Info: Object filePath ({sitename}\bin\Microsoft.ApplicationInsights.dll) skipped due to skip directive 'CommandLineSkipDirective 4'.

What to do when accidentally deleting App Insights files

If you find that you have accidentally deleted the App Insights files, you simply need to delete the App Insights extension and then re-add it; everything should work again.

Adding the SDK to your application

If you eventually end up adding the App Insights SDK to your application then take note: the version of the extension and the version of the SDK DLLs need to match, otherwise you’ll get a version mismatch exception on app startup.

You still need to install the extension because the SDK alone doesn’t collect all of the information.

Whitepaper: Managing Database Schemas in a Continuous Delivery World

A whitepaper I wrote for my employer, Readify, just got published. Feel free to check it out. I’ve included the abstract below.

One of the trickier technical problems to address when moving to a continuous delivery development model is managing database schema changes. It’s much harder to roll back or roll forward database changes compared to software changes since a database, by definition, has state. Typically, organisations address this problem by having database administrators (DBAs) manually apply changes so they can manually correct any problems, but this has the downside of creating a bottleneck for deploying changes to production and also introduces human error as a factor.

A large part of continuous delivery involves the setup of a largely automated deployment pipeline that increases confidence and reduces risk by ensuring that software changes are deployed consistently to each environment (e.g. dev, test, prod).

To fit in with that model it’s important to automate database changes so that they are applied automatically and consistently to each environment thus increasing the likelihood of problems being found early and reducing the risk associated with database changes.

This report outlines an approach to managing database schema changes that is compatible with a continuous delivery development model, the reasons why the approach is important and some of the considerations that need to be made when taking this approach.

The approaches discussed in this document aren’t specific to continuous delivery and in fact should be considered regardless of your development model.

Review of: J.B. Rainsberger – Integrated Tests Are A Scam

This post discusses the talk “Integrated Tests Are A Scam” by J.B. Rainsberger, which was given in November 2013. See my introduction post to get the context behind this post and the other posts I have written in this series.

Overview

There are a couple of good points in this talk and also quite a few points I disagree with.

The tl;dr of this presentation is that J.B. Rainsberger often sees people try to solve the problem of having “100% of tests pass, but there is still bugs” by adding integrated tests (note: “integrated” isn’t misspelt – he classifies integrated tests differently from integration tests). He likens this to using “aspirin that gives you a bigger headache” and says you should instead write isolated tests that test a single object at a time, with matching collaboration and contract tests on each side of the interactions that object has with its peers. He then uses mathematical induction to argue that this will completely test each layer of the application with fast, isolated tests across all logic branches using O(n) tests rather than O(n!).
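
To make that claim concrete with some illustrative, made-up numbers (these aren’t from the talk): a request that flows through 3 layers, each containing 5 branches, has 5 × 5 × 5 = 125 distinct end-to-end paths for integrated tests to cover, whereas testing each layer in isolation needs roughly 3 × 5 = 15 tests, plus the matching collaboration and contract tests at the boundaries between layers. Add another layer or a few more branches and the first number explodes while the second grows linearly – that’s the O(n!) vs O(n) argument in a nutshell.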

He says that integrated tests are a scam because they result in you:

  • designing code more sloppily
  • making more mistakes
  • writing fewer tests because it’s harder to write the tests due to the emerging design issues
  • thus being more likely to have a situation where “100% of tests pass, but there is still bugs”

Integrated test definition

He defines an integrated test as a test that touches “a cluster of objects” and a unit test as a test that touches just a single object and is “isolated”. By this definition, and via the premise of his talk, every time you write a unit test that touches multiple objects you aren’t writing a unit test; you are writing a “self-replicating virus that invades your projects, that threatens to destroy your codebase, that threatens your sanity and your life”. I suspect that some of that sentiment is sensationalised for the purposes of making a good talk (he has even admitted to happily writing tests at a higher level), but the talk is a very popular one and it presents a very one-sided view, so I feel it’s important to point that out.

I disagree with that definition of a unit test and I think that strict approach will lead not only to writing more tests than are needed, but also to tying a lot of your tests to implementation details, which makes the tests much more fragile and less useful.

Side note: I’ll be using J.B. Rainsberger’s definitions of integrated and unit tests for this post to provide less confusing discussion in the context of his presentation.

Integrated tests and design feedback

The hypothesis that you get less feedback about the design of your software from integrated tests and thus they will result in your application becoming a mess is pretty sketchy in my opinion. Call me naive, but if you are in a team that lets the code get like that with integrated tests then I think that same team will have the same result with more fine-grained tests. If you aren’t serious about refactoring your code (regardless of whether or not you use TDD and have an explicit refactor step in your process) then yeah, it’s going to get bad. In my experience, you still get design feedback when writing your test at a higher level of your application stack with real dependencies underneath (apart from mocking dependencies external to your application) and implementing the code to make it pass.

There is a tieback here to the role of TDD in software design and I think that TDD still helps with design when writing tests that encapsulate more than a single object; it influences the design of your public interface from the level you are testing against and everything underneath is just an implementation detail of that public interface (more on this in other posts in the series).

It’s worth noting that I’m coming from the perspective of a statically typed language where I can safely create implementation details without needing fine-grained tests to cover every little detail. I can imagine that in situations where you are writing in a dynamic language you might feel the need to make up for the lack of a compiler by writing more fine-grained tests.

This is one of the reasons why I have a preference for statically typed languages – the compiler obviates the need for writing mundane tests to check things like property names are correct (although some people still like to write these kinds of tests).

If you are using a dynamic language and your application structure isn’t overly complex (i.e. it’s not structured with layer upon layer upon layer) then you can probably still test from a higher level without too much pain. For instance, I’ve successfully written Angular JS applications where I’ve tested services from the controller level (with real dependencies).

It’s also relevant to consider the points that @dhh makes in his post about test-induced design damage and the subsequent TDD is dead video series.

Integrated tests and identifying problems

J.B. Rainsberger says a big problem with integrated tests is that when they fail you have no idea where the problem is. I agree that by glancing at the name of the test that’s broken you might not immediately know which line of code is at fault.

If you structure your tests and code well then usually when there is a test failure in a higher level test you can look at the exception message to get a pretty good idea, unless it’s a generic exception like a NullReferenceException. In those scenarios you can spend a little bit more time and look at the stack trace to nail down the offending line of code. This is slower, but I personally think that the trade-off you get (as discussed throughout this series) is worth this small increase in diagnosis time.

Motivation to write integrated tests

J.B. Rainsberger puts forward that the motivation people usually have to write integrated tests is to find bugs that unit tests can’t uncover by testing the interaction between objects. While it is a nice benefit to test the code closer to how it’s executed in production, with real collaborators, it’s not the main reason I write tests that cover multiple objects. I write this style of tests because they allow us to divorce the tests from knowing about implementation details and write them at a much more useful level of abstraction, closer to what the end user cares about. This gives full flexibility to refactor code without breaking tests since the tests can describe user scenarios rather than developer-focussed implementation concerns. It’s appropriate to name drop BDD and ATDD here.

Integrated tests necessitate a combinatorial explosion in the number of tests

Lack of design feedback aside, the main criticism that J.B. Rainsberger has against integrated tests is that to test all pathways through a codebase you need to have a combinatorial explosion of tests (O(n!)). While this is technically true I question the practicality of this argument for a well designed system:

  • He seems to suggest that you are going to build software that contains a lot of layers and within each layer you will have a lot of branches. While I’m sure there are examples out there like that, most of the applications I’ve been involved with can be architected to be relatively flat.
  • It’s possible that I just haven’t written code for the right industries and thus haven’t come across the kinds of scenarios he is envisaging, but at best it demonstrates that his sweeping statements don’t always apply and you should take a pragmatic approach based on your codebase.
  • Consider the scenario where you add a test for new functionality against the behaviour of the system from the user’s perspective (e.g. a BDD style test for each acceptance criteria in your user story). In that scenario, then, being naive for a moment, all code that you add could be tested by these higher level tests.
  • Naivety aside, you will add code that doesn’t directly relate to the acceptance criteria – this might be infrastructure code, defensive programming, logging etc. – and in those cases I think you just need to evaluate how important it is to test that code:
    • Sometimes the code is very declarative and obviously wrong or right – in those instances, where there are unlikely to be complex interactions with other parts of the code (null checks being a great example), I generally don’t think it needs to be tested
    • Sometimes it’s common code that will necessarily be tested by any integrated (or UI) tests you do have anyway
    • Sometimes it’s code that is complex or important enough to warrant specific, “implementation focussed”, unit tests – add them!
    • If such code didn’t warrant a test and later turns out to introduce a bug then that gives you feedback that it wasn’t that obvious after all, and at that point you can introduce a breaking test so it never happens again (before fixing it – that’s the first step you take when you have a bug, right?)
    • If the above point makes you feel uncomfortable then you should go and look up continuous delivery and work on ensuring your team works towards the capability to deliver code to production at any time so rolling forward with fixes is fast and efficient
    • It’s important to keep in mind that the point of testing generally isn’t to get 100% coverage, it’s to give you confidence that your application is going to work – I talk about this more later in the series – I can think of industries where this is probably different (e.g. healthcare, aerospace) so as usual be pragmatic!
  • There will always be exceptions to the rule: if you find a part of the codebase that is more complex and does require a lower level test to feasibly cover all of the combinations then do it – that doesn’t mean you should write those tests for the whole system though.

Contract and collaboration tests

The part of J.B. Rainsberger’s presentation that I did like was his solution to the “problem”. While I think it’s fairly basic, common-sense advice that a lot of people probably follow, I still think it’s good advice.

He describes that, where you have two objects collaborating with each other, you might consider one object to be the “server” and the other the “client” and the server can be tested completely independently of the client since it will simply expose a public API that can be called. In that scenario, he suggests that the following set of tests should be written:

  • The client should have a “collaboration” test to check that it asks the right questions of the next layer by using an expectation on a mock of the server
  • The client should also have a set of collaboration tests to check it responds correctly to the possible responses that the server can return (e.g. 0 results, 1 result, a few results, lots of results, throws an exception)
  • The server should have a “contract” test to check that it tries to answer the question from the client that matches the expectation in the client’s collaboration test
  • The server should also have a set of contract tests to check that it can reply correctly with the same set of responses tested in the client’s collaboration tests

While I disagree with applying this class-by-class through every layer of your application, I think that you can and should still apply it at any point where you need to make a break between two parts of your code that you want to test independently. This type of approach also works well when testing across separate systems/applications. When testing across systems it’s worth looking at consumer-driven contracts and in particular at Pact (.NET version).
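
To illustrate the shape of the collaboration/contract split described above, here’s a rough sketch in PowerShell/Pester (v4-style) terms – the function names are made up, and in a .NET codebase you’d express the same idea with your unit test framework and a mocking library, but the structure is the same:

# “Server”: owns the real behaviour and gets contract tests.
function Get-SearchResults([string]$Query) { @("result one", "result two") }

# “Client”: collaborates with the server and gets collaboration tests.
function Get-SearchSummary([string]$Query) {
    $results = @(Get-SearchResults -Query $Query)
    "Found $($results.Count) results"
}

Describe "Client collaboration tests" {
    It "asks the server the right question" {
        Mock Get-SearchResults { @() }
        Get-SearchSummary -Query "cats" | Out-Null
        Assert-MockCalled Get-SearchResults -Times 1 -Exactly -ParameterFilter { $Query -eq "cats" }
    }
    It "copes with the server returning no results" {
        Mock Get-SearchResults { @() }
        Get-SearchSummary -Query "cats" | Should -Be "Found 0 results"
    }
}

Describe "Server contract tests" {
    It "can answer the question the client asks" {
        @(Get-SearchResults -Query "cats").Count | Should -BeGreaterThan 0
    }
}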

My stance on Azure Worker Roles

tl;dr 99% of the time Worker Role is not the right solution. Read on for more info.

Worker Role Deployments

I quite often get asked by people about the best way to deploy Worker Roles because it is a pain – as an Azure Cloud Service the deployment time of a Worker Role is 8-15+ minutes. In the age of continuous delivery and short feedback loops this is unacceptable (as I have said all along).

On the surface though, Worker Roles are the most appropriate and robust way to deploy heavy background processing workloads in Azure. So what do we do?

Web Jobs

The advice I generally give people when deploying websites to Azure is to use Azure Web Sites unless there is something that requires them to use Web Roles (and use Virtual Machines as a last resort). That way you are left with the best possible development, deployment, debugging and support experience for your application.

Now that Web Jobs have been released for a while and have a level of maturity and stability I have been giving the same sort of advice when it comes to background processing: if you have a workload that can run on the Azure Web Sites platform (e.g. doesn’t need registry/COM+/GDI+/elevated privileges/custom software installed/mounted drives/Virtual Network/custom certificates etc.) and it doesn’t have intensive CPU or memory resource usage then use Web Jobs.

I should note that when deploying Web Jobs you can deploy them automatically using the WebJobsVs Visual Studio extension.
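
Under the covers a Web Job is just an executable or script that gets copied to a conventional folder under your site root, which is part of why they’re so easy to deploy alongside a web application. A rough illustration of that convention (the job names here are made up):

app_data\jobs\continuous\MyWorker\MyWorker.exe          # runs continuously
app_data\jobs\triggered\MyCleanupJob\MyCleanupJob.exe   # runs on demand or on a schedule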

As a side note: some of my colleagues at Readify have recently started using Web Jobs as a platform for deploying Microservices in asynchronous messaging based systems. It’s actually quite a nice combination because you can put any configuration / management / monitoring information associated with the micro-service in the web site portion of the deployment and it’s intrinsically linked to the Web Job in both the source code and the deployment.

Worker Roles

If you are in a situation where you have an intense workload, you need to scale the workload independently of your Web Sites instances, or your workload isn’t supported by the Azure Web Sites platform (and thus can’t be run as a Web Job), then you need to start looking at Worker Roles or some other form of background processing.

Treat Worker Roles as infrastructure

One thing that I’ve been trying to push for a number of years now (particularly via my AzureWebFarm and AzureWebFarm.OctopusDeploy projects) is for people to think of Cloud Services deployments as infrastructure rather than applications.

With that mindset shift, Cloud Services becomes amazing rather than a deployment pain:

  • Within 8-15+ minutes a number of customised, RDP-accessible Virtual Machines are provisioned for you behind a static IP address; those machines can be scaled up or down at any time, have health monitoring and diagnostics capabilities built in, sit behind a powerful load balancer, and give you the ability to arbitrarily install software or perform configuration with elevated privileges!
  • To reiterate: waiting 8-15+ minutes for a VM to be provisioned is amazing; waiting 8-15+ minutes for the latest version of your software application to be deployed is unacceptably slow!

By treating Cloud Services as stateless, scalable infrastructure you will rarely perform deployments and the deployment time is then a non-issue – you will only perform deployments when scaling up or rolling out infrastructure updates (which should be a rare event and if it rolls out seamlessly then it doesn’t matter how long it takes).

Advantages of Web/Worker Roles as infrastructure

  • As above, slow deployments don’t matter since they are rare events that should be able to happen seamlessly without taking out the applications hosted on them.
  • As above, you can use all of the capabilities available in Cloud Services.
  • Your applications don’t have to have a separate Azure project in them making the Visual Studio solution simpler / load faster etc.
  • Your applications don’t have any Azure-specific code in them (e.g. CloudConfigurationManager, RoleEnvironment, RoleEntryPoint, etc.) anymore
    • This makes your apps simpler and also means that you aren’t coding anything in them that indicates how/where they should be deployed – this is important and how it should be!
    • It also means you can deploy the same code on-premise and in Azure seamlessly and easily

How do I deploy a background processing workload to Worker Role as infrastructure?

So how does this work you might ask? Well, apart from rolling your own code in the Worker Role to detect, deploy and run your application(s) (say, from blob storage) you have two main options that I know of (both of which are open source projects I own along with Matt Davies):

  • AzureWebFarm and its background worker functionality
    • This would see you deploying the background work as part of MSDeploying a web application and it works quite similarly to (but is admittedly probably less robust than) Web Jobs – this is suitable for light workloads
  • AzureWebFarm.OctopusDeploy and using OctopusDeploy to deploy a Windows Service
    • In general I recommend using Topshelf to develop Windows Services because it allows a nicer development experience (a single console app project that you can F5) and deployment experience (you pass an install argument to install it – see the sketch after this list)
    • You should be able to deploy heavyweight workloads using this approach (just make sure your role size is suitable)
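
To make the Topshelf experience above concrete, here’s a rough sketch of the commands involved (MyService.exe is a placeholder for your Topshelf console app; F5 in Visual Studio just runs it as a normal console app):

# On the server, install and start it as a Windows Service:
.\MyService.exe install
.\MyService.exe start

# And to remove it again:
.\MyService.exe stop
.\MyService.exe uninstall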

The thing to note about both of these approaches is that you are actually using Web Roles, not Worker Roles! This is fine because there isn’t actually any difference between them apart from the fact that Web Roles have IIS installed and configured. If you don’t want anyone to access the servers over HTTP because they are only used for background processing then simply don’t expose a public endpoint.

So, when should I actually use Worker Roles (aka you said they aren’t applicable 99% of the time – what about the other 1%)?

OK, so there are definitely some situations I can think of, and have come across occasionally, that warrant the application actually being coded as a Worker Role – remember to be pragmatic and use the right tool for the job! Here are some examples (but it’s by no means an exhaustive list):

  • You need the role to restart if there are any uncaught exceptions
  • You need the ability to control the server as part of the processing – e.g. request the server start / stop
  • You want to connect to internal endpoints in a cloud service deployment or do other complex things that require you to use RoleEnvironment
  • There isn’t really an application-component (or it’s tiny) – e.g. you need to install a custom application when the role starts up and then you invoke that application in some way

What about Virtual Machines?

Sometimes Cloud Services aren’t going to work either – in a scenario where you need persistent storage and can’t code your background processing to be stateless via RoleEntryPoint then you might need to consider standing up one or more Virtual Machines. If you can avoid this at all then I highly recommend doing so, since otherwise you have to maintain the VMs yourself rather than using a managed service.

Other workloads

This post is targeted at the types of background processing workloads you would likely deploy to a Worker Role. There are other background processing technologies in Azure that I have deliberately not covered in this post such as Hadoop.

TeamCity deployment pipeline (part 3: using OctopusDeploy for deployments)

This post outlines how using OctopusDeploy for deployments can fit into a TeamCity continuous delivery deployment pipeline.

Maintainable, large-scale continuous delivery with TeamCity series

This post is part of a blog series jointly written by myself and Matt Davies called Maintainable, large-scale continuous delivery with TeamCity:

  1. Intro
  2. TeamCity deployment pipeline
  3. Deploying Web Applications
    • MsDeploy (onprem and Azure Web Sites)
    • OctopusDeploy (nuget)
    • Git push (Windows Azure Web Sites)
  4. Deploying Windows Services
    • MsDeploy
    • OctopusDeploy
    • Git push (Windows Azure Web Sites Web Jobs)
  5. Deploying Windows Azure Cloud Services
    • OctopusDeploy
    • PowerShell
  6. How to choose your deployment technology

Using another tool for deployments

If you can have a single tool to create your deployment pipeline, covering continuous integration as well as deployments, then there are obvious advantages in terms of simplicity of configuration and management (a single set of project definitions, user accounts and permissions, a single UI to learn, etc.). This is one of the reasons we created this blog series; we loved how powerful TeamCity is out of the box and wanted to expose that awesomeness so that other people could experience what we were experiencing.

Lately, we have also been experimenting with combining TeamCity with another tool to take care of the deployments: OctopusDeploy. There are a number of reasons we have been looking at it:

  • Curiosity; OctopusDeploy is getting a lot of attention in the .NET community so it’s hard not to notice it – it’s worth looking at it to see what it does
  • If you are coordinating complex deployments then OctopusDeploy takes care of managing that complexity and the specification of the deployment process in a manageable way (that would start to get complex with TeamCity)
    • For instance, if you need to perform database migrations then deploy multiple websites and a background worker together then OctopusDeploy makes this easy
  • Visualising the deployment pipeline is much easier in OctopusDeploy, which is great when you are trying to get non-technical product owners involved in deployments
  • It gives you a lot more flexibility around performing deploy-time actions out-of-the-box and makes it easier to do build once packages
    • It is possible to do the same things with MsDeploy, but it is more complex to do
  • It has great documentation
  • It comes with plugins that make using it with TeamCity a breeze

We wouldn’t always use OctopusDeploy exclusively, but it’s definitely a tool worth having a good understanding of to use judiciously because it can make your life a lot easier.

Generating the package

OctopusDeploy uses the NuGet package format to wrap up deployments; there are a number of ways you can generate the NuGet package in TeamCity:

  • OctoPack (if you install OctoPack into your project then it’s dead easy to get the NuGet package – you can install the plugin to invoke it or pass through /p:RunOctoPack=true to MSBuild when you build your solution/project if using .NET)
  • TeamCity NuGet Pack step (if you are using a custom .nuspec file then it’s easy to package that up and automatically get it as an artifact by using TeamCity’s NuGet Pack step)
  • PowerShell or another scripting language (if you need more flexibility you can create a custom script to run that will package up the files)
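
As a rough example of the first two options (the solution, .nuspec and version numbers are placeholders):

# OctoPack – assumes the OctoPack NuGet package is installed in the project being packaged
msbuild MySolution.sln /p:Configuration=Release /p:RunOctoPack=true /p:OctoPackPackageVersion=1.2.3

# Custom .nuspec via nuget.exe (TeamCity’s NuGet Pack step wraps up the same thing)
nuget pack MyApp.nuspec -Version 1.2.3 -OutputDirectory artifacts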

Publishing the package to a NuGet feed so OctopusDeploy can access it

In order for OctopusDeploy to access the deployment packages they need to be published to a NuGet feed that the OctopusDeploy server can access. You have a number of options of how to do this from TeamCity:

  • Publish the package to TeamCity’s NuGet feed (this is easy – simply include the .nupkg file as an artifact and it will automatically be published to its feed)
  • Publish the package to OctopusDeploy’s internal NuGet server
  • Publish the package to some other NuGet server you set up that OctopusDeploy can access
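
If you go with one of the push-based options then the command from a TeamCity build step might look something like this (the feed URL and API key are placeholders – OctopusDeploy’s built-in repository and any custom feed will each have their own URL and key):

nuget push artifacts\MyApp.1.2.3.nupkg -Source https://deploy.example.com/nuget/packages -ApiKey API-XXXXXXXXXXXXXXXX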

Automating releases and deployments

In order to create releases and trigger deployments of those releases from TeamCity there are a number of options:

Announcing AzureWebFarm.OctopusDeploy

I’m very proud to announce the public 1.0 release of a new project that I’ve been working on with Matt Davies over the last couple of weeks – AzureWebFarm.OctopusDeploy.

This project allows you to easily create an infinitely-scalable farm of IIS 8 / Windows Server 2012 web servers using Windows Azure Web Roles that are deployed to by an OctopusDeploy server.

If you haven’t used OctopusDeploy before then head over to the homepage and check it out now because it’s AMAZING.

The installation instructions are on the AzureWebFarm.OctopusDeploy homepage (including a screencast!), but in short it amounts to:

  1. Configure a standard Web Role project in Visual Studio
  2. Install-Package AzureWebFarm.OctopusDeploy
  3. Configure 4 cloud service variables – OctopusServer, OctopusApiKey, TentacleEnvironment and TentacleRole
  4. Deploy to Azure and watch the magic happen!
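
For step 3, those variables end up as settings in your cloud service configuration (.cscfg) – a hedged sketch with placeholder values and an abbreviated role element:

<ServiceConfiguration serviceName="MyCloudService" ...>
  <Role name="MyWebRole">
    <ConfigurationSettings>
      <Setting name="OctopusServer" value="https://octopus.example.com" />
      <Setting name="OctopusApiKey" value="API-XXXXXXXXXXXXXXXX" />
      <Setting name="TentacleEnvironment" value="Production" />
      <Setting name="TentacleRole" value="web-server" />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>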

We also have a really cool logo that a friend of ours, Aoife Doyle, drew and graciously let us use!
(AzureWebFarm.OctopusDeploy logo)

It’s been a hell of a lot of fun developing this project as it’s not only been very technically challenging, but the end result is just plain cool! In particular the install.ps1 file for the NuGet package was very fun to write and results in a seamless installation experience!

Also, a big thanks to Nicholas Blumhardt, who gave me some assistance with a few difficulties I had with Octopus and implemented a new feature I needed really quickly!

Using Pull Requests for commercial/private/proprietary development

Developers who are familiar with open source will likely be aware of pull requests, which were created and popularised by GitHub as a way of providing some automation, visibility and social interaction around merging software changes. It replaces the old school notion of emailing patch files to each other as well as providing more visibility and interaction over pushing to the same branch (aka mainline development).

This is a post I’ve been meaning to write for a while. As a Senior Consultant for Readify I find myself spending a lot of time mentoring teams and as a part of that I’m constantly paying attention to trends I notice and picking up / experimenting with techniques and processes that I can introduce to teams to get positive outcomes (improved efficiency, quality, maintainability, collaboration etc.). In my experience so far, the single most effective change I have introduced to a software team to improve the quality of its software output is pull requests. It also has a positive effect on other things like improved collaboration and collective code ownership.

I can’t claim to have invented this idea – I got the idea from listening to the experiences that my fellow consultants Graeme Foster and Jake Ginnivan had and there are certainly examples of companies using it, not the least of which is GitHub itself.

What are pull requests?

Essentially, a pull request allows a developer to make some commits, push those commits to a branch they have access to, and create a request against a repository that they don’t necessarily have commit access to (though they might), asking for their commit(s) to be merged in. Depending on the software that you are using, a pull request will generally be represented as a diff, a list of the commits and a dialogue of comments both against the pull request itself and against individual lines of code in the diff.

Why are pull requests useful for open source?

Pull Requests are awesome for open source because they allow random third parties to submit code changes easily and effectively to projects that they don’t usually have commit access to and for project maintainers to have a conversation with the contributor to get the changes in a state where they can be merged.

This lowers the barrier of entry to both maintainers and contributors, thus making it easier for everyone involved :).

Why are pull requests useful for commercial development?

On first thought you might think that commercial development is significantly different from open source since the people submitting code changes are on the same team and there will be existing code review processes in place to deal with code quality.

Based on my experiences, and those of other people I know, there are definitely a range of advantages to using pull requests for commercial development (in no particular order):

  • If you have enough developers and a big enough project/product then you might actually have a similar setup to an open source project in that the product might have a “core team” that looks after the product, maintaining standards and consistency, and other developers in the company can then submit pull requests to be reviewed by the core team
  • Pull requests give a level playing field to the whole team – it encourages more junior or shy developers to “safely” review and comment on commits by more senior developers since it takes away the intimidation of doing it in person
  • It provides a platform for new developers to get up-skilled quicker by providing:
    • Easy to digest, focussed examples of how to implement certain features that they can browse easily without having to ask how to find them
    • Confidence to raise pull requests without the stress of needing to know if the code is OK
    • Line-by-line comments to help them identify what they need to change to get their code “up to scratch”
    • This has a positive effect on the new developer themselves as well as the team since the learning is founded in practical application (which in my view is always the most efficient way to learn) and the person potentially needs less time from the team to get up to speed (thus having a smaller “burden”)
  • In much the same way as it helps with up-skilling it helps with learning for more junior developers
  • It improves the safety of making changes to the codebase by ensuring that changes are looked at by at least one person before being merged into mainline
  • It improves the consistency/maintainability and quality of the codebase by ensuring all changes are reviewed by at least one person
  • Changes that would normally slip through code review – because they were minor enough that they didn’t warrant rejecting the review or going back and fixing what was already there – are more likely to get fixed before being merged in
  • There is tooling out there (e.g. TeamCity’s pull request integration) that can integrate with pull requests to provide assurances that the pull request is OK to merge (and potentially automatically merge it for you)
  • The ability to have threaded comments on individual lines of code makes it really easy to have a contextual conversation about parts of the code and arrive at better designs because of it
  • Despite what code review processes you might have in place I suspect that most of the time it’s more likely code will get reviewed using pull requests because in my experience code reviews don’t always happen regardless of what development processes exist, whereas a pull request would always be reviewed before merging
  • Pull requests are pull-based rather than push-based i.e. the reviewer can review the code when they are not in the middle of something so there is less context-switching overhead (note: important/blocking pull requests might be a reason to interrupt someone still, but in my experience those pull requests are not the majority)
  • If you are working with developers across multiple timezones then the above point means that code reviews are easier to perform since they are pull-based
  • You can use pull requests to raise work-in-progress for early feedback (reducing feedback cycles :)) or for putting up the result of spikes for team comment
  • If approached in the right way then it helps teams with collective code ownership because there is a lot of back-and-forth collaboration over the code and everyone in the team is free to review any pull request they are interested in

As you can see – it’s a pretty convincing list of points and this summarises pretty well why pull requests are the most effective change I’ve made to a development team that has affected code quality.

So, nothing’s perfect; what are the downsides/gotchas?

There are only a few I’ve noticed so far:

  • If there is a lot of churn on the codebase from a lot of disparate teams/developers with potentially conflicting, high-priority tasks and there is no “core team” looking after the project then there is a tendency for developers to:
    • Work with other developers in their team to quickly merge the pull request without much inspection
    • Worse still – developers might just merge their own pull requests or bypass them completely and push directly to the branch
    • Acknowledge the review comments, but claim that their work is too high priority to not merge straight away and never get around to making the changes
  • If there is someone in the team that is particularly picky then it’s possible that team members might try and avoid that person reviewing their pull requests by asking someone else to review it
  • If there are a lot of changes to be made based on review comments it can sometimes get tricky to keep track of which ones have and haven’t been applied and/or the pull request could drag on for a long time while the developer completes the changes – there is a balance here between merging in the potentially important work in the pull request and spending time fixing potentially minor problems
  • If pull requests are too big they are harder to review and there will be less review comments as a result and things might get missed (this would be a problem without pull requests, but pull requests simply highlight the problem)
  • The diff UIs on GitHub and Bitbucket frankly suck compared to local diffing tools. Stash’s pull request diff UI is actually really good though; this does make it a bit more difficult for bigger PRs (but then it’s still easy to pull and review the changes locally so you can get around this limitation)
  • Comments on controversial code changes (or sometimes not so controversial changes) can sometimes get a bit out of hand with more comments than lines of code changed in the PR – at this point it’s clear that the back-and-forth commenting has taken the place of what should be an in-person (or if you are remote video) conversation next to a whiteboard
  • You end up with a lot of local and remote branches that clutter things up, but there are techniques to deal with it

These behaviours are generally unhealthy and are likely to be a problem if you don’t have pull requests anyway (and in fact, pull requests simply serve to visualise the problems!) and should be addressed as part of the general continuous improvement that your team should be undergoing (e.g. raise the problems in retrospectives or to management if serious enough). Hopefully, you are in a situation with a self-organising team that will generally do the right things and get on board with the change :).

Pull requests and continuous delivery

I’m extremely passionate about Continuous Delivery and as such I have been in favour of a mainline development approach over feature branches for quite a while. Thus, on first hearing about using pull requests I was initially wary because I mistakenly thought it was a manifestation of feature branches. What I’ve since learnt is that it definitely doesn’t have to be, and in my opinion shouldn’t be, about creating feature branches.

Instead, the same mindset that you bring to mainline development should be taken to pull requests and the same techniques to enable merging of partially complete features in a way that keeps the code both constantly production-ready and continuously integrated should be used. The way I see it developers need to have a mindset of looking at a pull request as being “a set of changes I want to merge to master” rather than “a complete feature ready to be integrated”. This helps keep pull requests small, easy to review and ensures all developers are regularly integrating all changes against the mainline.

Introducing pull requests to the deployment pipeline adds another step that increases the confidence in your production deployments (the whole point of a deployment pipeline) as well as helping make sure that code in the mainline is always production-ready, since the reviewer can (and should) review the changeset for its likely effect on production as well as the usual things you would review.

Side note: I love this git branching model description.

Tips

  • Keep PRs small (I think I’ve explained why above well enough)
  • Keep PRs focussed – a PR should contain one logical set of changes – it makes it easier to review and means if there is a problem with one set of changes that it doesn’t block the other set of changes
  • Delete a remote branch after it has been merged (if using Stash/Bitbucket then always tick the “close this branch after PR is merged” box)
  • Keep commits clean – having a PR isn’t an excuse to be lazy about your commits; use good commit messages, keep related logical changes together in a single commit, squash them together where relevant, use rebasing rather than merge commits and think about the history after the PR is merged – don’t make it harder for yourself or your team in the future
  • Don’t let PRs get stagnant – if you are finished with a task take a look at the PRs and merge a couple before moving on; if everyone does this then there will be a constant flow of merges/integration – if a PR is open for more than a day there is a problem
  • Where possible have one PR/branch per-person – you should be able to clean up your commits and force push to your pull request branch and it encourages the PR to be smaller
    • Be incredibly careful when doing a force push that you are not pushing to master
    • If you can turn off the ability to force push to master (e.g. Bitbucket allows this) then do it – better safe than sorry
  • For those situations where you do need to work closely with someone else and feel like you need a PR across multiple people there are two techniques you can use:
    • Identify and pair together on at least the blocking “integration” work, put up a PR, get it merged and then work in individual PRs after that
    • If necessary then put up a pull request and contribute to the branch of that pull request by individual pull requests (i.e. create a featurex branch with a pull request to master and then create separate featurex-subfeaturea, featurex-subfeatureb branches with pull requests to featurex) – try and avoid this because it increases the size and length of time of the integrated pull request – also, consider rebasing the integrated pull request just before it’s pulled in
    • If you plan on submitting a pull request against someone else’s pull request branch then make sure you tell them so they don’t rebase the commits on you (can you see why I try and avoid this? :P)
  • Make sure that PR merges are done with actual merge commits (the merge button in all of the PR tools I’ve used do this for you) that reference the PR number – it makes it easier to trace them later and provides some extra information about the commits that you can look up (this is the only thing you should use merge commits for in my opinion)
  • If your PR is not urgent then leave it there for someone to merge at their leisure and move on, if it’s urgent/blocking then ask the team if someone can review/merge it straight away (from what I’ve seen this is not the common case – e.g. one or two a day out of 10-15 PRs)
  • Tag your PR title with [WIP] to mark a work-in-progress PR that you are putting up for feedback to make sure you are on the right track and to reduce feedback cycles – [WIP] indicates that it’s open for people to review, but NOT to merge
  • In those rare circumstances when a PR is ready to be merged, but isn’t production-ready (maybe it’s a code clean-up waiting for a successful production release to occur first to render the code being deleted redundant for instance) then tag the PR title with [DO NOT MERGE]
  • Unless you know what you are doing with Git never work off master locally; when starting new work switch to master, pull from origin/master, and create and switch to a new branch to describe your impending PR (see the sketch at the end of these tips) – it makes things a lot easier
    • I almost never do this – I generally just work on master unless editing a previously submitted PR because I find it quicker, but it does require me to constantly be very careful about what branches I’m pushing and pulling from!
  • Call out the good things in the code when reviewing – reviews shouldn’t devolve into just pointing out things you think are wrong otherwise they develop negative connotations – if the PR creator did a good job on a bit of complex code then give them a pat on the back!
  • Similarly, avoid emotive phrases like “ugly”, “gross” etc. when describing code; when I’m reviewing code a technique I use to try and not sound condescending is to ask a question, e.g. “Do you think this should be named XX to make it clearer?” – make the pull request into a two-way conversation; work out the best solution collaboratively as a team
  • If you aren’t sure about something don’t be afraid to ask a question; quite often this will highlight something that the author missed or at the very least help you to learn something
  • Have fun! Use funny gifs, use funny branch names – just because we are developing serious software doesn’t mean we can’t enjoy our jobs!
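
As mentioned in the tips above, a minimal sketch of the “never work off master locally” flow (the branch name is made up):

git checkout master
git pull origin master
git checkout -b add-invoice-export      # small, focussed branch that becomes the PR
# ...make your commits...
git push -u origin add-invoice-export   # then raise the PR from this branch in your hosting tool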

Getting started

I’ve found that the following helps immensely when first getting a team started (but after a while the team is able to sustain itself easily):

  • Having at least one person experienced with Git is really helpful as inevitably people will get themselves confused when first using branches and in particular with rebasing
  • Have at least one person that knows what they are doing ensure that in the first week all PRs are reviewed and merged quickly – this frees the team from needing to worry about keeping on top of merging PRs so they can instead concentrate on the semantics of creating pull requests, responding to review comments and updating PRs. Once they get the hang of it, stop reviewing/merging all but the most important PRs; the team will start to notice there is an increase in open PRs and will hopefully self-organise to get them merged and get into the habit of doing that
    • It’s possible this technique might not always work, but it has when I’ve tried it and it assists the team to slowly get the hang of using pull requests rather than dropping a huge change on them all at once
    • If the team doesn’t self-organise then bring up the fact PRs aren’t getting merged in retrospectives and suggest a possible solution of reviewing open PRs directly after standup and team members volunteer to review/merge each of the open ones
  • Developers will naturally approach pull requests like feature branches, even if told up front to treat them the same as what you would push to mainline with mainline development, because of the natural human desire to get things “right” / finished
    • This will lead to enormous PRs that are constantly conflicted and a real pain in the ass for the developer
    • Let this happen and when the PR is finally merged sit down with them and ask them how they think it could be avoided in the future
    • Hopefully they will come to the realisation the PR was too big themselves and because they have made that realisation based on experience they are more likely to keep PRs small

Tools

There are various tools that can help you use pull requests. The ones I have experience with or knowledge of are:

If you are interested in semver and you want to use pull requests (or you don’t and you are using mainline development) then I encourage you to check out the GitHubFlowVersion project by my friend Jake Ginnivan.

Breaking up software projects into small, focussed milestones

This post highlights a particular learning I’ve had over the last year about setting milestones for organising high-level goals for software projects.

Background

When I am first involved in a project there is undoubtedly a huge list of things that the product owner / client wants and it’s generally the case that there isn’t enough time or money for it all to be all completed. Let’s ignore for a moment the fact that there will be reasons why it’s not sensible to complete it all anyway since the low priority things will be much less important than high priority work on other projects or emergent requirements / user feedback for the current project.

Often, the person/people in question don’t really have a good understanding about what is realistic within a given time period (how could they when it’s hard enough for the development team?) and certainly the traditional way that software development has been executed exacerbates the issue by promising that a requirements specification will be delivered by a certain time. Despite the fact this almost never works people still sometimes expect software to be able to be delivered in this way. Thankfully, Agile software development becoming more mainstream means that I see this mentality occurring less and less.

An approach I’ve often taken in the past to combat this is to begin by getting product owners to prioritise high-level requirements with the MoSCoW system, then take the must-haves and possibly the should-haves, label that as phase 1 and continue to flesh out those requirements into user stories. This then leaves the less important requirements at the bottom of the backlog, giving a clear expectation that any estimation and grooming effort will be expended on the most important features and won’t include “everything” that the person initially had in their vision (since that’s unrealistic).

This is a common-sense approach that most people seem to understand and allows for an open and transparent approach to providing a realistic estimate that can’t be misconstrued as being for everything. Note: I also use other techniques that help set clear expectations like user story mapping and inception decks.

The problem

These “phase 1” milestones have a number of issues:

  • They are arbitrary in make up and length, which results in a lack of focus and makes it easier for “scope creep” to occur
    • While scope-creep isn’t necessarily a problem if it’s being driven by prioritisation from a product owner it does tend to increase feedback cycles for user feedback and makes planning harder
    • Small variations to direction and scope tend to get hidden since they are comparatively small, however the small changes add up over time and can have a very large impact (positive and negative) on the project direction that isn’t intended
  • They tend to still be fairly long (3+ months)
    • This increases the size of estimation errors and the number and size of unknowns
    • I’ve noticed this also reduces the urgency/pace of everyone involved

A different approach

I’ve since learnt that a much better approach is to create really small, focused milestones that are named according to the goal they are trying to meet e.g. Allow all non-commercial customers who only have product X to use the new system (if you are doing a rewrite) or Let customers use feature Y (new feature to a system).

More specifically, having focused milestones:

  • Helps with team morale (everyone can see the end goal within their grasp and can rally around it)
  • Helps frame conversations with the product owner around prioritising stories to keep focus and not constantly increasing the backlog size (and by association how far away the end goal is)
  • Helps create more of a sense of urgency with everyone (dev team, ops, management etc.)
  • Helps encourage more frequent releases and thinking earlier about release strategies to real end users
  • Provides a nice check and balance against the sprint goal – is the sprint goal this sprint something that contributes towards our current milestone and in-turn are all the stories in the sprint backlog contributing to the sprint goal?

The end goal (probably not “end”; there is always room for improvement)

I don’t think that the approach I describe above is necessarily the best way of doing things. Instead I think it is a stepping stone for a number of things that are worth striving for:

Presentation: Moving from Technical Agility to Strategic Agility

I recently gave a presentation with my colleague Jess Panni to the ACS WA Conference about Agile and where we see it heading in the next 5-10 years. When the speaking slot was offered the requirements were that the talk involved data analytics in some way and that it wasn't the same old Agile content everyone has been talking about for years, but rather something a bit different. Both requirements suited me because they fit perfectly with thoughts I'd been having recently about Agile and where it is heading.

Jess and I had a lot of fun preparing the talk – it’s not often we get time to sit down and chat about process, research what the industry leaders are saying and brainstorm our own thoughts and experiences in light of that research. I’m very proud of the content that we’ve managed to assemble and the way we’ve structured it.

We paid particular care to make the slide deck useful for people – there are comprehensive notes on each slide and there are a bunch of relevant references at the end for further reading.

I’ve put the slides up on GitHub if you are interested.

TeamCity deployment pipeline (part 1: structure)

TeamCity (in particular version 7 onwards) makes the creation of continuous delivery pipelines fairly easy. There are a number of different approaches that can be used though and there isn’t much documentation on how to do it.

This post outlines the setup I have used for continuous delivery, as well as the techniques I have used to make it quick and easy to get up and running with new applications and to maintain the build configurations across multiple applications.

While this post is .NET focussed, the concepts here apply to any type of deployment pipeline.

Maintainable, large-scale continuous delivery with TeamCity series

This post is part of a blog series jointly written by myself and Matt Davies called Maintainable, large-scale continuous delivery with TeamCity:

  1. Intro
  2. TeamCity deployment pipeline
  3. Deploying Web Applications
    • MsDeploy (onprem and Azure Web Sites)
    • OctopusDeploy (nuget)
    • Git push (Windows Azure Web Sites)
  4. Deploying Windows Services
    • MsDeploy
    • OctopusDeploy
    • Git push (Windows Azure Web Sites Web Jobs)
  5. Deploying Windows Azure Cloud Services
    • OctopusDeploy
    • PowerShell
  6. How to choose your deployment technology

Designing the pipeline

When designing the pipeline we used at Curtin we wanted the following flow for a standard web application:

  1. The CI server automatically pulls every source code push
  2. When there is a new push the solution is compiled
  3. If the solution compiles then all automated tests (unit or integration – we didn’t have a need to distinguish between them for any of our projects as none of them took more than a few minutes to run) will be run (including code coverage analysis)
  4. If all the tests pass then the web application will be automatically deployed to a development web farm (on-premise)
  5. A button can be clicked to deploy the last successful development deployment to a UAT web farm (either on-premise or in Azure)
  6. A UAT deployment can be marked as tested and a change record number or other relevant note attached to that deployment to indicate approval to go to production
  7. A button can be clicked to deploy the latest UAT deployment that was marked as tested to production; this button is available to a different subset of people from those who can trigger UAT deployments

Two other requirements were that there be a way to visualise the deployment pipeline for a particular set of changes, and that it be possible to revert a production deployment by re-deploying the last successful deployment if something went wrong with the current one. Ideally, each of the deployments should take no more than a minute.

The final product

The following screenshot illustrates the final result for one of the projects I was working on:

Continuous delivery pipeline in TeamCity dashboard

Some things to notice are:

  • The continuous integration step ensures the solution compiles and that any tests run; it also checks code coverage while running the tests – see below for the options I use
  • I use separate build configurations for each logical step in the pipeline so that I can create dependencies between them and use templates (see below for more information)
    • This means you will probably need to buy an enterprise TeamCity license if you are putting more than two or three projects on your CI server (or spin up a new server for every two or three projects!)
  • I prefix each build configuration name with a number that indicates what step it is relative to the other build configurations so they are ordered correctly
  • I postfix each build configuration with an annotation that indicates whether it’s a step that will be automatically triggered or that needs to be manually triggered for it to run (by clicking the corresponding “Run…” button)
    • I wouldn’t have bothered with this if TeamCity had a way to hide Run buttons for various build steps.
    • You will note that the production deployments have some additional instructions, as explained below; this keeps the convention consistent that the postfix between the "[" and "]" is an instruction to the user
    • In retrospect, for consistency I should have made the production deployment say “[Manual; Last pinned prod package]”
  • The production deployment is in a separate project altogether
    • As stated above – one of my requirements that a different set of users were to have access to perform production deployments
    • At this stage TeamCity doesn’t have the ability to give different permissions on a build configuration level – only on a project level, which effectively forced me to have a separate project to support this
    • This was a bit of a pain and complicates things, so if you don’t have that requirement then I’d say keep it all in one project
  • I have split up the package step to be separate from the deployment step
    • In this case I am talking about MSDeploy packages and deployment, but a similar concept might apply for other build and deployment scenarios
    • The main reason for this is for consistency with the production package, which had to be separated from the deployment for the reasons explained below under “Production deployments”
  • In this instance the pipeline also had a NuGet publish step, which is completely optional, but in this case was needed because part of the project (the business layer entities) was shared with a separate project and publishing a NuGet package made it easy to share the latest version of those common classes (see the sketch after this list)
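For illustration only (the package id, version, authors and paths below are hypothetical placeholders rather than the actual project's values), a minimal .nuspec for sharing a business-layer assembly might look something like this:

<?xml version="1.0"?>
<!-- Hypothetical example: id, version, authors and paths are placeholders -->
<package>
  <metadata>
    <id>MyCompany.MyProject.BusinessEntities</id>
    <version>1.0.0</version>
    <authors>The Project Team</authors>
    <description>Shared business layer entities for MyProject.</description>
  </metadata>
  <files>
    <!-- Pack the assembly that the CI build produced -->
    <file src="MyProject.BusinessEntities\bin\Release\MyCompany.MyProject.BusinessEntities.dll" target="lib\net40" />
  </files>
</package>

The NuGet Pack and NuGet Publish steps available from TeamCity 7 can then build and push a package like this as part of the pipeline.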

Convention over Configuration

One of the main concepts I employ to ensure that the TeamCity setup is as maintainable as possible across a large number of projects is convention over configuration. This requires consistency between projects in order to work and, as I have said previously, I think that consistency is really important for all projects to have anyway.

These conventions allowed me to make assumptions in my build configuration templates (explained below) and thus make them generically apply to all projects with little or no effort.

The main conventions I used are:

  • The name of the project in TeamCity is {projectname}
    • This drives most of the other conventions
  • The name of the source code repository is ssh://git@server/{projectname}.git
    • This allowed me to use the same VCS root for all projects
  • The code is in the master branch (we were using Git)
    • As above
  • The solution file is at /{projectname}.sln
    • This allowed me to have the same Continuous Integration build configuration template for all projects
  • The main (usually web) project is at /{projectname}/{projectname}.csproj
    • This allowed me to use the same Web Deploy package build configuration template for all projects
  • The IIS Site name of the web application will be {projectname} for all environments
    • As above
  • The main project assembly name is {basenamespace}.{projectname}
    • In our case {basenamespace} was Curtin
    • This allowed me to automatically include that assembly for code coverage statistics in the shared Continuous Integration build configuration template
  • Any test projects end in .Tests and build a dll ending in .Tests in the bin\Release folder of the test project after compilation
    • This allowed me to automatically find and run all test assemblies in the shared Continuous Integration build configuration template

Where necessary, I provided ways to configure differences from these conventions for exceptional circumstances, but for consistency and simplicity it's obviously best to stick to the conventions wherever possible. For instance, the project name for the production project couldn't be {projectname} because it had to live in a different TeamCity project and project names are unique in TeamCity. This meant I needed a way to specify a different project name while keeping the TeamCity project name as the default. I explain how I did this in the Build Parameters section below.

Build Configuration templates

TeamCity gives you the ability to specify the majority of a build configuration via a shared build configuration template. Multiple build configurations can then inherit from that template, and any changes to the template propagate through to all of the inherited configurations. This is the key way in which I was able to make the TeamCity setup maintainable. The screenshot below shows the build configuration templates that we used.

Build Configuration Templates

Unfortunately, at this stage there is no way to define triggers or dependencies within the templates so some of the configuration needs to be set up each time as explained below in the transition sections.

The configuration steps for each of the templates will be explained in the subsequent posts in this series apart from the Continuous Integration template, which is explained below. One of the things that is shared by the build configuration template is the VCS root so I needed to define a common Git root (as I mentioned above). The configuration steps for that are outlined below.

Build Parameters

One of the truly excellent things about TeamCity build configuration templates is how they handle build parameters.

Build parameters in combination with build configuration templates are really powerful because:

  • You can use build parameters in pretty much any text entry field throughout the configuration, including the VCS root!
    • This is what allows for the convention over configuration approach I explained above (the project name, along with a whole heap of other values, is available as build parameters)
  • You can define build parameters as part of the template that have no value and thus require you to specify a value before a build configuration instance can be run
    • This allows you to create required parameters that must be specified, but don’t have a sensible default
    • When there are parameters that aren’t required and don’t have a sensible default I set their value in the build configuration template to an empty string
  • You can define build parameters as part of the template that have a default value
  • You can overwrite any default value from a template within a build configuration
  • You can delete any overwritten value in a build configuration to get back the default value from the template
  • You can set a build configuration value as being a password, which means that you can’t see what the value is after it’s been entered (instead it will say %secure:teamcity.password.{parametername}%)
  • Whenever a password parameter is referenced from within a build log or similar it will display as ***** i.e. the password is stored securely and is never disclosed
    • This is really handy for automated deployments e.g. where service account credentials need to be specified, but you don’t want developers to know the credentials
  • You can reference another parameter as the value for a parameter
    • This allows you to define a common set of values in the template that can be selected from in the actual build configuration without having to re-specify the actual value. This is really important from a maintainability point of view because things like server names and usernames can easily change
    • When referencing a parameter that is a password it is still obscured when included in logs 🙂
  • You can reference another parameter in the middle of a string or even reference multiple other parameter values within a single parameter
    • This allows you to specify a parameter in the template that references a parameter that won’t be specified until an actual build configuration is created, which in turn can reference a parameter from the template.
    • When the web deploy post in this series is released you will be able to see an example of what I mean.
  • This is how I managed to achieve the flexible project name with a default of the TeamCity project name as mentioned above
    • In the template there is a variable called env.ProjectName that is then used everywhere else; its default value in the build configuration template is %system.teamcity.projectName% (see the sketch after this list)
    • Thus the default is the project name, but you have the flexibility to override that value in specific build configurations
    • Annoyingly, I had to specify this in all of the build configuration templates because there is no way to have a hierarchy of templates at this time
  • There are three types of build parameters: system properties, environment variables and configuration parameters
    • System properties, as well as some environment variables, are defined by TeamCity itself
    • You can specify both configuration parameters and environment variables in the build parameters page
    • I created a convention that configuration parameters would only be used to specify common values in the templates and I would only reference environment variables in the build configuration
    • That way I was able to create a consistency around the fact that only build parameters that were edited within an actual build configuration were environment variables (which in turn may or may not reference a configuration parameter specified in the template)
    • I think this makes it easier and less confusing to consistently edit and create the build configurations
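To make those parameter conventions a little more concrete, here is a rough sketch of how the parameters hang together. This is an illustrative fragment only: it mimics the style of TeamCity's project configuration XML rather than the exact on-disk schema, and the deployment-related parameter names are made up.

<!-- Parameters defined on the build configuration template (illustrative only) -->
<parameters>
  <!-- Convention over configuration: the project name defaults to the TeamCity project name -->
  <param name="env.ProjectName" value="%system.teamcity.projectName%" />
  <!-- Common values live in configuration parameters on the template -->
  <param name="const.UatDeploymentServer" value="uat-web-01.example.com" />
  <param name="const.UatDeploymentUserName" value="DOMAIN\svc-deploy-uat" />
</parameters>

<!-- Parameters edited on an individual build configuration: only environment
     variables are touched here and they simply reference the template's values -->
<parameters>
  <param name="env.DeploymentServer" value="%const.UatDeploymentServer%" />
  <param name="env.DeploymentUserName" value="%const.UatDeploymentUserName%" />
</parameters>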

Snapshot Dependencies

I make extensive use of snapshot dependencies on every build configuration. While not all of the configurations need the source code (since some of them use artifact dependencies instead) using snapshot dependencies ensures that the build chain is respected correctly and also provides a list of pending commits that haven’t been run for that configuration (which is really handy to let you know what state everything is in at a glance).

The downside of using snapshot dependencies, though, is that when you trigger a particular build configuration it will automatically trigger every preceding configuration in the chain as well. That means that if you run, say, the UAT deployment and a new source code push was just made, then that push will be included in the UAT deployment even if you weren't ready to test it. In practice I found this rarely, if ever, happened, but I can imagine that for a large and/or distributed team it could, so watch out for it.

What would be great to combat this would be if TeamCity had an option for snapshot dependencies, similar to the one for artifact dependencies, to use the last successful build without triggering a new build.

Shared VCS root

The configuration for the shared Git root we used is detailed in the below screenshots. We literally used this for every build configuration as it was flexible enough to meet our needs for every project.

Git VCS Root Configuration
Git VCS Root Configuration 2

You will note that the branch name is specified as a build parameter. I used the technique I described above to give this a default value of master, but allow it to be overwritten for specific build configurations where necessary (sometimes we spun up experimental build configurations against branches).

Continuous Integration step configuration

A lot of what I do here is covered by the posts I referenced in the first post of the series apart from using the relevant environment variables as defined in the build configuration parameters. Consequently, I’ve simply included a few basic screenshots below that cover the bulk of it:

Screenshots: Continuous Integration build configuration template (general settings, build steps 1 and 2, build triggering and build parameters)

Some notes:

  • If I want to build a releasable dll e.g. for a NuGet package then I have previously used a build number of %env.MajorVersion%.%env.MinorVersion%.{0} in combination with the assembly patcher and then exposed the dlls as build artifacts (to be consumed by another build step that packages a nuget package using an artifact dependency)
    • Then whenever I want to increment the major or minor version I adjust those values in the build parameters section and the build counter value appropriately
    • With TeamCity 7 you have the ability to include a NuGet Package step, which eliminates the need to do it using artifact dependencies
    • In this case that wasn’t necessary so the build number is a lot simpler and I didn’t necessarily need to include the assembly patcher (because the dlls get rebuilt when the web deployment package is built)
  • I set MvcBuildViews to false because the MSBuild runner that compiles the views runs as x86 when using the "Visual Studio (sln)" runner in TeamCity and we couldn't find an easy way around that, so view compilation fails if you reference 64-bit dlls (see the sketch after this list)
    • We set MvcBuildViews to true when building the deployment package so any view problems do get picked up quickly
  • Be careful using such an inclusive test dll wildcard specification; if you make the mistake of referencing a test project from within another test project then the referenced assembly will be picked up twice (once from its own bin folder and once from the referencing project's bin folder) and all of its tests will be run twice
  • The coverage environment variable allows projects that have more than one assembly that needs code coverage to have those extra dependencies specified
    • If you have a single project then you don’t need to do anything because the default configuration picks up the main assembly (as specified in the conventions section above)
    • Obviously, you need to change “BaseNamespace” to whatever is appropriate for you
    • I’ve left it without a default value so you are prompted to add something to it (to ensure you don’t forget when first setting up the build configuration)
  • The screens that weren’t included were left as their default, apart from Version Control Settings, which had the shared VCS root attached
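To illustrate the MvcBuildViews toggling mentioned above, here is a rough sketch based on the standard MvcBuildViews target that the ASP.NET MVC project templates put in the .csproj (your project file may differ slightly); the idea is to default the property to false and let the packaging build switch it back on:

<!-- In the web project's .csproj: default MvcBuildViews to false unless a value
     is passed in, so the CI build skips view compilation -->
<PropertyGroup>
  <MvcBuildViews Condition="'$(MvcBuildViews)' == ''">false</MvcBuildViews>
</PropertyGroup>

<!-- The standard target from the ASP.NET MVC templates; it only runs when
     MvcBuildViews is true -->
<Target Name="MvcBuildViews" AfterTargets="AfterBuild" Condition="'$(MvcBuildViews)' == 'true'">
  <AspNetCompiler VirtualPath="temp" PhysicalPath="$(WebProjectOutputDir)" />
</Target>

The package build configuration can then pass /p:MvcBuildViews=true to MSBuild so view problems are still picked up when the deployment package is built.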

Build configuration triggering and dependencies

The following triggers and dependencies are set up on the pipeline to set up transitions between configurations. Unfortunately, this is the most tedious part of the pipeline to set up because there isn’t a way to specify the triggers as part of a build configuration template. This means you have to manually set these up every time you create a new pipeline (and remember to set them up correctly).

  • Step 1.1: Continuous Integration
    • VCS Trigger – ensures the pipeline is triggered every time there is a new source code push; I always use the default options and don’t tweak it.
  • Step 2.1: Dev package
    • The package step has a build trigger on the last successful build of Step 1 so that dev deployments automatically happen after a source code push (if the solution built and the tests passed)
    • There is a snapshot dependency on the last successful build of Step 1 as well
  • Step 2.2: Dev deployment
    • In order to link the deployment with the package there is a build trigger on the last successful build of the package
    • There is also a snapshot dependency with the package step
    • They also have an artifact dependency from the same chain so the web deployment package that was generated is copied across for deployment; there will be more details about this in the web deploy post of the series
  • Step 3.1: UAT package
    • There is no trigger for this since UAT deployments are manual
    • There is a snapshot dependency on the last successful dev deployment so triggering a UAT deployment will also trigger a dev deployment if there are new changes that haven’t already been deployed
  • Step 3.2: UAT deploy
    • This step is marked as [Manual] so the user is expected to click the Run button for this build to do a UAT deployment
    • It has a snapshot dependency on the UAT package, so triggering this build will also trigger a package to be built
    • There is an artifact dependency similar to the dev deployment too
    • There is also a trigger on successful build of the UAT package just in case the user decides to click on the Run button of the package step instead of the deployment step; this ensures that these two steps are always in sync and are effectively the same step
  • Step 4.1: Production package
    • See below section on Production deployments
  • Step 5: Production deployment
    • See below section on Production deployments

Production deployments

I didn’t want a production build to accidentally deploy a “just pushed” changeset so in order to have a separation between the production deployment and the other deployments I didn’t use a snapshot dependency on the production deployment step.

This actually has a few disadvantages:

  • It means you can’t see an accurate list of pending changesets waiting for production
    • I do have the VCS root attached so it shows a pending list, which will mostly be accurate, but it is cleared when you make a production deployment, so any changes that weren't deployed at that point won't show up in the pending list of the next deployment
  • It’s the reason I had to split up the package and deployment steps into separate build configurations
    • This in turn added a lot of complexity to the deployment pipeline because of the extra steps as well as the extra dependency and trigger configuration that was required (as detailed above)
  • The production deployment doesn't appear in the build chain explicitly, so it's difficult to see which build numbers a deployment corresponds to and to visualise the full chain

Consequently, if you have good visibility and control over what source control pushes occur it might be feasible to consider using a snapshot dependency for the production deployment and having the understanding that this will push all current changes to all environments at the same time. In our case this was unsuitable, hence the slightly more complex pipeline. If the ability to specify a snapshot dependency without triggering all previous configurations in the chain was present (as mentioned above) this would be the best of both worlds.

Building the production package still needs a snapshot dependency because it requires the source code to run the build. For this reason, I linked the production package to the UAT deployment via a snapshot dependency and a build trigger. This makes some semantic sense because it means that any UAT deployment that you manually trigger then becomes a release candidate.

The last piece of the puzzle is the bit that I really like. One of the options you have when setting up an artifact dependency is to use the last pinned build. Pinning a build is a manual step and it asks you to enter a comment. This is convenient in multiple ways:

  • It allows us to mark a given potential release candidate (e.g. a built production package) as an actual release candidate
  • This means we can actually deploy the next set of changes to UAT and start testing it without being forced to do a production deployment yet
  • This gives the product owner the flexibility to deploy whenever they want
  • It also allows us to make the manual testing that occurs on the UAT environment an explicit step in the pipeline rather than an implicit one
  • Furthermore, it allows us to meet the original requirement specified above that there could be a change record number or any other relevant notes about the production release candidate
  • It also provides a level of auditing and assurance that increases the confidence in the pipeline and the ability to “sell” the pipeline in environments where you deal with traditional enterprise change management
  • It means we can always press the Run button against production deployment confident in the knowledge that the only thing that could ever be deployed is a release candidate that was signed off in the UAT environment

Archived template project

I explained above that the most tedious part of setting up the pipeline is creating the dependencies and triggers between the different steps in the pipeline. There is a technique that I have used to ease the pain of this.

One thing that TeamCity allows you to do is to make a copy of a project. I make use of this in combination with the ability to archive projects to create one or more archived projects that form a “project template” of sorts that strings together a set of build configuration templates including the relevant dependencies and triggers.

At the very least I will have one for a project with the Continuous Integration and Dev package and deployment steps already set up, but you might also have a few more for other common configurations, e.g. a full pipeline for an Azure website or a full pipeline for an on-premise website.

Furthermore, I actually store all the build configuration templates against the base archived project for consistency so I know where to find them and they all appear in one place.

Archived Project Template with Build Configuration Templates

Web server configuration

Another aspect of the convention over configuration approach that increases consistency and maintainability is the configuration of the IIS servers in the different environments. Configuring the IIS site names, website URLs and server setup the same way across environments made everything so much easier.

In the non-production environments we also made use of wildcard domain names to ensure that we didn’t need to generate new DNS records or SSL certificates to get up and running in development or UAT. This meant all we had to do was literally create a new pipeline in TeamCity and we were already up and running in both those environments.

MSBuild import files

Similarly, there are certain settings and targets that were required in the .csproj files of the applications to ensure that the various MSBuild commands being used ran successfully. We didn't want to have to respecify these every time we created a new project, so we created a set of .targets files containing the configurations in a pre-specified location (c:\msbuild_curtin in our case; we would check this folder out from source control so it could easily be updated, but you could also use a shared network location to make it easier to keep up to date everywhere). That way we simply needed to add a single import directive to the .csproj (or .ccproj) that included the relevant .targets file and we were off and running, as sketched below.
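As a rough sketch of that wiring (the file name and contents here are purely illustrative), the import in the project file is a single line and the shared file is just a normal MSBuild project:

<!-- Added to each .csproj / .ccproj, alongside the other <Import> elements -->
<Import Project="c:\msbuild_curtin\WebApplication.targets" Condition="Exists('c:\msbuild_curtin\WebApplication.targets')" />

<!-- c:\msbuild_curtin\WebApplication.targets (hypothetical name): the shared
     properties and targets used by the packaging builds live in here -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <!-- common packaging-related settings go here -->
  </PropertyGroup>
</Project>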

The contents of these files will be outlined in the rest of the posts in this blog series.

Build numbers

One of the things that is slightly annoying by having separate build configurations is that by default they all use different build numbers so it’s difficult to see at a glance what version of the software is deployed to each environment without looking at the build chains view. As it turns out, there are a number of possible ways to copy the build number from the first build configuration to the other configurations. I never got a chance to investigate this and figure out the best approach though.

Hard drive space

One thing to keep in mind is that if you are including the deployment packages as artifacts on your builds (not to mention the build logs themselves!) the amount of hard drive space used by TeamCity can quickly add up. One of the cool things in TeamCity is that if you are logged in as an admin it will actually pop up a warning to tell you when the hard drive space is getting low. Regardless, there are options in the TeamCity admin to set up clean-up rules that will automatically clean up artifacts and build history according to a specification of your choosing.

Production database

One thing that isn't immediately clear when using TeamCity is that by default it ships with a file-based database that isn't recommended for production use. TeamCity can be configured to use any of a number of common database engines though; if you are using TeamCity seriously I recommend you investigate that.

Update 7 September 2012: Rollbacks

I realised that there was one thing I forgot to address in this post: the requirement I mentioned above about being able to roll back a production deployment to a previous one. It's actually quite simple to do – all you need to do is go to your production deployment build configuration, click on the Build Chains tab and inspect the chains to see which deployment was the last successful one. At that point you simply expand the chain, click the button to trigger that previous build as a custom build (which opens the custom build dialog) and then run it.