Posts - Page 9 of 22
Using Pull Requests for commercial/private/proprietary development
Developers who are familiar with open source will likely be aware of pull requests, which were created and popularised by GitHub as a way of providing some automation, visibility and social interaction around merging software changes. It replaces the old school notion of emailing patch files to each other as well as providing more visibility and interaction over pushing to the same branch (aka mainline development).
This is a post I’ve been meaning to write for a while. As a Senior Consultant for Readify I find myself spending a lot of time mentoring teams and as a part of that I’m constantly paying attention to trends I notice and picking up / experimenting with techniques and processes that I can introduce to teams to get positive outcomes (improved efficiency, quality, maintainability, collaboration etc.). In my experiences so far I’ve found that the single most effective change I have introduced to a software team that has improved the quality of software output is to introduce pull requests. It also has a positive effect on other things like improved collaboration and collective code ownership.
I can’t claim to have invented this idea - I got the idea from listening to the experiences that my fellow consultants Graeme Foster and Jake Ginnivan had and there are certainly examples of companies using it, not the least of which is GitHub itself.
What are pull requests?
Essentially, a pull request allows a developer to make some commits, push those commits to a branch that they have access to and create a request to a repository that they don’t necessarily have access (but also could) to have their commit(s) merged in. Depending on the software that you are using, a pull request will generally be represented as a diff, a list of the commits and a dialogue of comments both against the pull request itself and against individual lines of code in the diff.
Why are pull requests useful for open source?
Pull Requests are awesome for open source because they allow random third parties to submit code changes easily and effectively to projects that they don’t usually have commit access to and for project maintainers to have a conversation with the contributor to get the changes in a state where they can be merged.
This lowers the barrier of entry to both maintainers and contributors, thus making it easier for everyone involved :).
Why are pull requests useful for commercial development?
On first thought you might think that commercial development is significantly different from open source since the people submitting code changes are on the same team and there will be existing code review processes in place to deal with code quality.
Based on my experiences and other people I know there are definitely a range of advantages to using pull requests for commercial development (in no particular order):
- If you have enough developers and a big enough project/product then you might actually have a similar setup to an open source project in that the product might have a “core team” that look after the product and maintaining standards and consistency and other developers in the company can then submit pull requests to be reviewed by the core team
- Pull requests give a level playing field to the whole team - it encourages more junior or shy developers to “safely” review and comment on commits by more senior developers since it takes away the intimidation of doing it in person
- It provides a platform for new developers to get up-skilled quicker by providing:
- Easy to digest, focussed examples of how to implement certain features that they can browse easily without having to ask how to find them
- Confidence to raise pull requests without the stress of needing to know if the code is OK
- Line-by-line comments to help them identify what they need to change to get their code “up to scratch”
- This has a positive effect on the new developer themselves as well as the team since the learning is founded in practical application (which in my view is always the most efficient way to learn) and the person potentially needs less time from the team to get up to speed (thus having a smaller “burden”)
- In much the same way as it helps with up-skilling it helps with learning for more junior developers
- It improves the safety of making changes to the codebase by ensuring that changes are looked at by at least one person before being merged into mainline
- It improves the consistency/maintainability and quality of the codebase by ensuring all changes are reviewed by at least one person
- It makes it more likely that changes that would normally pass code review because they were minor enough that it didn’t warrant rejecting a code review / going back and fixing what was already there are more likely to get fixed before being merged in
- There is tooling out there (e.g. TeamCity’s pull request integration) that can integrate with pull requests to provide assurances that the pull request is OK to merge (and potentially automatically merge it for you)
- The ability to have threaded comments on individual lines of code makes it really easy to have a contextual conversation about parts of the code and arrive at better designs because of it
- Despite what code review processes you might have in place I suspect that most of the time it’s more likely code will get reviewed using pull requests because in my experience code reviews don’t always happen regardless of what development processes exist, whereas a pull request would always be reviewed before merging
- Pull requests are pull-based rather than push-based i.e. the reviewer can review the code when they are not in the middle of something so there is less context-switching overhead (note: important/blocking pull requests might be a reason to interrupt someone still, but in my experience those pull requests are not the majority)
- If you are working with developers across multiple timezones then the above point means that code reviews are easier to perform since they are pull-based
- You can use pull requests to raise work-in-progress for early feedback (reducing feedback cycles :)) or for putting up the result of spikes for team comment
- If approached in the right way then it helps teams with collective code ownership because there is a lot of back-and-forth collaboration over the code and everyone in the team is free to review any pull request they are interested in
As you can see - it’s a pretty convincing list of points and this summarises pretty well why pull requests are the most effective change I’ve made to a development team that has affected code quality.
So, nothings perfect; what are the downsides/gotchas?
There are only a few I’ve noticed so far:
- If there is a lot of churn on the codebase from a lot of disparate teams/developers with potentially conflicting, high-priority tasks and there is no “core team” looking after the project then there is a tendency for developers to:
- Work with other developers in their team to quickly merge the pull request without much inspection
- Worse still - developers might just merge their own pull requests or bypass them completely and push directly to the branch
- Acknowledge the review comments, but claim that their work is too high priority to not merge straight away and never get around to making the changes
- If there is someone in the team that is particularly picky then it’s possible that team members might try and avoid that person reviewing their pull requests by asking someone else to review it
- If there are a lot of changes to be made based on review comments it can sometimes get tricky to keep track of which ones have and haven’t been applied and/or the pull request could drag on for a long time while the developer completes the changes - there is a balance here between merging in the potentially important work in the pull request and spending time fixing potentially minor problems
- If pull requests are too big they are harder to review and there will be less review comments as a result and things might get missed (this would be a problem without pull requests, but pull requests simply highlight the problem)
- The diff UI on GitHub and Bitbucket frankly suck compared to local diffing tools. Stash’s pull request diff UI is actually really good though; this does make it a bit more difficult for bigger PRs (but then it’s still easy to pull and review the changes locally so you can get around this limitation)
- Comments on controversial code changes (or sometimes not so controversial changes) can sometimes get a bit out of hand with more comments than lines of code changed in the PR - at this point it’s clear that the back-and-forth commenting has taken the place of what should be an in-person (or if you are remote video) conversation next to a whiteboard
- You end up with a lot of local and remote branches that clutter things up, but there are techniques to deal with it
These behaviours are generally unhealthy and are likely to be a problem if you don’t have pull requests anyway (and in fact, pull requests simply serve to visualise the problems!) and should be addressed as part of the general continuous improvement that your team should be undergoing (e.g. raise the problems in retrospectives or to management if serious enough). Hopefully, you are in a situation with a self-organising team that will generally do the right things and get on board with the change :).
Pull requests and continuous delivery
I’m extremelly passionate about Continuous Delivery and as such I have been in favour of a mainline development approach over feature branches for quite a while. Thus, on first hearing about using pull requests I was initially wary because I mistakenly thought it was a manifestation of feature branches. What I’ve since learnt is it definitely doesn’t have to be, and in my opinion shouldn’t be, about creating feature branches.
Instead, the same mindset that you bring to mainline development should be taken to pull requests and the same techniques to enable merging of partially complete features in a way that keeps the code both constantly production-ready and continuously integrated should be used. The way I see it developers need to have a mindset of looking at a pull request as being “a set of changes I want to merge to master” rather than “a complete feature ready to be integrated”. This helps keep pull requests small, easy to review and ensures all developers are regularly integrating all changes against the mainline.
Introducing pull requests to the deployment pipeline adds another step that increases the confidence in your production deployments (the whole point of a deployment pipeline) as well as helping make sure that code in the mainline is always production-ready since the reviewer can (and should) review the changeset for it’s likely effect on production as well as the usual things you would review.
Side note: I love this git branching model description.
- Keep PRs small (I think I’ve explained why above well enough)
- Keep PRs focussed - a PR should contain one logical set of changes - it makes it easier to review and means if there is a problem with one set of changes that it doesn’t block the other set of changes
- Delete a remote branch after it has been merged (if using Stash/Bitbucket then always tick the “close this branch after PR is merged” box)
- Keep commits clean - having a PR isn’t an excuse to be lazy about your commits; use good commit messages, keep related logical changes together in a single commit, squash them together where relevant, use rebasing rather than merge commits and think about the history after the PR is merged - don’t make it harder for yourself or your team in the future
- Don’t let PRs get stagnant - if you are finished with a task take a look at the PRs and merge a couple before moving on; if everyone does this then there will be a constant flow of merges/integration - if a PR is open for more than a day there is a problem
- Where possible have one PR/branch per-person - you should be able to clean up your commits and force push to your pull request branch and it encourages the PR to be smaller
- Be incredibly careful when doing a force push that you are not pushing to master
- If you can turn off the ability to force push to master (e.g. Bitbucket allows this) then do it - better safe than sorry
- For those situations where you do need to work closely with someone else and feel like you need a PR across multiple people there are two techniques you can use:
- Identify and pair together on at least the blocking “integration” work, put up a PR, get it merged and then work in individual PRs after that
- If necessary then put up a pull request and contribute to the branch of that pull request by individual pull requests (i.e. create a featurex branch with a pull request to master and then create separate featurex-subfeaturea, featurex-subfeatureb branches with pull requests to featurex) - try and avoid this because it increases the size and length of time of the integrated pull request - also, consider rebasing the integrated pull request just before it’s pulled in
- If you plan on submitting a pull request against someone else’s pull request branch then make sure you tell them so they don’t rebase the commits on you (can you see why I try and avoid this? :P)
- Make sure that PR merges are done with actual merge commits (the merge button in all of the PR tools I’ve used do this for you) that reference the PR number - it makes it easier to trace them later and provides some extra information about the commits that you can look up (this is the only thing you should use merge commits for in my opinion)
- If your PR is not urgent then leave it there for someone to merge at their leisure and move on, if it’s urgent/blocking then ask the team if someone can review/merge it straight away (from what I’ve seen this is not the common case - e.g. one or two a day out of 10-15 PRs)
- Tag your PR title with [WIP] to mark a work-in-progress PR that you are putting up for feedback to make sure you are on the right track and to reduce feedback cycles - [WIP] indicates that it’s open for people to review, but NOT to merge
- In those rare circumstances when a PR is ready to be merged, but isn’t production-ready (maybe it’s a code clean-up waiting for a successful production release to occur first to render the code being deleted redundant for instance) then tag the PR title with [DO NOT MERGE]
- Unless you know what you are doing with Git never work off master locally; when starting new work switch to master, pull from origin/master, and create and switch to a new branch to describe your impending PR - it makes things a lot easier
- I almost never do this - I generally just work on master unless editing a previously submitted PR because I find it quicker, but it does require me to constant be very careful about what branches I’m pushing and pulling from!
- Call out the good things in the code when reviewing - reviews shouldn’t devolve into just pointing out things you think are wrong otherwise they develop negative connotations - if the PR creator did a good job on a bit of complex code then give them a pat on the back!
- Similarly, avoid emotive phrases like “ugly”, “gross” etc. when describing code; when I’m reviewing code a technique I use to try and not sound condescending is to ask a question, e.g. “Do you think this should be named XX to make it clearer?” - make the pull request into a two-way conversation; work out the best solution collaboratively as a team
- If you aren’t sure about something don’t be afraid to ask a question; quite often this will highlight something that the author missed or at the very least help you to learn something
- Have fun! Use funny gifs, use funny branch names - just because we are developing serious software doesn’t mean we can’t enjoy our jobs!
I’ve found that the following helps immensely when first getting a team started (but after a while the team is able to sustain itself easily):
- Having at least one person experienced with Git is really helpful as inevitably people will get themselves confused when first using branches and in particular with rebasing
- Have at least one person that knows what they are doing ensure that in the first week all PRs are reviewed and merged quickly - this will alleviate the team from needing to worry about keeping on top of merging PRs and instead will concentrate on the semantics of creating pull requests, responding to review comments and updating PRs. Once they get the hang of it then stop reviewing/merging all, but the most important PRs and the team will start to notice there is an increase in PRs and will hopefully self-organise to get them merged and get into the habit of doing that
- It’s possible this technique might not always work, but it has when I’ve tried it and it assists the team to slowly get the hang of using pull requests rather than dropping a huge change on them all at once
- If the team doesn’t self-organise then bring up the fact PRs aren’t getting merged in retrospectives and suggest a possible solution of reviewing open PRs directly after standup and team members volunteer to review/merge each of the open ones
- Developers will naturally approach pull requests like feature branches even if told up front to treat it the same as what you would originally push to mainline with mainline development because of the natural human desire to get things “right” / finished
- This will lead to enormous PRs that are constantly conflicted and a real pain in the ass for the developer
- Let this happen and when the PR is finally merged sit down with them and ask them how they think it could be avoided in the future
- Hopefully they will come to the realisation the PR was too big themselves and because they have made that realisation based on experience they are more likely to keep PRs small
There are various tools that can help you use pull requests. The ones I have experience with or knowledge of are:
- GitHub Enterprise
- Stash (as mentioned above, this has by far the best diff UI for PRs)
If you are interested in semver and you want to use pull requests (or you don’t and you are using mainline development) then I encourage you to check out the GitHubFlowVersion project by my friend Jake Ginnivan.
Breaking up software projects into small, focussed milestones
This post highlights a particular learning I’ve had over the last year about setting milestones for organising high-level goals for software projects.
When I am first involved in a project there is undoubtedly a huge list of things that the product owner / client wants and it’s generally the case that there isn’t enough time or money for it all to be all completed. Let’s ignore for a moment the fact that there will be reasons why it’s not sensible to complete it all anyway since the low priority things will be much less important than high priority work on other projects or emergent requirements / user feedback for the current project.
Often, the person/people in question don’t really have a good understanding about what is realistic within a given time period (how could they when it’s hard enough for the development team?) and certainly the traditional way that software development has been executed exacerbates the issue by promising that a requirements specification will be delivered by a certain time. Despite the fact this almost never works people still sometimes expect software to be able to be delivered in this way. Thankfully, Agile software development becoming more mainstream means that I see this mentality occurring less and less.
An approach I’ve often taken in the past to combat this is to begin by getting product owners to prioritise high-level requirements with the MoSCoW system and then take the must-have’s and possibly the should-have’s and labelled that as phase 1 and continued to flesh out those requirements into user stories. This then leaves the less important requirements at the bottom of the backlog giving a clear expectation that any estimations and grooming effort will be expended on the most important features and won’t include “everything” that the person initially had in their vision (since that’s unrealistic).
This is a common-sense approach that most people seem to understand and allows for a open and transparent approach to providing a realistic estimate that can’t be misconstrued as being for everything. Note: I also use other techniques that help set clear expectations like user story mapping and inception decks.
These “phase 1” milestones have a number of issues:
- They are arbitrary in make up and length, which results in a lack of focus and makes it easier for “scope creep” to occur
- While scope-creep isn’t necessarily a problem if it’s being driven by prioritisation from a product owner it does tend to increase feedback cycles for user feedback and makes planning harder
- Small variations to direction and scope tend to get hidden since they are comparatively small, however the small changes add up over time and can have a very large impact (positive and negative) on the project direction that isn’t intended
- They tend to still be fairly long (3+ months)
- This increases the size of estimation errors and the number and size of unknowns
- I’ve noticed this also reduces the urgency/pace of everyone involved
A different approach
I’ve since learnt that a much better approach is to create really small, focused milestones that are named according to the goal they are trying to meet e.g. Allow all non-commercial customers who only have product X to use the new system (if you are doing a rewrite) or Let customers use feature Y (new feature to a system).
More specifically, having focused milestones:
- Helps with team morale (everyone can see the end goal within their grasp and can rally around it)
- Helps frame conversations with the product owner around prioritising stories to keep focus and not constantly increasing the backlog size (and by association how far away the end goal is)
- Helps create more of a sense of urgency with everyone (dev team, ops, management etc.)
- Helps encourage more frequent releases and thinking earlier about release strategies to real end users
- Provides a nice check and balance against the sprint goal - is the sprint goal this sprint something that contributes towards our current milestone and in-turn are all the stories in the sprint backlog contributing to the sprint goal?
The end goal (probably not “end”; there is always room for improvement)
I don’t think that the approach I describe above is necessarily the best way of doing things. Instead I think it is a stepping stone for a number of things that are worth striving for:
- Approaching everything in terms of strict minimum viable products and using that to drive decisions based on how real users interact with those minimum viable products
- Driving everything from strategic direction
Presentation: Moving from Technical Agility to Strategic Agility
I recently gave a presentation with my colleague Jess Panni to the ACS WA Conference about Agile and where we see it heading in the next 5-10 years. When offered the speaking slot the requirements were that it involved data analytics in some way and that it wasn’t the same old Agile stuff that everyone has been talking about for years, but rather something a bit different. Both of those requirements suited me because it fit in perfectly with thoughts I’d been having recently about Agile and where it is heading.
Jess and I had a lot of fun preparing the talk - it’s not often we get time to sit down and chat about process, research what the industry leaders are saying and brainstorm our own thoughts and experiences in light of that research. I’m very proud of the content that we’ve managed to assemble and the way we’ve structured it.
We paid particular care to make the slide deck useful for people - there are comprehensive notes on each slide and there are a bunch of relevant references at the end for further reading.
I’ve put the slides up on GitHub if you are interested.
Using NTestDataBuilder with classes that have inheritance
If you are building test data using the NTestDataBuilder library I released and you are dealing with classes that have an inheritance chain then you might want to have a base builder class that abstracts common properties.
I’ve created a gist that explains how to achieve this with two different options at: https://gist.github.com/robdmoore/7671127.
ChameleonForms 1.0 Released
I’m incredibly excited and proud to finally announce the release of 1.0 of the ChameleonForms library I’ve been working on with Matt Davies. My blog has been fairly quiet the last couple of months while I’ve poured time and energy in finally getting ChameleonForms to 1.0.
(Apologies; I’m releasing months of built-up anticipation and excitement here).
I’m biased of course, but I think this library is amazing to use and results in immensely more maintainable form generation code when using ASP.NET MVC. It extends on the knowledge that MVC developers would have in generating forms using the (already pretty awesome) built-in stuff, but adds the things I think are missing. For me, this library epitomises over 7 years of exploration in the best way to do web-based forms and I’m excited to be able to share the beginnings of my current vision via this library.
What is ChameleonForms?
In short, ChameleonForms takes away the pain and repetition of building forms with ASP.NET MVC by following a philosophy of:
- Model-driven defaults (e.g. enum is drop-down,
[DataType(DataType.Password)]is password textbox)
- DRY up your forms - your forms will be quicker to write and easier to maintain and you won’t get stuck writing the same form boilerplate markup form after form after form
- Consistent - consistency of the API and form structure within your forms and consistency across all forms in your site via templating
- Declarative syntax - specify how the form is structured rather than worrying about the boilerplate HTML markup of the form; this has the same beneficial effect as separating HTML markup and CSS
- Beautiful, terse, fluent APIs - it’s a pleasure to read and write the code
- Extensible and flexible core - you can extend or completely change anything you want at any layer of ChameleonForms and you can drop out to plain HTML at any point in your form for those moments where pre-prepared field types and templates just don’t cut it
What are the big improvements in 1.0?
We’ve been releasing pretty often so that depends on what version you are currently using, but these are the most important things:
- Extensive usage across a number of production websites - we are happy that this library is mature, stable and ready for prime-time
- Twitter Bootstrap 3 Template out-of-the-box supported by a NuGet package to get you up and running faster - this is HUGE for a number of reasons:
- Bootstrap is pretty darn popular right now so this is immediately useful to a lot of people
- In creating this template we had to do some pretty sophisticated changes to allow the template to drive a lot of changes unobtrusively to the form structure you are adding in your views - this is great because it means it’s really easy for you to create your own form templates and accomplish similarly complex transformations of your form markup
- The ASP.NET MVC templates that come with Visual Studio 2013 come with Bootstrap by default now - and boy do they have gross repetitive boilerplate in them, which you can clean right up using this library
- The vision that we have for this library is coming to fruition, which is personally gratifying - this is a beautiful demonstration of being able to declaratively specify the structure of your form and then completely change the markup/template of your form across a whole application with a single line of code when it changes
- Really comprehensive documentation of everything in the library - we’ve spent many hours writing up the documentation - the idea was to make it comprehensive, but accessible/terse; hopefully we’ve met that goal!
- Really solid code coverage to help prevent regressions or breaking changes as well as some refactorings that give us a solid codebase to continue with the other features we want to add - hopefully this can support us into the future with minimal breaking changes
How can I get it?
Checkout the GitHub release or go to NuGet.
From this point on we are following semver thanks to the GitHubFlowVersion project. The fourth number in the NuGet version number is actually build metadata.
Borderlands 2 here I come!
Over a year ago now (yes it’s been a long journey - our first NuGet package was published on November 1, 2012) Matt Davies and I made an agreement to each other that neither of us would play the recently released Borderlands 2 game (we were both huge fans of Borderlands so this was a big deal) until we released 1.0 of ChameleonForms so that we would remain focussed on it and not get distracted. Now, while we both didn’t realise that it would take this long and while the last couple of months have seemed like forever (I’m pretty sure we had a phone conversation at least once a week where one of us would say “dude, we are sooooo close to 1.0 and BORDERLANDS 2!”) we are both incredibly proud of the library and are happy with what we’ve managed to get into 1.0.
Needless to say, we will probably be taking a break from open source for a few weekends to play Borderlands 2 :)
We hope you enjoy using the library!
As usual hit us up with issues and pull requests on GitHub - they make our day :)