Most people who have used a source code management (SCM) system in any serious way will have found the need to create multiple branches of their repository at some point. Some people avoid using branches because they don’t understand them, don’t feel they need them, or because they’re a little afraid of the complexity they might bring – however, branch management is something that all serious developers should be comfortable with. I was reviewing my Subversion branch procedures recently (more on why another time) and since I’ve talked to other developers before who find this sort of stuff daunting, I felt I could probably share some insight on the subject, and tools that I’ve found useful.
The most common reasons for creating a branch of your repository are:
- Creating a ‘stable’ branch which is used to maintain a particular major version of your product, and to construct future bugfix releases from – the point being to keep it insulated from new feature development which might introduce instability and incompatible / disruptive changes
- Creating a ‘development’ branch which is used to isolate experimental or disruptive work from the rest of ongoing development, while still being accessible to multiple people and having a full history
Separating work out like this is a very powerful way of managing major development tracks. Anyone who has deployed a product, even internally, should see the benefit of ‘stable’ branches for maintenance purposes. There’s nothing worse than having an urgent bugfix that needs doing in production, but not being able to deploy a new build from the repository without either bringing other changes with it, or manually hacking together a special patched build to handle the fix. ‘Development’ branches mostly kick in only when your development team gets larger. A few developers may want to work together on a major change which will take a while to finish and is likely to be unstable in the meantime; they need to collaborate via version control without disrupting the other developers on the team who are working on other things. They can work together in their own branch until the work is ready to be merged back into the main trunk.
High-level approach to merging
There are a few ways to manage merging changes between branches; for example if you have a release branch, you might handle synchronisation one of two ways:
- Applying bugfixes to the trunk, then periodically merging them into the release branch
- Applying bugfixes to the release branch (e.g. v1-0), and then merging them into the trunk
Personally, I’ve always preferred the latter, because I think it’s most important to turn around bugfixes quickly against the stable code branch, so it makes sense to make the changes there; you can get your testing done and a patch release out ASAP. The merging process into the trunk, and any associated conflict resolution and re-testing, can happen less urgently, which is acceptable because those on the trunk are likely to be more tolerant of transient issues (developers or early adopters). If you do it the other way around, the trunk gets the bugfix first, and the stable version has to wait for a merge & re-test which is slower and just feels all wrong to me. Plus, the trunk will have a load of changes in it that you don’t want in a stable release branch, so merging from the trunk into a stable branch requires cherry picking just the commits which are related to bugfixes, which is an incredibly manual process – and what if a developer combined a bugfix and new feature in the same commit? No, IMO having the discipline to commit bugfixes alone to the release branches, and merging them wholesale into the trunk later is the way to go – if you merge from a stable branch into the trunk you’re going to want to merge everything in almost all cases, so it’s a lot easier and less error-prone.
Development branches can be a little different, since although you’ll want to merge the work back into the trunk eventually, you may also want to synchronise changes from the trunk into the development branch on occasion too, depending on how long the development branch runs for. This introduces an interesting ‘bidirectional merge’ problem which is especially difficult to deal with manually. In this article I’m going to concentrate mainly on ‘stable’ branch merging procedures, both because they’re simpler to explain and because they’re the more common case.
Branch Merging in Detail
Simply put, you want to take a bunch of changes from one branch, and put them into another branch (or the trunk, which we’ll just treat as a special type of branch). The main issue is knowing which changes you want to merge! In busy systems, there can be a lot of commits going on so it’s not something you can just remember or entrust to a Post-It note. Some systems like Perforce (and the next major version of Subversion, currently still in beta and therefore not fit for production use yet) have features to track branch merges automatically. This is a very good thing, but let’s for a second assume you’re like most people right now, and you’re using the current Subversion stable, 1.4.x.
Let’s assume you’ve made a branch off the trunk (using ‘svn copy’) for a public release of your product, which you would typically do at the point you enforce a ‘feature freeze’. Let’s say that since then work has happened on both branches – bugfixes in preparation for your release candidate, and new development in the trunk. Your repository might look something like the diagram below.
As you can see, the ‘feature freeze’ came just after revision 5 was committed, splitting off to the ‘1.0’ branch which is being prepared for release. Since then, bugfix commits have happened to this branch resulting in revisions 6, 7, 9 and 12 – it’s of course important to educate developers to commit bugfixes to these stable branches and not the trunk (there are ways to handle it if they slip up, but I’ll leave that for now). In parallel, developers have continued to evolve the trunk through revisions 8, 10, 11 and 13. I’ve deliberately interleaved the revision numbers between the trunk and the branch, because in practice that’s what happens as development goes on simultaneously in both branches and the trunk.
So, let’s say we’re ready to perform a merge – that is, we want to take the bugfixes which have been committed to the 1.0 branch and apply them to the trunk too. That means applying the changes contained in revisions 6, 7, 9 and 12 to the trunk. With Subversion, you will do what’s known as a ‘pull merge’, which means obtaining a working copy of the current trunk, ‘pulling’ the revisions that you need from the 1.0 branch into your working copy, testing the merged result and then committing it to your trunk. I’ll talk about how you can do that later, but let’s for a second assume you’ve done that, and see what it might look like a little while after that first merge, after some more development, when you come to do the second merge, because this is where you start to see some more complexity (see the next diagram).
So, here you can see that we’ve merged revisions 6, 7, 9 and 12 into the trunk as we planned, which became revision 14. Since then we’ve done one more (non-bugfix) commit on the trunk (rev 16), and 2 more bugfixes on the 1.0 branch. So to merge again, I need to specifically merge revisions 15 and 17 onto the trunk and avoid applying the changes in revisions 6, 7, 9 and 12 again.
As you can see, keeping track of this stuff manually is a pain. If you’re not careful, you could accidentally merge a revision twice (which might result in a conflict or duplicated additions), or worse, miss a change entirely. And this is the simple form of merging, just a unidirectional merge from a stable branch into the trunk; as you can imagine, bidirectional merges to/from a development branch raise the stakes still higher. Sure, you could record the revisions you’ve merged in commit messages and review those for next time you merge, but that’s far too manual. What’s really needed is a structured, mostly automated approach to dealing with the merging process so that it’s simple and repeatable. Not only will this avoid problems in the repository, a quick and painless approach will encourage you to merge more often and thus keep branches in sync regularly.
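To make the bookkeeping concrete, here is a minimal Python sketch of the two merges described above. It is only an illustration of the record-keeping, not how any real tool is implemented; the function name `pending_revisions` is hypothetical, and the revision numbers are the ones from the example.

```python
def pending_revisions(branch_revisions, already_merged):
    """Return the branch revisions that have not been merged yet, oldest first."""
    merged = set(already_merged)
    return sorted(rev for rev in set(branch_revisions) if rev not in merged)


# Bugfix commits on the 1.0 branch at the time of the first merge.
first_merge = pending_revisions([6, 7, 9, 12], already_merged=[])
print(first_merge)   # [6, 7, 9, 12]

# Two more bugfixes (15 and 17) have landed on the branch since then.
second_merge = pending_revisions([6, 7, 9, 12, 15, 17], already_merged=first_merge)
print(second_merge)  # [15, 17]
```

The whole problem is keeping the `already_merged` record accurate between merges; forget an entry and a revision gets applied twice, add a wrong one and a bugfix is silently skipped.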
Svnmerge to the rescue
Enter a wonderful little Python script called Svnmerge. Svnmerge is designed to keep a record of which revisions have been merged between branches, and thus to provide you with an easy way to pick up just the changes since the last time you merged. It can also generate commit messages for you that describe the merge in detail, and give you a ‘dry run’ report with a changelog of the changes you haven’t merged yet. In short, it’s super-awesome.
If you grab any of the full packages of Subversion, svnmerge is actually already included. However, if you use TortoiseSVN, or even if you download the basic command-line binaries package of Subversion, you won’t have it yet. Personally to get set up on Windows, I just download the command-line tool setup package for Subversion (you need the command-line svn tool), then download the binary executable version of Svnmerge which doesn’t require Python, which I place in the Subversion ‘bin’ folder alongside svn.exe so it’s on the path too. On other systems you’ll need Python installed to run the script directly.
Once you’ve got it installed, you need to initialise each branch you want to merge into. So for the above example, since I’m merging into the trunk, I’d open a command prompt, go to the root of my trunk working copy (this must have no outstanding uncommitted changes; svnmerge will check and stop if it does), and do this:
svnmerge init https://server/pathtorepos/myproject/branches/1.0
This is on the assumption that you haven’t performed any merges between the 1.0 branch and the trunk yet. If you’ve been merging manually in the past, the next merge shouldn’t pull in every change made since the branch was taken, because some of those changes are already in the trunk. In that case you need to use an extra parameter:
svnmerge init https://server/pathtorepos/myproject/branches/1.0 -r1-NNNN
Where ‘NNNN’ is the revision on the source branch (branches/1.0 in this case) that was last manually merged into the trunk.
You’re not quite done yet; svnmerge never alters your repository directly, so you now have to commit the ‘initialisation’ yourself. It generates a commit message explaining the change for you, so to finish, you just need to do this:
svn commit -F svnmerge-commit-message.txt
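If you set up several projects this way, the two steps are easy to script. The following is a rough Python sketch that only builds the command lines shown above, without running them; the helper name `init_commands` and its parameters are hypothetical, and the URL is the placeholder used in this article.

```python
def init_commands(source_url, last_merged_rev=None):
    """Build the 'svnmerge init' and 'svn commit' command lines as argument lists."""
    init = ["svnmerge", "init", source_url]
    if last_merged_rev is not None:
        # Mirrors the '-r1-NNNN' form used when earlier merges were done by hand.
        init.append(f"-r1-{last_merged_rev}")
    commit = ["svn", "commit", "-F", "svnmerge-commit-message.txt"]
    return [init, commit]


for command in init_commands("https://server/pathtorepos/myproject/branches/1.0"):
    print(" ".join(command))
```

Each list could be handed to `subprocess.run(command, check=True)` from the root of the trunk working copy; the sketch stops at building the commands so that nothing touches a repository by accident.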
So what has this actually done? Well, svnmerge uses Subversion’s ability to associate version-tracked properties with any file or directory in the repository – it creates a property on the root folder recording the status of merges from any other branch. If you look at the properties on the root folder after doing the init just before the first merge above, you’d see something like this:
svnmerge-integrated = /branches/1.0:1-5
That indicates that svnmerge will be processing revision 6 onwards from the 1.0 branch the next time you ask it to merge. You can have multiple entries in there if there are multiple branches you might be merging from. Note that the property is present on the branch that is to receive the merge – there is no property on the branches you pull the merges from, unless they also pull merges from other places too. You can view what’s available for merging on the trunk by doing this in your trunk root working folder:
svnmerge avail
.. which would report all the revision numbers that are available for merging. If you want to see what those revisions actually did, add the ‘--log’ option and you’ll get the detail of what files were involved and what the commit message was in each case. Pretty damn useful. If you then want to merge, you just do:
svnmerge merge
.. and your working copy will pull in all those changes. Note that for both ‘avail’ and ‘merge’, if you do have multiple source branches registered you can specify which one you want to merge from with the ‘--source’ option. Once you’ve tested the result, commit it (again, use the generated svnmerge-commit-message.txt, which will contain all the details of what you’re merging). The commit includes the updated svnmerge-integrated property along with the files, recording where you’ve merged up to and ensuring that next time you will only merge new changes since the last merge.
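To show how a value like ‘/branches/1.0:1-5’ can be read, here is a small Python sketch that parses it. This is not svnmerge’s own code: the function names are hypothetical, and it assumes entries are separated by whitespace, with each entry written as a path, a colon, and a comma-separated list of revisions or ranges.

```python
def parse_revision_list(text):
    """Expand a string such as '1-5,7' into a sorted list of revision numbers."""
    revisions = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            revisions.update(range(int(start), int(end) + 1))
        else:
            revisions.add(int(part))
    return sorted(revisions)


def parse_integrated(value):
    """Map each source branch path to the revisions recorded as merged."""
    merged = {}
    for entry in value.split():
        # Split on the last colon so the path itself is kept intact.
        path, _, revision_text = entry.rpartition(":")
        merged[path] = parse_revision_list(revision_text)
    return merged


merged = parse_integrated("/branches/1.0:1-5")
next_revision = max(merged["/branches/1.0"]) + 1
print(next_revision)  # 6
```

With the property from the example, everything up to revision 5 counts as already merged, so the next merge starts looking at revision 6.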
Svnmerge is even able to handle the more complex cases: bidirectional merges, and the case where you cherry-pick revisions to merge rather than taking all of them. In all cases it stores the revisions that you merged (if you cherry-pick, this may be a list of specific revisions instead of a full range), and it is able to take into account svnmerge-integrated properties on other branches to avoid ‘bouncing’ a merge between branches that are merging to/from each other (only really the case in long-running development branches, not stable release branches).
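As an illustration of why a cherry-picked record becomes a list rather than a single range, here is a minimal Python sketch that collapses a set of merged revisions into a compact string. The function name is hypothetical and the ‘1-5,7,12’ notation is an assumption made for the example, not a statement of exactly what svnmerge writes.

```python
def format_revision_list(revisions):
    """Collapse revision numbers into a compact string such as '1-5,7,12'."""
    ordered = sorted(set(revisions))
    parts = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current range while the next revision is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        parts.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(parts)


recorded_at_init = list(range(1, 6))  # revisions 1 to 5, as in the example above
cherry_picked = [7, 12]               # only two of the branch bugfixes were taken
print(format_revision_list(recorded_at_init + cherry_picked))  # 1-5,7,12
```

Revisions 6 and 9 are missing from the result, so they would still show up as available the next time you ask what can be merged.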
Branches are an incredibly important tool in managing any significant software project. While they can be confusing at first, tools like Svnmerge significantly reduce the effort that needs to go into managing multiple branches. Subversion 1.5’s merge tracking will likely make this even easier, particularly if GUI tools like TortoiseSVN and Eclipse expose the functionality in friendly ways. However, there’s absolutely no reason to wait for that to happen: Svnmerge will give you all the power you really need right now on the stable Subversion.