Getting more structured with my DVCS tests

Ok, so I’ve posted my initial feelings about tinkering with Mercurial and Git, and that seems to have generated some interest. It’s time to get a bit more formal about how I’m going to evaluate them against each other, to decide which one I like to use most in real, practical scenarios. So, I decided to come up with a list of use cases for the things that I typically have to deal with when managing the repositories for a software project (open source and otherwise), so that I can methodically test them out and see how I feel about each system. I’ve deliberately tried to include use cases for things that you can’t really do on a centralised system, but that I’d want to make use of, as well as the usual nonsense that happens day-to-day on a typical project ;)

I’ve presented my work-in-progress list below; feel free to suggest more use cases I can test. I really want to see what these things are like in practice in the kinds of situations I encounter in real projects, without actually risking a real project in the process! Like a backup / restore strategy, it’s no good doing it for the first time when the sh*t has already hit the fan.

Oh, and for fun, allegedly Git is MacGyver and Mercurial is James Bond. :D


This list will be periodically updated as I think of new things, and as other things are suggested by commenters.

Due to the nature of a DVCS, all these use cases must be tested both in isolation and after pushing those changes to another, potentially ‘superset’, repository.

  1. General
    1. Commit a few changes to local repository ‘A’ but don’t push to central yet. Push a number of different changes to the central repository from a different repository ‘B’. Then pull B’s changes from the central repository into local repository ‘A’ to bring it up to date, again without pushing A’s outstanding changes yet. This is equivalent to doing an ‘svn update’ while you have local uncommitted changes in Subversion, but while using the local commit features of the DVCS. How does this work in practice? [Suggested by Arseny Kapoulkine]
  2. Branching
    1. Create a new public, official branch (stable branch)
    2. Create a new long-term feature branch which is intended for public consumption / collaboration
    3. Create lots of short-term development branches (or equivalent structures) intended for local consumption only
      1. What are the size overheads? (Git claims superiority here)
      2. Does this excessively clutter the public repository when pushed?
      3. Is there a better way of handling multiple local changesets which you may or may not decide to push individually, such as testing many patches (Mercurial queues seem very interesting, Git equivalent?)
      4. Rename a branch? (optional)
  3. Merging
    1. Merge changes unidirectionally from one branch to another, without having to manually pick revisions. Make more changes in the source branch and repeat.
    2. Bidirectional merge – a feature branch which is not yet ready to be merged into the trunk wants to resynchronise from the trunk and continue branched development. Merge of this branch into the trunk must be tested later after further changes
  4. Tagging
    1. Create a tag against a specific branch; probably at the HEAD but look for the option to specify a revision
    2. Correct / modify / move a tag following a mistake or last-minute revision (pre-release) without having to make duplicate commits or other such spurious activity
  5. Firefighting
    1. Screw-up: developer commits change to trunk instead of stable branch. Merge / move it to the stable instead – change can be left in the trunk or can be removed for re-merging, so long as the procedure is clear.
    2. Screw-up: developer commits an interface-breaking change to the stable branch. It must be removed from the stable branch and moved to the trunk.
    3. Screw-up: Revert a single change from the repository, that is not at the HEAD
  6. External patch submission tests:
    1. Patch file from same branch, no conflicts
    2. Patch file from same branch, with conflicts
    3. Patch file generated on a different branch to the one we want to apply it to (include conflicts)
    4. Pull from third-party repository, entire branch
    5. Pull from third-party repository, specific changes
    6. Patch file generated from non-repository source copy
  7. Backup multiple work-in-progress changes on a local machine that are not ready for public consumption; approaches:
    1. Store a patch per local branch (this is how it’s done with SVN, but too much hassle if you use lightweight local branching, DVCSs can do better)
    2. Push to a backup repository on another machine across existing protocols – ssh, https, Samba share (Git can’t do the latter?)
    3. Push to a backup repository on a USB stick (Git can’t do this?)
  8. Binary files
    1. Revise a binary file over a few versions, test storage efficiency
    2. Binary file conflict resolution
  9. Conversion from Subversion
    1. Import retaining history
    2. Import multiple branches
    3. Import tags
  10. Integration
    1. Mailing list / RSS notifications of commits on official repository
    2. Bugtrackers
    3. CIA.vc et al?
    4. Good free & open source GUI clients for all platforms
    5. Line ending conversions between platforms
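
To make use case 1.1 concrete, here is a rough sketch of how the Git side of that test might be scripted. It assumes Git is installed; the repository names (`seed`, `central.git`, `A`, `B`), the branch name `trunk` and the file names are all invented for the example, and fetch-then-merge is only one of several ways to bring A up to date.

```shell
#!/bin/sh
# Rehearsal of use case 1.1 with Git. Every name below (seed, central.git,
# A, B, trunk, the file names) is made up for this sketch.
set -eu

work=$(mktemp -d)   # throwaway sandbox, nothing real is touched
cd "$work"

# Commit identity for the sandbox, so no global Git config is needed.
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=tester GIT_COMMITTER_EMAIL=tester@example.com

# Build a 'central' repository holding one commit on a branch called trunk.
git init -q seed
(
  cd seed
  git symbolic-ref HEAD refs/heads/trunk
  echo base > file.txt
  git add file.txt
  git commit -q -m "base"
)
git clone -q --bare seed central.git
git clone -q central.git A
git clone -q central.git B

# A commits locally but does not push.
(
  cd A
  echo "from A" > a.txt
  git add a.txt
  git commit -q -m "A: local work, not pushed"
)

# B commits and pushes to central.
(
  cd B
  echo "from B" > b.txt
  git add b.txt
  git commit -q -m "B: pushed work"
  git push -q origin trunk
)

# A brings in B's work without pushing its own.
(
  cd A
  git fetch -q origin
  git merge -q --no-edit origin/trunk
  # Commits A has that central does not: A's own commit plus the merge commit.
  git rev-list --count origin/trunk..HEAD > "$work/unpushed.txt"
)

echo "unpushed commits in A: $(cat "$work/unpushed.txt")"
```

After the merge, A's working copy should contain both `a.txt` and `b.txt`, while central still knows nothing about A's commit. A Mercurial rehearsal would presumably follow the same shape using `hg clone`, `hg commit`, `hg push`, `hg pull` and `hg merge`.
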
  • http://zeuxcg.blogspot.com/ Arseny Kapoulkine

    When I played with Mercurial, I encountered one issue (perhaps it’s in my workflow, perhaps it’s not – I don’t know, and I also don’t know what the behavior is in Git). There is a file I work on, and I have some non-pushed changes in my local repo, as well as some modifications in the working copy (both are crucial). Now the same file gets modified by another developer and pushed to the central repo, and I want that fix so I try to pull it. Mercurial can’t merge anything with outstanding changes in the working copy, so I have to either commit my changes (they’re full of hacks/temporary stuff so I don’t want to!) or remember them somewhere, revert, pull, reapply (this is bug-prone). Last time I checked (a month ago) the situation was the same. I don’t get the same problem with SVN/p4 (of course I don’t get the luxury of lots of small local commits, but still). It would be great if you included this scenario in your tests; at least for me it’s a common one.

  • http://www.stevestreeting.com Steve

    Thanks, that’s a good one, I can see myself encountering that kind of situation too, as you say if you use the local commits to their full this situation is almost guaranteed to happen (equivalent to wanting to “svn update” while you still have uncommitted changes, except that you’re using local commits to track detail). I’m surprised it’s an issue, I’ll add it to my list.

  • http://sliceofmuffin.com blankthemuffin

    I think that has a few solutions. One is to use the “stash” feature of Git (and Mercurial, with a plugin?).

    http://www.kernel.org/pub/software/scm/git/docs/git-stash.html aragh man pages. :D

  • http://www.stevestreeting.com Steve

    Mercurial’s equivalent is ‘Shelve’ which sounds like the same thing. If that’s the recommended way to manage this case, then they’ll be pretty much the same. Seems odd that this process is more awkward than SVN though, which can just merge new changes in-place. It’s a very common thing too when you have many developers working at once. Strange.

  • http://zeuxcg.blogspot.com/ Arseny Kapoulkine

    re: shelve – that’s cool, back then there was no such extension :) this is the easiest solution in absence of native support.

    As for the reasons, I may be completely off, but it seems that these DVCS are architected in a way to only enable one type of merging – between branches. I’ve never written a DVCS so I don’t know what’s the impact of supporting both branch-branch and branch-wcopy.

  • WhiteKnight

    I eagerly await the result of these tests. I’ve read about DVCS, but have yet to actually use one.

  • http://www.stevestreeting.com Steve

    It’ll take a little time, since I’m experimenting outside of my ‘normal’ work (commercial and Ogre), but I will certainly release my findings.

  • CABAListic

    For what it’s worth, I believe that bazaar is not suffering from the mentioned issue, if I understood it correctly. At least when I used it I could pull changes from the central repository in and it would either merge with local changes on the fly or give me both versions of modified files and ask me to resolve the conflict manually. I was, however, using bazaar more like svn, so I’m just assuming that it retains this ability in a DVCS environment.

    I have no experience with either git or Mercurial, though, so I don’t know how it compares to them otherwise.

  • TheMuffin

    There’s a plugin to hg to work on git repositories losslessly.

    http://hg-git.github.com/

    While the other way around, there doesn’t seem to be any tools that can match hg-git. If you choose git, hg users can use the same repository.

  • jacmoe

    Allow me to share a bit of wxBlog with you:

    While Mercurial is as easy to use as it could be and has great documentation, Git is almost perversely complicated. It has concepts which are particular to it only (can anyone really explain what purpose does the index existence serve except for confusing new users and occasionally tripping more experienced ones?). Its included documentation is only useful if you already know very well what you are doing. It allows (I think it encourages, really) you to make errors — which is, of course, fine, as there are 3 or 4 different ways to undo them. Of which 2 (different ones, depending on situation) make things even worse. It seems to enjoy reusing commands commonly used in other VCS to do something different. Even the commands which seem to do what you’d expect (e.g. pull and push) do not. Moreover, they are not really even opposites of each other. So you never know what a command with a simple name does and you never risk finding any other commands without reading half a dozen of git tutorials. And even then you have to remember that the equivalent of hg histedit is git rebase -i (with rebase in general doing something completely different, of course). And using git means having one extra letter to type for every command compared to hg!

    Read the rest of the blog entry here:
    http://is.gd/2fO9m
    Entertaining. :)

  • Carsten

    Hi Steve,

    I am sure you know both of the Google Tech Talks about Git:

    Linus Torvalds speaks about Git
    http://www.youtube.com/watch?v=4XpnKHJAok8

    Randal Schwartz speaks about Git
    http://www.youtube.com/watch?v=8dhZ9BXQgc4

    Creative commons license book
    http://progit.org/book/

    I think Git fits all your given points. It is incredibly powerful. I was using Subversion for years, but Git has replaced it completely now.

    Quotes
    =======
    Subversion is the most pointless project ever started — Linus Torvalds

  • http://www.stevestreeting.com Steve

    @Carsten: honestly, Linus’ comments in this area are very much a turn-off for me. I think his attacks on Subversion are hugely arrogant and unprofessional, not to mention completely inaccurate. SVN is a great system, hugely useful for massive numbers of people – it’s just not the one Linus wants so he considers it ‘worthless’. That’s typical egotistical, narrow-minded geek behaviour and doesn’t give me any confidence at all in a system he might design being right for me, if that’s his mindset. To be honest, I’m trying my hardest to *ignore* his attitude when reviewing Git, because to me it’s a negative, not a positive.

  • jacmoe

    I am also very put off by the elitist Git attitude, especially concerning Windows and UI interfaces in general.
    And the fact that they give commands used in other versioning systems a totally new meaning puts me off as well.
    I’ll probably look to Git again when that die-hard clique has died out. Until then, Mercurial it is.
    TortoiseHg is a really nice set of tools, working on Windows, the Mac (soon) and Gnome (Linux).
    If you know SVN, you know a lot of Mercurial already. They don’t try to be clever. :)

  • http://pop-3d.com tuan kuranes

    Nice ideas here on how to use svn/trac: UltimateQualityDevelopmentSystem
    http://divmod.org/trac/wiki/UltimateQualityDevelopmentSystem

    Maybe Ogre Team could switch to a read-only server project with some Trac gui, ticket list, timeline, mercurial, etc… with public update/mirrors (mercurial to SVN updates are easy).

    That could open up the possibility of stepping up to “continuous integration server” capabilities (tests, reports, etc.)…

    Now that cmake is there, it should be easier. (ie: cdash is cmake based, but others have cmake abilities)

    Many tools can now be added to cmake easily (cmake scripts for unit test report, clang static analysis, gcc -Weffc++, memleak runs, valgrind, splint, PCLint, CPPCheck, Duplo, SourceMonitor, VC /analyze, vmware/virtualbox runs, etc. )
