So, I’ve just about completed my practical experiments & review of Mercurial and Git.
In the end, I had far too many separate notes and sets of experiences to post, so I boiled the argument down into the 10 most important factors to me, and scored Mercurial and Git on a scale of 1-5 based on what I’d found when using them. Here are the (annoying) results:
| # | Criterion | Git | Hg |
|---|---|---|---|
| 1 | Ease of use – command line | 4 | 5 |
| 2 | Ease of use – GUI | 4 | 4 |
| 3 | Platform support – core | 3 | 5 |
| 4 | Platform support – GUI | 4 | 4 |
| 5 | Web Host Functionality | 5 | 4 |
| 6 | Reliability & error handling | 3 | 5 |
| 7 | Storage efficiency | 5 | 3 |
| 8 | Run-time performance | 5 | 5 |
| 9 | Flexibility | 5 | 4 |
| 10 | OGRE Community support | 5 | 4 |
| | Totals | 43 | 43 |
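For anyone who wants to check my arithmetic, here is a minimal Python sketch that sums the two columns. The scores are copied from the table; the names `SCORES` and `totals` are just labels I picked for illustration.

```python
# (criterion, git_score, hg_score), copied from the table above
SCORES = [
    ("Ease of use - command line", 4, 5),
    ("Ease of use - GUI", 4, 4),
    ("Platform support - core", 3, 5),
    ("Platform support - GUI", 4, 4),
    ("Web Host Functionality", 5, 4),
    ("Reliability & error handling", 3, 5),
    ("Storage efficiency", 5, 3),
    ("Run-time performance", 5, 5),
    ("Flexibility", 5, 4),
    ("OGRE Community support", 5, 4),
]


def totals(scores):
    """Return (git_total, hg_total) for a list of (name, git, hg) rows."""
    git_total = sum(git for _, git, _ in scores)
    hg_total = sum(hg for _, _, hg in scores)
    return git_total, hg_total


git_total, hg_total = totals(SCORES)
print(f"Git: {git_total}, Hg: {hg_total}")  # Git: 43, Hg: 43
```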
I’ll explain the scores, and my conclusion, after the jump.
1. Ease of use – command line
This criterion boils down to how easy it is to learn to do all the operations required of a typical developer, many of which I’ve listed in a separate thread, how consistent and intuitive those commands are, how natural the defaults are, and how easy it is to screw something up by accident. This is by nature a subjective measure. While Git improved greatly in my perception during the course of the evaluation, there is absolutely no doubt that the command line of Git takes more time to learn to operate effectively than Mercurial’s does. I deliberately learned Git first to avoid being biased by a command set that was closer to what I was used to, and grew to be happy with most of the Git commands, but it took me longer and I had to refer to the help and Google for things more often. The included help in both is good, but Git has a lot more options (hence the win on flexibility later) and lots more discussion on edge cases and how the physical repository implementation is affected, which may be interesting technically but can cloud the issue when you’re trying to learn how to do something. Unusual terminology also confuses people from other systems – “git reset --hard HEAD” is simply not as intuitive as “hg revert --all” for the migrators. I felt comfortable with Mercurial’s command line faster than I did with Git’s, despite learning Git first. I did, however, feel that I could use Git fairly happily after going through the learning process, and my attitude towards it moderated after some experience, which is why there is only a single point difference between them. If you’re at the start of the learning process, Git is probably a 3 rather than a 4.
2. Ease of use – GUI
Linux users typically care less about this than those on Windows and Mac, and since I rarely use Linux on the desktop I’ve concentrated on Windows and Mac here. I tried TortoiseGit, GitX, TortoiseHg and Murky (SmartGit is another but it wasn’t considered stable when I was looking). In practice I found both systems to have shortcomings; on Mac the GUIs are still quite limited in functionality, and on Windows there are some things that work better in TortoiseGit (it feels like other Tortoises so initial impressions are superior to TortoiseHg which is a different layout), but TGit also has some bugs (such as not listing all branches in Switch if there’s more than about 20) and relies on msysGit which is not ideal especially when you need to abort an action and it leaves a git.exe lying around locking files. In practice they’re all about equal in their imperfections, no clear winner. I’m sure they’ll both improve over time but both are usable now, with users needing to drop to the command line for more complex tasks (see point 1).
3. Platform support – core
This is about whether the core toolset runs well across at least Windows, Linux and Mac OS X. This covers not only the ability to run on each of the platforms consistently, but also to exchange data between the platforms without issues and deal with line ending differences etc. Git came off worst here, because it is inherently designed to require a Unix-like environment, built as it is on common Unix tools like the shell and Perl. On Windows it requires msysGit, which in most cases works (I tested with 1.6.4 and 1.6.5) but is significantly slower than under Linux – something you only notice for large operations, but it’s up to a factor of about 6 in my tests – and still has bugs. git-svn, for example, is completely unusable on Windows in my experience against non-trivial repositories – the Ogre repository killed it consistently. Also, the core Git developers openly admit that they don’t care much about Windows support, so commitment to the platform isn’t there. Mercurial on the other hand runs on Python and in my tests operated identically on all platforms, and is officially supported wherever Python runs. Both systems handled automatically converting line endings and did a better job than Subversion (which requires properties per file).
4. Platform support – GUI
So does either system support the platform range better than the other via the GUI? Not really – as mentioned in the ease of use section they’re both pretty good but still a little flawed in places. On Windows you still require msysGit even with a GUI which makes Git a little slower than on other platforms, but TortoiseGit is surprisingly good considering the vibe of general disinterest in Windows around Git. GitX and Murky are competent if limited on Mac OS X. Both systems have a core Tcl/Tk interface if you really need it, but honestly anyone in the slightest bit sensitive to nice GUI design won’t want to touch them with a 20 foot barge pole, they’re ugly - Tcl/Tk makes Windows 3.0 look shiny in comparison. No clear winner here.
5. Web Host Functionality
A few online services have sprung up to support the DVCS workflows better than simple hosting on Sourceforge or Google Code. Everyone seems to talk about GitHub all the time, but while it’s very pretty, functionally it’s soundly eclipsed by Gitorious which has considerably more robust support for dealing with merge / pull requests, handling them much like a patch tracker but with repository-aware, multi-revision URLs and inline reviewing instead of patch files. They also allow integration of contributor license agreements. GitHub in contrast only allows fire-and-forget pull requests or generalised browsing of commits people have made to other forks, which is not as useful.
BitBucket (for Mercurial) is functionally pretty much the same as GitHub (including the fire-and-forget merge/pull requests and general fork lists), except that it’s not quite as flashy. If we were comparing GitHub and BitBucket, the scores would be the same since aesthetics are not very important in the grand scheme of things, but Gitorious ups the ante, which is why Git wins in this category. I’ve actually talked to the guys at BitBucket, who are extremely friendly and eager to help, and Gitorious-style merge requests are on their TODO lists. They even offered to bump it up their priority lists if it was important to us. Very helpful chaps, but Gitorious still has to win based on the current status.
As an aside, it’s also worth noting that Launchpad is also extremely good in the merge request tracking area, on par with Gitorious. I dropped Bazaar from my evaluation due to lack of time and because it’s the least popular of the 3 in our community by a massive margin (and also, I don’t like the branch containment model they use very much), but Holger played with it and it’s clear that Launchpad deserves a mention for nailing this feature set very well. GitHub may be the poster child, but functionally others deserve more attention.
6. Reliability and Error Handling
As a wise man once said, sh*t happens. How easy a system is to break, and how it behaves when things go wrong is just as important as how well it works under normal operating circumstances, especially for something as critical as a source control system. I didn’t specifically go out of my way to cause problems, but during my many use cases I did encounter some sticking points, which was precisely the point.
Q. How often did problems occur? A. On Mercurial, I never had a crash, on any platform, that I didn’t accidentally cause myself. The two crash incidents I had were during conversion from Subversion, and were caused firstly by an rsync kicking in and changing the source Subversion file system under the feet of the conversion process, and secondly by my killing the process manually because I wanted to interrupt it. So in essence, I have yet to see Mercurial fail unless I break it. On Git, normal operations behaved fine, but during conversion from Subversion I had many problems. Git 1.6.4 and 1.6.5 on Windows regularly crashed mid-conversion, as did Git 1.5 on Linux. Git 1.6.5 on Linux behaved better, but only if you upgraded the (admittedly old) Subversion 1.3 repository to 1.5 or 1.6 first. Mercurial on the other hand seemed to cope with any combination I threw at it, on any platform.
Q. How good was the error reporting, and how easy was it to recover? A. On Mercurial, when I did get a crash (self-inflicted) I got a full Python stack trace with an exception message which was consistently useful, allowing me to quickly rectify the issue. The repository was also valid even in those cases. On Git, all the crashes I had on Windows and Linux simply resulted in the process terminating with no message. I only managed to figure out how to resolve the problems through trial and error; Git was absolutely no help. The repository left behind after these crashes was corrupt.
So, my personal experience was that Mercurial was very robust, and in the rare case of a problem it reported it well. Git was ok most of the time, but some operations were fragile and for example only a very specific version & platform worked for converting the OGRE repository. When Git did fail, my experience was that it didn’t report any useful errors and it basically left you high & dry, scrabbling on the net for answers. Mercurial wins this one outright based on my experiences.
7. Storage Efficiency
A simple one to measure – after converting the 375MB OGRE repository to both systems, and before any custom pruning, Mercurial was about 200MB and Git about 180MB. A manual pruning operation by community member guyver6 brought the Git repository down to 116MB; after pruning out branches in Mercurial I only managed to remove about 7MB. It appears that the primary reason for that is that moved binaries end up getting stored twice in Mercurial, while Git stores them only once after the data has been packed. Mercurial always packs its data as you operate on the repository, while Git lets its storage get sub-optimal in size while you’re working on it in order to give you maximum run-time performance, and ‘git gc’ needs to be run every so often (some commands do this automatically) to re-pack the data for storage efficiency; which is best depends on your point of view, whether you prefer a uniform behaviour or a split behaviour. But overall, Git wins here. In OGRE we actually have a number of moved binaries in our history which Mercurial clearly does not store as efficiently as Git does.
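To illustrate why a moved binary can cost twice the space in one model and not in the other, here is a toy Python sketch. It is only a simplified model under my own assumptions, not the real on-disk format of either tool: one store keys blobs purely by content hash, the other keeps a separate copy per path. The file paths and the 1 MB blob are hypothetical, and compression and deltas are ignored.

```python
import hashlib


def content_addressed_size(history):
    """Bytes stored when each distinct blob is kept once, keyed by its hash."""
    store = {}
    for _path, data in history:
        store[hashlib.sha1(data).hexdigest()] = len(data)
    return sum(store.values())


def path_keyed_size(history):
    """Bytes stored when each (path, content) pair is kept separately."""
    store = {}
    for path, data in history:
        store[(path, hashlib.sha1(data).hexdigest())] = len(data)
    return sum(store.values())


texture = b"\x00" * 1_000_000  # stand-in for a 1 MB binary asset
history = [
    ("Samples/Media/rock.png", texture),    # hypothetical original location
    ("Media/materials/rock.png", texture),  # same bytes after a move
]

print(content_addressed_size(history))  # 1000000
print(path_keyed_size(history))         # 2000000
```

In this model the move is free in the hash-keyed store and doubles the cost in the path-keyed one, which matches the pattern I saw in the repository sizes.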
8. Run-time performance
This was a bit of a mixed bag. I found that Git on Windows was a poor performer on local batch operations that were not constrained by the network, compared to Mercurial. On Linux or OS X performing local operations, performance was practically indistinguishable between the two. Bulk operations that did require network access were a little faster using Git, but not by much. When it came down to everyday operations, the slightly slower msysGit for local operations, and the slightly slower network performance of Mercurial, were barely perceptible. A wash, both systems are fine.
9. Flexibility
When you need to do unusual activity X, can you do it? In Git, the answer is almost always ‘yes’ – it has an enormous number of commands and options and doesn’t really stop you from doing anything, even if it’s a bad idea. Mercurial on the other hand defaults to being quite strict, but there are a number of extensions, both official and unofficial, that can bring the functionality fairly close to Git, but not all the way. A few examples:
Local branches: this is a very useful feature of Git, where you can create lightweight branches in your local repository that you can use for experiments or patch processing without having them become a permanent part of the upstream history. Mercurial branches are all permanent by default, unless you use the LocalBranch extension which is not official. You can replicate the behaviour to a degree with Queues, which is official, but it’s more complicated. Git is better here.
History Modification: changing history is a very bad idea if you’re upstream of anyone else, but in a local private repository it can sometimes be useful. Git provides features such as rebase --interactive in which you can squash together and reorder commits to reorganise them before upstream submission, and filter-branch to make wholesale changes to the history, for example post-conversion to simplify the repository. Mercurial has basic support for some history modification (MQ again, and unofficial extensions like histedit), but they are not as flexible. Most of the time this isn’t an issue, but occasionally it can be limiting – for example I have not so far found a way to remove history before a certain date (or collapse revisions together before a certain date) – the unofficial histedit and collapse extensions do not work for this and MQ won’t let me import regions for qfolding that exist before branches are taken (which is what I want to do – I need my more recent branch history, I don’t need the old stuff). I don’t understand this restriction, I’ve already stripped all the early branches so the early history is entirely linear, why should it care that there’s a branch taken later on?
So, Git wins here. In everyday use you won’t care about this, which is why there’s only a single point between the scores when otherwise Git might deserve a 2-point lead here, but certainly when you’re doing uncommon things Mercurial puts more barriers in your way. In day-to-day operations that’s probably a good thing, since it encourages you not to do stupid things. But when you have a specific need to do something for a very good but rare reason, it’s annoying when you can’t.
10. OGRE Community Support
We ran a survey on this to see what people were using already. Git tends to get a lot of fans talking about it, but I’m also very aware that evangelists aren’t usually the best people to listen to. By asking people what they used practically, rather than which one they thought they might like to use, I hoped to tease out usage numbers. Of course, popularity is no guaranteed measure of quality, but it’s a reasonable indication of how each system might be received by our community. As it turned out, and not unexpectedly, most people had only seriously used one of the DVCSs and liked the one they were using, but had no real view on any of the others.
The sample wasn’t huge – only 64 people voted (but pleasingly a power of 2!) – and the numbers were as follows: Git 52% Mercurial 41% (Bazaar 8%). This nicely translates objectively into a score!
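As a rough cross-check of those percentages, here is a small Python sketch that works backwards from the rounded figures to the vote counts they imply. The counts are my inference from the percentages and the 64 voters, not numbers taken from the survey itself.

```python
VOTERS = 64
REPORTED_PERCENT = {"Git": 52, "Mercurial": 41, "Bazaar": 8}


def implied_votes(percent, voters):
    """Nearest whole number of votes matching a rounded percentage."""
    return round(percent / 100 * voters)


votes = {name: implied_votes(p, VOTERS) for name, p in REPORTED_PERCENT.items()}
print(votes)                # {'Git': 33, 'Mercurial': 26, 'Bazaar': 5}
print(sum(votes.values()))  # 64

# Rounding each share separately is why the percentages add up to 101.
recomputed = {name: round(100 * n / VOTERS) for name, n in votes.items()}
print(recomputed)           # {'Git': 52, 'Mercurial': 41, 'Bazaar': 8}
```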
Conclusion
Well, this is annoying. My 10 criteria actually resulted in equal scores – trust me, I didn’t fake this; I thought very hard about these scores because I found myself being indecisive between the two systems because they both had positive and negative aspects, and I figured the only way to resolve the overall result was to try to score them and let math solve the problem. So much for that idea – it seems when I set them out numerically they are as balanced as they were more abstractly in my head.
So in the end, it comes down to the relative importance of these 10 items. I tried to pick 10 things that were of roughly equal importance to avoid skewing anything, but if really pushed I’d have to say that consistency across platforms and confidence in the reliability and error reporting has to be more important to me personally than most of the other factors. The one exception is the storage size – I want people to clone the source repository so they are encouraged to get involved in development, and I’m aware that the larger it is, the more that’s a disincentive. 200MB is pushing it a bit, and it’ll only get larger – and according to Mercurial’s specs that’s already compressed. In comparison, right now according to my measurements someone grabbing our Subversion repository has to transfer about 48MB of data (compressed) per branch over the network, so Git’s 116MB (again, compressed) is looking very attractive compared to Mercurial’s heft.
I think that if I can find a way to reduce the size of the Mercurial repository to around 100MB, perhaps by stripping the old trunk history somehow (stripping old branches doesn’t appear to have made a great deal of difference), but while still keeping branches after this point, I’d go with Mercurial just because on balance it behaved more consistently for me. If I can’t, I’m still annoyingly on the fence because 200MB feels too big, but I can’t afford to trash all my history or branches. There is lots of talk in the Mercurial wiki about shallow clones and potential history trimming extensions, but nothing seems solid right now. Anyone have any suggestions?
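To show how sensitive the tie is to weighting, here is a minimal Python sketch of a weighted total. The weights are purely hypothetical: I’ve doubled criteria 3 (platform support – core), 6 (reliability) and 7 (storage efficiency) as one possible reading of the priorities above, and left everything else at 1.

```python
# criterion number -> (git_score, hg_score), copied from the results table
SCORES = {
    1: (4, 5), 2: (4, 4), 3: (3, 5), 4: (4, 4), 5: (5, 4),
    6: (3, 5), 7: (5, 3), 8: (5, 5), 9: (5, 4), 10: (5, 4),
}

# Hypothetical weights, not part of the original scoring
WEIGHTS = {3: 2, 6: 2, 7: 2}


def weighted_totals(scores, weights):
    """Return (git_total, hg_total); criteria without a weight count once."""
    git_total = sum(git * weights.get(n, 1) for n, (git, _) in scores.items())
    hg_total = sum(hg * weights.get(n, 1) for n, (_, hg) in scores.items())
    return git_total, hg_total


print(weighted_totals(SCORES, {}))       # (43, 43)
print(weighted_totals(SCORES, WEIGHTS))  # (54, 56)
```

With these particular weights Mercurial edges ahead by two points, but a different choice of weights could just as easily tip it the other way, so treat this as a way of exploring the scores rather than a verdict.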









November 6th, 2009 at 7:24 pm
Impressed as always by the thoroughness of all this. A couple of random drive-by points:
1. I don’t use the GUI much, but when I do it’s git-cola, which I like very much. It’s Qt-based, so it ought to be pleasant on all platforms.
2. I have almost no experience with GitHub but I assumed it was better than Gitorious because people go on about it so much. I’ve used Gitorious extensively for the StatusNet project, and it’s pretty good. Functionality aside, GitHub – ‘evil’ closed source. Gitorious – free software. Most people, oddly, don’t seem to care, but I wouldn’t lock my stuff into someone else’s platform with no means of escape.
November 6th, 2009 at 7:32 pm
I don’t know how well Git handles binary files (Media), but Mercurial doesn’t do that very well, leading to a larger size repository.
SVN, on the other hand, is superior for binary media, because it does binary deltas very well.
I think you could shave off some weight by letting the media be external (just like the dependencies).
Just an idea.
November 6th, 2009 at 9:36 pm
I haven’t worked with either system, or at least not to a degree that is worth mentioning. I have no preference myself; when you switch to a DVCS I’ll have to learn one of these anyway, so I don’t care.
Just from your post and following the discussion in your blog and in the forum, my choice would be Mercurial.
The most important aspect to me is reliability. Hg wins this one clearly for the reasons you outlined. Being able to understand the system itself is a huge point.
Whereas storage efficiency is something minor. It is not as if there are worlds between the two. Hosting
So the 3/5 in reliability outweighs the 5/3 in storage efficiency.
All else being treated equal (to me it more or less is, because I am indifferent there) this makes Mercurial the winner.
November 6th, 2009 at 10:16 pm
I’ll echo haffax’s sentiment in that to me there’s literally nothing worse than having my source control mess up and fail, especially if there’s a risk of a repository corruption. That Mercurial has only failed through no fault of its own and has never corrupted the repository, yet Git has done both, is a monumental difference in my mind.
If I was previously biased towards mercurial because its command line is significantly easier to use, I am now a hard-core, mercurial fanboy! Give me stability and fault tolerant repositories over a smaller checkout size any day of the week.
November 6th, 2009 at 10:55 pm
I think jacmoe’s idea of externalizing the media files might not be such a bad one. I always thought it was a bit weird that I was getting 38 MB of media files when all I wanted was just an SDK. Making the samples + media files into a separate package sounds like it makes sense, and now that the samples framework has gotten an overhaul, the timing seems right too.
November 6th, 2009 at 11:06 pm
As I said on the forum, just flip a coin
November 6th, 2009 at 11:22 pm
On the other hand, the media files are already in the repository – pruning them out of the history would seem to be a bad idea.
My understanding was that the only Git crashes which caused corruption were Git-SVN import related? I’m willing to hazard a guess that Git-SVN is lazy about the updates it performs to the repository – after all, the repository is being created at this time, and if it crashes, you lose no data.
Also, Git-SVN has been getting a lot of updates of late because, yes, there are edge cases (particularly around large repositories) which were discovered during the GNOME switchover, and the planning for the KDE switchover. Git-SVN is, fortunately though, the one tool you will only ever have to touch once.
Considering the major projects using Git these days – of course the Linux kernel, GNOME, Qt and soon KDE – I’d expect that the core features are getting a lot of use, bug detection and fixing. After all, I doubt Trolltech/Qt Software/Nokia (Would they decide on one brand?!) would choose something they found buggy on their 900MB (well packed), 170k commit library!
November 6th, 2009 at 11:28 pm
Owen: Big projects are using Mercurial as well.
Do you think Mozilla and OpenSolaris, Google and Python would have chosen an inferior solution?
The problem with binary files in Mercurial, and most probably Git as well, is that it doesn’t do binary diffs that well, if at all. And the result is a rapidly growing repository, if the files change a lot.
It means that it treats a modified binary file as wholly changed. SVN is the best solution for storing binary assets under version control.
November 6th, 2009 at 11:35 pm
Does it help to rank the users of the repository by how much they will use the obscure features? Maintainer(s), core developers, contributing developers, interested bystanders.
I’m not sure surveying the community on DVCS usage was the right measure, or at least not one that I would have weighted highly. I would have instead asked for experiences maintaining large projects using them.
I have to agree with Eric agreeing with haffax. Mercurial just sounds so much better behaved, and even when the rest of the world isn’t, it gave you good guidance on the problem. Failsafe vs. fail.
November 7th, 2009 at 7:12 am
I really think media needs to be in the repository – I see Google Summer of Code projects add quite a lot of media files, and how are you (the public) going to sync and run them? Upload the media 2 times a week? Come on…
Anyway, my vote goes to Mercurial for the same reasons stated by others.
November 7th, 2009 at 8:17 am
I suggested that the media be put in SVN, because SVN is superior for binary media. And it was only a suggestion for how to shave off some weight. And the media really isn’t updated that often.
November 7th, 2009 at 11:42 am
The removal of media idea has 2 problems: firstly that it won’t affect the size of the repository unless you can purge the media entirely from the history, which I haven’t found a way to do in Mercurial yet, and secondly that although you decrease the size of the repository (which is good for encouraging people to grab it), you make the process of getting everything you need more complicated (which is bad for encouraging people to get involved). I think the inconvenience of having 2 repositories (and tools) nullifies any size reduction benefit.
As I say, the primary difference between Mercurial and Git seems to be that Mercurial stores new copies of binaries when you move them, whilst Git identifies that the hash is the same and doesn’t store it again. In our (distant) history media has been moved a couple of times which is likely a major cause of the problem. If I can figure out how to prune old history I think it will make a major difference. I’m thinking of trying to go via Git just for the history pruning then go back to Mercurial for the final.
The other bugbear I have with Mercurial is that it has no progress meter on remote operations. When you’re cloning / pushing large amounts of data you have no indication of how long it will take or even whether it’s still working; Git however has a nice % completion and K/s report which gives you confidence it’s all ok. With a big Mercurial repo I’m concerned people will kill the clone, thinking it’s not working.
But yes, overall I have to say that I agree with haffax & Eric, having confidence in your tools is very important. Yes, Git only failed when I was using git-svn, which I won’t need to use after the transition. But, the fact that when it did fail it wasn’t at all helpful in reporting why gives me pause. It doesn’t matter if a tool fails rarely – when it’s a critical tool, its behaviour under fail conditions is very, very significant. Mercurial was very transparent – not only do you get a full stack trace and exception message, since Mercurial is all Python I found it considerably easier to investigate. Hell, I even felt comfortable enough to enhance a Mercurial extension (hgsubversion) to handle merges the way I wanted. In contrast Git was a lot more opaque; even with all the theory that’s packed into their documentation it didn’t really assist me in where I should start looking when git-svn fell over; I just had to randomly experiment. So on a gut level I feel happier with Mercurial than Git.
I know that lots of big projects are using Git. I also know that the vast majority of them are Linux-oriented projects (even Qt/Nokia are very Linux/KDE focussed – the majority of the devs at Qt Dev Days were running Linux on their MacBooks!). Being a Linux project implies a certain affinity with the innards of Unix-like systems and, dare I say, a higher than average tolerance for complexity over intuitiveness, which is the slot that Git fits in too. Linux and Git were definitely cut from the same cloth – they’re both extremely clever, very technical pieces of work, but they also require more of the user, particularly in ‘sticky’ situations.
So for the same reason that I prefer my Mac over a desktop Linux, my preference is for Mercurial, although my appreciation for Git is far, far greater than it was at the start of this review. Git’s a superb tool – it just has a few too many Linuxy traits for mere mortals like me to prefer it over the more intuitive Mercurial. I just really, really want to get this repo size down a bit first.
November 7th, 2009 at 5:48 pm
If you want to know that Mercurial is working, and not hung, pass --debug to clone/push/pull – they are aware of the problem.
November 7th, 2009 at 7:10 pm
I haven’t read it yet (but I searched through it) and just have two quick comments:
I’m sure the ten parts of the final score aren’t equally valuable to you, so adding a weight score to the table might bring out more useful values to go on.
Also, bear in mind that anything that isn’t bad by design, but bad due to bugs, not having enough developer time/love yet, etc, is bound to be fixed sooner or later. Since OGRE will be worked on for years to come, considering future usage is a good idea.
Probably obvious stuff, but thought I’d bring it up anyways
Going to read the post now.
November 8th, 2009 at 2:45 am
Thanks for all this, Steve – incredibly interesting and useful. We’re using SVN at work but I’ve been pondering moving to something else for a new development platform because of its poor merging. Well, I think it’s poor but it may just be my poor understanding based on years of using ClearCase.
November 8th, 2009 at 8:35 am
Thanks for your level-headed comparison there, Steve. Very informative.
For your repo size problem I can only propose to use “hg convert” with the filemap option to get rid of those big files. Maybe you can get the size down by using the branch-sort option, too.
November 9th, 2009 at 4:15 pm
@Anders: yeah, I actually tried to pick criteria that were of approximately the same importance to me to make the numerical approach sound, but as I say if pushed I would favour equal platform support and ‘confidence’, which is why I’m coming down on Mercurial’s side.
As for design vs bugs, this also supports Mercurial over Git for me. That’s not to say Git has a bad (technical) design, clearly it doesn’t, but Mercurial’s overall approach also sits a little more comfortably with me; cross-platform core, re-using existing knowledge rather than reinventing, being simple & intuitive to use wherever possible.
@Face: I’m actually using hgsubversion, because the results are faster and a little better for me (I even modified it so that merges can be ready on conversion), and you can incrementally pull revisions from SVN afterwards. It has a filemap option too – I haven’t managed to get it to do what I want yet.
November 10th, 2009 at 4:09 am
When comparing repository sizes, there are two numbers I pay attention to in Windows-land. The first is the size in bytes of the repository. The second is size on disk, which tends to be much larger in a Mercurial repository, as the Mercurial repository stores each of its files individually. This is contrasted with a big pack file in Git (and Bazaar, whose 2.0 release impressed me).
What definition of repository size do you use?
November 10th, 2009 at 11:21 am
It’s the size in bytes – I’m well aware of the cluster size issues that make the size on disk larger – the ‘on disk’ figure is 228MB for Mercurial as opposed to 207MB for the actual data.
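To make the cluster-size effect concrete, here is a small Python sketch of the arithmetic. It assumes a 4096-byte cluster and made-up file sizes, so it only shows the shape of the overhead, not the real figures for either repository.

```python
import math


def size_in_bytes(file_sizes):
    """Sum of the actual data in each file."""
    return sum(file_sizes)


def size_on_disk(file_sizes, cluster_size=4096):
    """Sum of file sizes with each file rounded up to whole clusters."""
    return sum(math.ceil(size / cluster_size) * cluster_size for size in file_sizes)


many_small_files = [1500] * 1000  # hypothetical: 1000 files of 1500 bytes each
one_pack_file = [1500 * 1000]     # the same data held in a single file

print(size_in_bytes(many_small_files), size_on_disk(many_small_files))  # 1500000 4096000
print(size_in_bytes(one_pack_file), size_on_disk(one_pack_file))        # 1500000 1503232
```

The data is identical in both cases; only the number of files changes how much rounding waste the filesystem adds.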
As an experiment to examine how well compressed things were already, I tarred and bz2’ed both repositories. I found that both hardly changed at all, suggesting they’re both well compressed already – Git shrank from 116MB to just 113MB, and Mercurial from 207MB to 200MB.
So it’s not compression, and that also suggests there won’t be any significant further gains during compressed network transfer. As I say, looking inside the filesystems of the repositories the main issue appears to be that Mercurial stores data in a filewise fashion so moved binaries are less efficient than in Git where a hashed binary is stored once regardless of how many folders it has existed in over the history. Generally, historical filesystem reorganisation is less efficient in Mercurial because of the filewise storage, it just shows up more obviously in binaries. Hopefully they might find a way to handle this more efficiently in the future, or at least give us a tool for stripping off old history before a certain date / revision. In the meantime I think I may have to go via Git just to strip off history, which is not ideal.
March 16th, 2010 at 12:31 pm
For completeness, after some work we managed to get the Mercurial repository down to under 100MB and therefore decided to go with it.