Eyes Above The Waves

Robert O'Callahan. Christian. Repatriate Kiwi. Hacker.

Tuesday 18 January 2005

Forestry


Over the years I've been hacking on Mozilla, I've wasted a lot of time grappling with the limitations of CVS. There are a few main issues for me:


  • Frequently I need to collect all the changes I've made to a tree into a patch file, by diffing my tree against the trunk. CVS has to contact the server to do this, which takes a long time.
  • CVS updates are slow.
  • Managing many trees is painful, especially creating a tree with an initial checkout (very slow), building it, and keeping the trees up to date. This means that I tend to use only a few trees, usually just one. This is bad because various pieces of work get mixed up in a tree, and my patches or even checkins sometimes contain fragments from logically separate changes. Ugh.


Since Mozilla is my full-time job now, I'm investing some energy in tackling these problems. I looked at various "CVS replacement" programs, especially the version control systems that claim to interoperate with CVS and provide local branching. But for various reasons they're not suitable; the main reason is that none of them has been proven on codebases as large as Mozilla.

So I've decided to use rsync to maintain a mirror of the entire Mozilla CVS repository on my local machine. Checking out and updating many trees from it is very fast. Likewise, doing CVS diffs is now a local and very fast operation. But how costly is it to maintain a synchronized copy of the entire repository, given that it records all the changes ever made to 5 million lines of code over six years? Surprisingly, it's not bad at all. It's using about 2.6GB of disk space. It took about 30 minutes to pull down the first time, using rsync with compression. My office Internet link is pretty good, but I am in New Zealand. It takes about 20 seconds to resynchronize on a day when there aren't many checkins. Altogether I'm extremely pleased. My only fear is that a lot of people will start doing it and mozilla.org will have to restrict rsync access...
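The mirror-then-checkout workflow can be sketched roughly as follows. This is a minimal illustration, not the exact commands used here: the rsync source URL, the local paths, and the module name "mozilla" are placeholders, and the --delete flag is an assumption about how one might keep the mirror tidy.

```python
import subprocess

# Hypothetical values: the post gives neither the rsync source nor local paths.
REMOTE_REPO = "rsync://cvs.example.org/mozilla-cvsroot/"
LOCAL_REPO = "/home/me/mozilla-cvsroot"


def mirror_command(remote=REMOTE_REPO, local=LOCAL_REPO):
    """Build the rsync invocation that pulls or refreshes the local mirror."""
    return [
        "rsync",
        "-a",        # archive mode: recurse, keep timestamps and permissions
        "-z",        # compress data in transit
        "--delete",  # assumption: drop local files that vanished upstream
        remote,
        local,
    ]


def checkout_command(tree_dir, module="mozilla", local=LOCAL_REPO):
    """Build a cvs checkout that reads from the local mirror, not the server."""
    # The first -d names the repository; the second names the output directory.
    return ["cvs", "-d", local, "checkout", "-d", tree_dir, module]


def diff_command():
    """Build a unified diff of a working tree against its repository."""
    # Run this inside a checked-out tree; cvs reads the repository location
    # from the tree's CVS/Root file, so the diff stays local.
    return ["cvs", "diff", "-u"]


def run(cmd, cwd=None):
    """Run a command and return its exit status.

    No check=True here: cvs diff exits non-zero when it finds differences.
    """
    return subprocess.run(cmd, cwd=cwd).returncode


if __name__ == "__main__":
    # Print the commands rather than running them, since the paths are made up.
    for cmd in (mirror_command(), checkout_command("tree-a"), diff_command()):
        print(" ".join(cmd))
```

Rerunning the rsync command is the resynchronization step; only the changed repository files cross the network, which is why a quiet day costs so little.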

Another thing I'm doing is using ccache to speed up builds and share build products across trees. Since in most trees only a few files are modified, building in each tree will mostly produce the same object files. ccache does a good job of recognizing when the same file is being compiled in different places and just copying the object file from its cache into the output directory. In fact you can do better just by hardlinking the file from the cache to the output, so you only have one copy of the output file shared by all your trees. There is one problem with ccache currently that I believe I have fixed... I'll write about it soon, once I've verified that my fix works.
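As a rough sketch of the shared-cache setup, the build environment for every tree could be prepared like this. CCACHE_DIR and CCACHE_HARDLINK are real ccache environment variables; the cache path and the choice of gcc/g++ as compilers are assumptions for illustration, and how the build picks up CC and CXX depends on your configuration.

```python
import os

# Hypothetical location for a cache shared by all trees.
SHARED_CACHE = "/home/me/.ccache-mozilla"


def ccache_env(cache_dir=SHARED_CACHE, hardlink=True, base=None):
    """Return a build environment that routes compiles through ccache."""
    env = dict(os.environ if base is None else base)
    # Every tree points at the same cache, so an object file compiled in one
    # tree can be reused when an identical compile happens in another.
    env["CCACHE_DIR"] = cache_dir
    if hardlink:
        # Ask ccache to hard-link cached objects into the output directory
        # instead of copying them. This needs the cache and the trees to be
        # on the same filesystem.
        env["CCACHE_HARDLINK"] = "1"
    # Assumption: the build honours CC and CXX from the environment.
    env["CC"] = "ccache gcc"
    env["CXX"] = "ccache g++"
    return env


if __name__ == "__main__":
    env = ccache_env(base={})
    for name in sorted(env):
        print(f"{name}={env[name]}")
```

One thing to keep in mind with hard links: the object file in a tree and the one in the cache are the same file on disk, so anything that modifies an object file in place affects every tree sharing it.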


Comments

Jeff Walden
I recently looked into Subversion with respect to the Mozilla codebase (via IRC logs, mostly). There are some people who'd love to switch over to svn for a variety of reasons. Some of the bigger reasons are to version directories, to make files easily movable without repository hacking, and to enable diffs that encompass new files without requiring commit access to source (not quite sure whether the last is really true or not, but from what I read I'd put the probability of that being true at 90% or higher).
Interestingly, I never ran across the "unproven" argument for it, although I wasn't looking particularly hard. The arguments I found were primarily related to the inflexibility of Mozilla's ancillary tools: Bonsai, LXR, and (maybe?) Tinderbox. From what I know, no one's taken the time to add svn-compatibility to any of them, and until then we aren't likely to see svn in use with Mozilla.
Sometime soon I intend to ask about this on IRC, although I'm not sure when. I'd love to help as I can, but unfortunately I doubt I have the expertise needed to make webtools work with svn (and I know I don't have the time when I have other Mozilla-related things to fix).
Adam
Hi Rob,
Here we use CM Synergy, which is a product from Telelogic. It costs us a fortune, and although the DB is good, the front-end GUI is a pile of goo, and its usability is an abomination.
It's pretty good at branching and going parallel though, but reporting on it is terrible.
We were looking at Rational ClearCase but it's too expensive. We were also looking at the new open-source Subversion. What are your thoughts on that?
I'd love to move to something open source to drop our licensing fees. The only problem is that the Suits fear it and want their 0900 support #.
Robert O'Callahan
Right, getting svn to work with Mozilla webtools would be hard.
'Unproven' has a few facets. Last I checked, the CVS-to-svn conversion tools used tons of memory on relatively small repositories, and it wasn't at all clear they would work reasonably on the Mozilla repository. That would be an interesting experiment to try. Then of course there's the question of how things scale to Mozilla's size and number of users once you're up and running.
And of course you have to consider whether svn is the right thing to switch to, if you're going to invest a lot of energy in switching. Personally I think we really need the ability to have totally local branch management, as I mentioned, and plain svn doesn't provide that.
Robert O'Callahan
Adam, you might be able to buy Subversion support from someone.
BitKeeper might be good to look at. You have to pay for it for commercial use, but the people are open-source friendly and I like the distributed model they have.
Perforce is the commercial thing that a lot of people use and seem to like.