VCS Evolution by Distribution

As mentioned in recent blog posts, and thanks to OUCS, some members of Sysdev, myself included, were fortunate enough to attend the UKUUG 2011 Spring Conference. Of the many talks, one that particularly stood out for me was given by Simon Wilkinson of the University of Edinburgh on Git, a Distributed Version Control System (DVCS). Although this was a 90-minute introductory talk, its level was well judged. It introduced the fundamentals that a user familiar with a Centralised VCS would need in order to start using Git, stepping through a set of basic workflows and using directed graphs to explain some of the file versioning scenarios that arise in distributed development.

In case this is unfamiliar to you, a VCS is invaluable to developers because it’s where all the shared resources for a project are stored for safekeeping. It records not only source code, configuration, and documentation, but also each and every change to those project resources, so that changes can be tracked and, if necessary, reverted. It’s called ‘Version Control’ because it allows, and promotes, the application of your chosen versioning scheme to files. It’s also sometimes called ‘Source Control’, because it is commonly used by software developers to manage changes to their source code. This blog post considers the VCS from the multi-user point of view, but a VCS can also be invaluable for a single-developer project.

My own experience with Version Control traces a path through RCS, CVS, and Subversion, each of which builds on the previous one to develop the sense of working with a single shared, centralised repository. In this sense they (particularly CVS and Subversion) are Centralised VCSs. Each user working with a Centralised VCS typically checks out a particular current set of files from the central repository before making changes. After making changes locally, the user then accesses the shared repository across the network for VCS interactions such as comparisons, commits, and updates. For any activity that needs the central repository, the user is competing with others for access to the same shared resource. Since the resource is shared, it carries with it all the baggage and overheads of the locking needed to avoid leaving it in an inconsistent state.

Git has a handy solution to this: when you ‘check out’ a project repository using Git you are actually ‘cloning’ it. You get a complete copy of the repository, lock, stock and barrel, including all the history. This neatly side-steps some of the contention and locking overheads, because many operations that would have been remote repository interactions with Subversion become local ones, and the performance gains are noticeable. Speed is not the only difference. Git includes other features to make VCS users happier that seem to have been missing from Centralised VCSs, such as squashed commits, simpler and less resource-heavy branching, and simpler repository file formats. Yet another benefit that Git brings to the table is that its repositories are often much smaller than the equivalent repositories in other VCSs.
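To make the idea of cloning a little more concrete, here is a minimal sketch in Python. It is a toy model rather than how Git is actually implemented: the `ToyRepo` class and its `commit`, `clone`, and `log` methods are names I have made up for illustration. It only aims to show that a clone carries the whole history as a directed graph of commits, so that committing and reading the log need no contact with the repository it was cloned from.

```python
import copy
import hashlib


class ToyRepo:
    """Hypothetical toy repository: commits keyed by hash, linked to parents."""

    def __init__(self):
        self.commits = {}   # commit id -> (tuple of parent ids, message)
        self.head = None    # id of the most recent commit, or None if empty

    def commit(self, message):
        # Each commit points at its parent, which forms a directed graph.
        parents = (self.head,) if self.head else ()
        commit_id = hashlib.sha1(repr((parents, message)).encode()).hexdigest()
        self.commits[commit_id] = (parents, message)
        self.head = commit_id
        return commit_id

    def clone(self):
        # A clone copies the entire commit graph, not just the latest files.
        return copy.deepcopy(self)

    def log(self):
        # Walk parent links from head back to the root; a purely local operation.
        messages, current = [], self.head
        while current is not None:
            parents, message = self.commits[current]
            messages.append(message)
            current = parents[0] if parents else None
        return messages


origin = ToyRepo()
origin.commit("initial import")
origin.commit("add README")

local = origin.clone()
local.commit("fix typo")   # recorded in the clone only; origin is untouched

print(local.log())    # ['fix typo', 'add README', 'initial import']
print(origin.log())   # ['add README', 'initial import']
```

The clone can record new commits and walk its full history while the original stays exactly as it was. Sharing those commits with others is a separate, explicit step, which this sketch leaves out.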

Sysdev currently run a Subversion VCS service for use by OUCS staff and OUCS-run projects. We also count ourselves among the users of this service, since it is used by our home-grown configuration management system rb3, as well as for the projects undertaken by Sysdev staff. Subversion was originally designed as a largely compatible replacement for its predecessor, CVS, and as such its syntax has much in common with that of CVS. In turn, Subversion users will find some familiarity in the syntax that the Git developers have chosen for the Git toolset. Tools are available to migrate repositories from Subversion to Git (git-svn), just as tools were developed for migration from CVS to Subversion (cvs2svn). However, being a direct Subversion replacement was apparently not one of Git’s primary design goals, and if you are used to a Centralised VCS, a DVCS can take some adjustment. I’m no expert Git user, but as a long-standing user of other VCSs I believe that Git has speed and efficiency improvements, along with other convenience benefits, that are well worth consideration by our users. Naturally, for a team such as ours to consider providing a DVCS service such as Git, the demand for it would have to be expressed by our users, and assessed.

Git is an example of a project that was born from Linux kernel developers wanting a better tool, and as such it fits well in a UKUUG Conference, but at its core Git has much wider appeal to developers in general. This seems to me to be typical of the UKUUG Conference. If you’re not part of a UNIX team you may think that a conference organised by a UNIX users group is too specialised to be relevant, and perhaps not really your thing. If you think along those lines, I’d like to suggest some reconsideration: these are useful events, with talks likely to be relevant and of interest to a broad range of IT professionals. As if to highlight this, a recent announcement was made: the UKUUG have changed their name to FLOSS UK, where FLOSS stands for Free/Libre Open Source Software, the name change reflecting a broader scope and a correspondingly wider appeal.
