There are an unbelievable number of version control systems out there.  They all have their strengths and weaknesses, and almost everyone has a strong opinion about which one is best.  These tools weren’t written by some group of developers who were out of touch with their end users, either.  They are all developer tools, written by developers for developers.  Many of them were born from previous version control packages, the way SVN evolved from CVS.  The purpose was to keep what was good about the previous system and redo what the authors felt was lacking.

At this point, we’re pretty far along in the evolution of version control development.  However, there’s a fundamental problem with all of these version control systems.  Even though they set out to do pretty much the same thing, there’s no common interface to any of them.  Even worse, the interpretation of common concepts is completely different from one system to the next.  It’s not noticeable if you’ve only used one.  The more you use, though, the more obvious it becomes.  The only attempt at a common interface that I can recall is Microsoft’s SCC interface.  Unfortunately, it served a very specific purpose and aimed for the lowest common denominator.

If you have ever tried to switch from one version control system to another, you have suffered through this.  If you’ve ever had to port the version history from one system to another, then you likely gave up and stuck with the old system or lost all of your history.  The interoperability of version control systems is horrendous.

Standardized interpretation

Let’s start with version numbers, the ID that identifies a particular version of a file… how basic is that?  CVS has its own dot notation, tracked per file.  SVN and Team Server keep a global number that gets incremented when any file is checked in.  This difference in interpretation leads to fundamental differences in how these systems implement labeling and branching.
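To make the difference concrete, here is a minimal sketch of the two numbering schemes.  It is a toy model only: the class names are made up for illustration, branches are ignored, and the per-file scheme simply assumes every file starts at 1.1 and bumps its last number on each check in.

```python
from typing import Dict, List


class PerFileNumbering:
    """Toy model of CVS-style numbering: each file has its own counter."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}

    def commit(self, files: List[str]) -> Dict[str, str]:
        # Only the files in this check in get a new revision number.
        for name in files:
            self._counts[name] = self._counts.get(name, 0) + 1
        return {name: f"1.{self._counts[name]}" for name in files}


class GlobalNumbering:
    """Toy model of SVN-style numbering: one counter for the whole repository."""

    def __init__(self) -> None:
        self._revision = 0

    def commit(self, files: List[str]) -> Dict[str, int]:
        # Every check in bumps the shared counter, whichever files it touches.
        self._revision += 1
        return {name: self._revision for name in files}
```

In the first model, "revision 1.2" only means something alongside a file name.  In the second, a single number identifies the state of the whole repository, which is part of why labels and branches end up working so differently.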

How could these common concepts be interpreted so differently?  It means that in order to move from one version control system to another, you not only have to learn the new commands, but you must also understand that system’s interpretation of what they mean.  Labeling and branching in CVS and SVN are very different, even though the basic concept is the same.  It would be nice to see some standardization of how these concepts are implemented across version control systems.

Standardized commands

Imagine a world where databases each had their own proprietary language that was completely unrecognized by any other database.  Alright, maybe that’s not so far off from reality, but beyond PL/SQL, T-SQL and others, there’s ANSI SQL, which is the common language of all relational databases.  It theoretically (emphasis on theoretically) allows developers or DBAs to write SQL that can be understood by any database that supports ANSI SQL.  It’s the idea of polymorphism at work outside the code: a common interface to the functionality, with each database engine free to implement that functionality however it wants.

Version control would be a much happier place if there were a common command language that each version control system was required to implement.  At the ground-floor level, it would make the simple day-to-day operations consistent.  Update, commit, compare and revert would all be standard.  Once you learned how to use one version control system, you would know how to use ten.  Is this so hard?
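As a rough sketch of what that could look like, here is the one-interface, many-implementations idea in code.  The VersionControl class and its method names are hypothetical; no real tool ships this interface.  The two backends only build the command lines, as I understand them, that svn and cvs use for these operations today; nothing is executed.

```python
from abc import ABC, abstractmethod
from typing import List


class VersionControl(ABC):
    """Hypothetical common command set (an assumption, not a real API)."""

    @abstractmethod
    def update(self, path: str) -> List[str]: ...

    @abstractmethod
    def commit(self, path: str, message: str) -> List[str]: ...

    @abstractmethod
    def compare(self, path: str, old: str, new: str) -> List[str]: ...

    @abstractmethod
    def revert(self, path: str) -> List[str]: ...


class Subversion(VersionControl):
    def update(self, path):
        return ["svn", "update", path]

    def commit(self, path, message):
        return ["svn", "commit", "-m", message, path]

    def compare(self, path, old, new):
        return ["svn", "diff", "-r", f"{old}:{new}", path]

    def revert(self, path):
        return ["svn", "revert", path]


class Cvs(VersionControl):
    def update(self, path):
        return ["cvs", "update", path]

    def commit(self, path, message):
        return ["cvs", "commit", "-m", message, path]

    def compare(self, path, old, new):
        return ["cvs", "diff", "-r", old, "-r", new, path]

    def revert(self, path):
        # CVS has no revert command; "update -C" replaces the local
        # modifications with a clean copy from the repository.
        return ["cvs", "update", "-C", path]


def daily_routine(vcs: VersionControl, path: str) -> List[List[str]]:
    """The caller's code is identical regardless of which system is behind it."""
    return [vcs.update(path), vcs.commit(path, "Fix typo")]
```

The point is the last function: daily_routine never needs to know which system it is talking to, which is exactly what a user would get from a standard command set.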

Import/export standardized formats

A huge point of pain in version control is having to switch from one system to another.  It sounds crazy… who would switch version control systems at the same company?  However, it’s happened to me three times in my career.  Companies and departments merge, where both sides used different version control systems.  Sometimes your needs require a version control system with more functionality.  There are lots of reasons for switching systems.

Unfortunately, each time I’ve had to switch version control systems, it’s resulted in a complete loss of history. Therefore, there needs to be a standardized import/export format.  Many applications make use of XML as a way to import and export their data to share with other applications.  RSS is probably one of the best examples of this.  Version XML would allow you to export your version history for all files in the repository.  That history, as well as the files or deltas, could then be archived in tar or zip format into a single file.  The corresponding import functionality would be able to read this standard format and structure, and would be able to import that version history. This would completely remove the pain of having to switch from one version control system to another.
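Here is a minimal sketch of the round trip I have in mind.  The element and attribute names (history, revision, file, id and so on) are invented for illustration, since no such standard exists, and the file contents or deltas are left out to keep it short.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class Revision:
    file: str
    revision: str
    author: str
    date: str
    comment: str


def export_history(revisions: List[Revision]) -> str:
    """Serialize check in history to a hypothetical 'Version XML' document."""
    root = ET.Element("history")
    for rev in revisions:
        node = ET.SubElement(
            root, "revision",
            file=rev.file, id=rev.revision, author=rev.author, date=rev.date,
        )
        node.text = rev.comment  # the check in comment is the element body
    return ET.tostring(root, encoding="unicode")


def import_history(xml_text: str) -> List[Revision]:
    """Read the same format back, as the importing system would."""
    root = ET.fromstring(xml_text)
    return [
        Revision(n.get("file"), n.get("id"), n.get("author"), n.get("date"), n.text or "")
        for n in root.iter("revision")
    ]
```

If every system could write and read something like this, the history exported from the old system would be exactly the history the new system imports.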

Version Query Language

Basic file check in and check out is the core of any version control system.  However, aside from being able to diff two specific versions of a file, there is very little cross-version analysis in most systems, despite the fact that the source control repository inherently contains this information.  Going back to the SQL analogy used before, a version control repository is a database, and users should be able to query it.

I would love to see a standard version query language and table structure that allowed you to run any query you wanted through the version control system’s engine.  You might be able to write queries like the following (pseudo-queries only):

Who wrote the code at a specific location in a source code file?
select author where file = “x” and line = y
where x is the file and y is the file line number.

Across all files, which version’s check in comments contain a specific string pattern?
select revision_number, file where comment like “x”
where x is the check in comment to look for.

What check ins has a particular person made in the last 10 days?
select file, revision_number where author = “x” and checkin_date >= y
where x is the user and y is the date 10 days ago.

Which check ins had the biggest effect on a particular file?
select top 5 revision_number where file = “x” order by lines_changed desc
where x is the file.
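To show that the pseudo-queries above are not far-fetched, here is a small sketch that runs three of them against an in-memory SQLite database.  The checkins table, its columns and the sample rows are all made up for illustration; a real repository would have to expose its history in some similar form.  SQLite uses LIMIT rather than TOP, and the per-line author query is left out because it needs line-level history that this table does not model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE checkins (
           file TEXT, revision_number INTEGER, author TEXT,
           checkin_date TEXT, comment TEXT, lines_changed INTEGER)"""
)
# Hypothetical sample history; dates are ISO strings so they sort correctly.
conn.executemany(
    "INSERT INTO checkins VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("main.c", 101, "alice", "2008-03-01", "Initial import", 250),
        ("main.c", 107, "bob", "2008-03-09", "Fix buffer overflow", 12),
        ("util.c", 108, "alice", "2008-03-10", "Fix typo in comment", 1),
        ("main.c", 112, "alice", "2008-03-12", "Rewrite parser", 400),
    ],
)


def comments_matching(pattern):
    """Which check in comments match a pattern, across all files?"""
    return conn.execute(
        "SELECT revision_number, file FROM checkins "
        "WHERE comment LIKE ? ORDER BY revision_number",
        (pattern,),
    ).fetchall()


def checkins_since(author, since):
    """What has a particular person checked in on or after a given date?"""
    return conn.execute(
        "SELECT file, revision_number FROM checkins "
        "WHERE author = ? AND checkin_date >= ? ORDER BY revision_number",
        (author, since),
    ).fetchall()


def biggest_changes(file, limit=5):
    """Which check ins changed the most lines in a file?"""
    return conn.execute(
        "SELECT revision_number FROM checkins "
        "WHERE file = ? ORDER BY lines_changed DESC LIMIT ?",
        (file, limit),
    ).fetchall()
```

For example, comments_matching("Fix%") finds the two bug-fix check ins, and biggest_changes("main.c") ranks the parser rewrite first.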

We took a stab at that in our Tools for Visual Studio product with the Find Version feature.  It allows you to find the versions of one or more files that match specific criteria, such as the ones above.  It does so with a GUI, though, not a query language like the one I proposed.  The most difficult part of this feature was finding a way to do it across several different version control systems while maintaining a consistent interface.

In an industry like software development, where most people understand the concept of one interface and many implementations, it’s amazing to me that this hasn’t caught on yet in the area of version control.  Maybe someday that will happen, and moving to a new version control system won’t waste time that could otherwise be spent writing code.