30 July 2008

Backups Reorganization pt. 5: SCM

There are two Source Control Management (SCM) systems in use by the client: Subversion and Mercurial.

Subversion
Subversion is a client/server based SCM with a central repository to which clients synchronize. There is a built-in tool available called "svnadmin hotcopy" which, as the "hotcopy" parameter suggests, guarantees that I'll get a consistent, restorable backup of a repository.

Great. Now the problem is, where are the repositories? There are dozens of repositories that need to be dumped and I don't know what they are or where they are located in the root file system. The client creates repositories on the fly wherever they want. That's just the way they do their work. I had to come up with some scripts to help me determine the repository locations by traversing a given part of the file system. Of course, I used Python to do that work.

The gist of the script(s) is that it will enumerate every directory it traverses. It then tries to look for "signature" directories and files. By "signature" I mean that these are specifically named and if you see them then you have probably found a Subversion repository which can be backed up. Python made quick work of this task with the os.walk() method, the Set type, and the accompanying set-based math.

I have put all of my code into a python package so that it'd be easy to build and deploy for the various client sites. I also tried to make this code fairly generic so that it could be re-used with other clients. I used Test-Driven-Development while writing these scripts. Of course, the code is under version control on my company's own Subversion repository.

Is it really worth traversing the file system to search for Subversion repositories? Yes! I found dozens of repositories scattered all around. Each has a file system backup performed on its host partition but none were being dumped as they should to guarantee a consistent, restorable backup.

Mercurial
Mercurial is a distributed revision control system. There is no centralized server by design. I didn't find a lot of good information about getting a good backup. Some items pointed to "hg clone". From some reading, it seems that backups consist of the copies kept on developers' machines. Some folks keep a central computer on which they have a Mercurial client and keep a master copy. I decided that I was just going to let the backups stand as is and rely on the file system level backup. I'm not entirely happy about it and may want to revisit this decision as soon as I take care of the other applications that need attention.

Next up...Oracle.

No comments:

Post a Comment