17 September 2008

Python popen deadlocks

I have a Python script that calls out to an external program using os.popen3(). That function returns file-like objects for stdin, stdout, and stderr. I need to examine both stdout and stderr separately.

After running this script for a while I started to see it hang at times. This was especially true after I added a parameter which increased the amount of stderr output. It turns out that there is a well-known deadlock issue with the popen family of functions: if the child fills the pipe buffer for stderr while my script is blocked reading stdout, each side waits on the other forever. Or, rather, it was not known to me until yesterday.

I started to tinker in my script using select.select() as a solution. That seemed a little foreign to my way of thinking, so I changed direction and used separate threads for reading from the stderr and stdout file handles that os.popen3() returns. That seemed to do the trick, so I'm pretty happy about that.
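Roughly what the two-thread approach looks like, as a minimal sketch rather than the actual script. It assumes subprocess.Popen in place of os.popen3() (which newer Python versions no longer provide), and the run_and_capture helper and the noisy child command are made up for illustration.

```python
import subprocess
import sys
import threading


def run_and_capture(cmd):
    """Run cmd and return (exit_code, stdout_bytes, stderr_bytes).

    Hypothetical helper: each pipe gets its own reader thread, so the
    child can never stall on a full pipe that nobody is reading.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    chunks = {"out": [], "err": []}

    def drain(pipe, key):
        # Read until EOF, collecting whatever the child writes.
        for block in iter(lambda: pipe.read(4096), b""):
            chunks[key].append(block)
        pipe.close()

    readers = [
        threading.Thread(target=drain, args=(proc.stdout, "out")),
        threading.Thread(target=drain, args=(proc.stderr, "err")),
    ]
    for reader in readers:
        reader.start()
    for reader in readers:
        reader.join()

    exit_code = proc.wait()
    return exit_code, b"".join(chunks["out"]), b"".join(chunks["err"])


# Made-up child process that writes far more than a pipe buffer holds
# to stderr before it writes anything to stdout.
CHILD = "import sys; sys.stderr.write('e' * 200000); sys.stdout.write('o' * 200000)"
code, out, err = run_and_capture([sys.executable, "-c", CHILD])
```

Reading stdout to the end first and stderr afterwards would hang on a child like this one, because the child blocks on its stderr write before it ever touches stdout.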

11 September 2008

Backups Reorganization pt. 10: Verification

I set up automated verification for every backup job that the 4 backups servers run. That simply involved the correct usage of the --verify option to rdiff-backup. This option calculates SHA1 checksums on files in the backups and compares them to the checksums recorded in the backups metadata. I don't put much stock in this process but feel that it is necessary to actually perform. Besides, the overhead is pretty low since the processing stays entirely on the backups server.
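The idea behind that kind of check can be sketched in a few lines. This is only an illustration of comparing SHA1 checksums against recorded values, not how rdiff-backup implements it; the function names and the dictionary of recorded checksums are hypothetical.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, block_size=65536):
    """Return the SHA1 hex digest of a file, read in blocks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_mismatches(recorded):
    """Return the paths whose current checksum differs from the recorded one.

    recorded is a hypothetical {path: sha1_hex} mapping standing in for
    whatever metadata the backup tool keeps.
    """
    return [path for path, expected in sorted(recorded.items())
            if sha1_of_file(path) != expected]


# Small demonstration with a throwaway file.
workdir = tempfile.mkdtemp()
sample = os.path.join(workdir, "sample.txt")
with open(sample, "wb") as handle:
    handle.write(b"backup data")

recorded = {sample: sha1_of_file(sample)}
clean = find_mismatches(recorded)       # nothing has changed yet

with open(sample, "wb") as handle:
    handle.write(b"corrupted")
damaged = find_mismatches(recorded)     # the file no longer matches
```

Everything here runs against files already on the backups server, which is why this style of check costs the source server nothing.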

Rdiff-backup also sports a --compare-hash option in addition to --verify. The --compare-hash option actually calculates SHA1 checksums on the source server to compare to what is in the backups metadata. That seems nice but is probably going to be CPU intensive on the source server which I don't want. Still, I might just set it up as a weekly process to run in off hours. We'll see.

One thing that I learned is that Python < 2.4 doesn't support the decorator syntax that was introduced in 2.4. Rather than monkey with the small syntax differences and determine at runtime which Python version was executing the script, I just decided not to use decorators in the one place where it would have been nice (but not necessary) to do so.
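For what it's worth, the @ syntax is only shorthand for rebinding the function name after the def, and the long form works on any Python version. A small sketch, where the logged decorator and verify_job function are made-up names:

```python
calls = []


def logged(func):
    """Hypothetical decorator that records the name of each function called."""
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


def verify_job(name):
    return "verified " + name


# Same effect as writing "@logged" on the line above the def,
# but without the 2.4-only syntax.
verify_job = logged(verify_job)

result = verify_job("nightly")
```

The cost is purely cosmetic: the wrapping happens after the function body instead of being announced above it.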

The next piece of this project will be to automate the test restores. That'll be slightly more tricky but ought to be satisfying to actually program up.

03 September 2008

Backups Reorganization pt. 9: Retrospect

I finally got around to addressing the backup of the one server that resides behind the firewall. I can't use the approach I had been using, that is, having the backups server initiate the backup, because the backups server is itself outside of the firewall. The simplest thing to do is to back up the target server from within the firewall. As it turns out, my company has a couple of dozen terabytes of backup space inside the firewall, all controlled by the Retrospect backup software. Setting up the job on the Retrospect server was easy and the initial backup of the target server is running now.