27 August 2008

Shelving AMQP for now

I recently put a lot of effort into working with AMQP, the relatively new message queuing protocol. This involved reading the version 0.8 spec, the source code of some implementations, the RabbitMQ docs and boards, and the source code of the py-amqplib module.

Ultimately, I could not get the AMQP client and broker to behave in ways that I expected. This is a real disappointment because I had high hopes for it in future projects that will need to be scalable. I'll give it another 6 months to a year and then see where things are.

26 August 2008

Python logging.fileConfig() weirdness

I wrote a draft of a Python application that I intended to turn into a Windows service. In it, I used the standard logging module and its fileConfig() function to set up multiple log handlers and other cool logging-related things. All of that worked great when run as a simple script on a Windows box.

When I wrapped up the functionality in a Windows service, however, the logging killed the service almost immediately with a "Bad File Descriptor" IOError exception being raised. I tried quite a lot of things to get it to work again but to no avail.

In the end, I replaced logging.fileConfig() with logging.basicConfig() and lo and behold it worked again. I don't know why it worked. Searching with Google yielded a few things about "atexit" on Windows or some such peculiarity. I don't care because my app works again.
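
For reference, here is a minimal sketch of what the swap to logging.basicConfig() can look like. The logger name, format string, and log path are assumptions for illustration and not the ones from my service. Note that basicConfig() only takes effect if the root logger has no handlers yet.

```python
import logging


def configure_logging(log_path):
    """Send log records to a single file via basicConfig (no fileConfig)."""
    logging.basicConfig(
        filename=log_path,  # hypothetical path chosen by the caller
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    # "myservice" is a made-up logger name for this sketch.
    return logging.getLogger("myservice")
```

The trade-off is that basicConfig() sets up a single handler, so the multiple handlers I had under fileConfig() would need to be added by hand with addHandler() if they are still wanted.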

I'm pretty steamed with the logging module for wasting my afternoon. (I'm actually grateful to have such a wonderful module in the standard library at all.)

12 August 2008

Backups Reorganization pt. 8: Oracle

I was saving the Oracle database backup for last because I felt that it would be the most difficult to do right. As it turns out, the client's chief Oracle user and DBA has already done most of the work. He is dumping the databases that he cares about to a server and directory which I am already backing up. I need to ask him to try to do a test restore though. That's pretty key.

In the meantime, I learned about another server that needs to be backed up but isn't. The only trick is that this particular server is behind a corporate firewall, so my backup servers can't initiate contact. I'm going to have to have the target server initiate a push of data.

Ultimately, that's the model that I want for all backups. That is, a client resides on the target machine, polls a central server for instructions and then, if allowed, makes the connection to the backup server and sends the data.
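
A rough sketch of one polling cycle under that model might look like the following. Nothing here is built yet: the function name, the shape of the instructions dictionary, and the two callables are all hypothetical stand-ins for real network code.

```python
def run_backup_cycle(fetch_instructions, send_data):
    """Ask the central server what to do, then push data if allowed.

    fetch_instructions: callable returning a dict such as
        {"allowed": True, "backup_host": "...", "paths": [...]}
    send_data: callable(host, paths) that performs the transfer.
    Both are hypothetical placeholders for this sketch.
    """
    instructions = fetch_instructions()
    if not instructions.get("allowed"):
        # The central server said no (or said nothing); send nothing.
        return False
    send_data(instructions["backup_host"], instructions["paths"])
    return True
```

Because the target machine opens every connection itself, the same client works whether or not the machine sits behind a firewall.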

11 August 2008

Backups Reorganization pt. 7: Zope

The client has three versions of Zope running on three servers. I took the work that I did to discover and dump the Subversion repositories and retrofitted that for Zope databases. It wasn't too difficult and I simply extended an existing class in the package that I am building. Writing the test cases took the most time, as is often the case.

The only thing that was a little different was that I needed a configuration file to hold the settings that differ among the three Zope instances. This is easy to do with the ConfigParser class in the Python standard library.
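
Here is a small sketch of the idea. The section names, option names, and paths are invented for illustration and are not the client's real layout. It uses the Python 3 spelling of the module, configparser (in Python 2 the module is named ConfigParser).

```python
import configparser

# Hypothetical per-instance settings; [DEFAULT] holds the shared values.
SAMPLE = """
[DEFAULT]
dump_dir = /var/backups/zope

[zope-a]
data_fs = /opt/zope-a/var/Data.fs

[zope-b]
data_fs = /opt/zope-b/var/Data.fs

[zope-c]
data_fs = /opt/zope-c/var/Data.fs
dump_dir = /srv/backups/zope-c
"""


def load_instances(text):
    """Return {section name: options dict}, with DEFAULT values merged in."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}
```

Each instance inherits dump_dir from the [DEFAULT] section unless it overrides it, which keeps the non-shared bits small.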

05 August 2008

CCP4 on RHEL5

CCP4 is an open-source, scientific application used in crystallography. It is written in C++ and Fortran and runs best on Linux.

A client needed help today in compiling this application on Red Hat Enterprise Linux (RHEL) 5. This blog posting is intended to provide a brief outline of getting CCP4 running on that platform.

Here's what not to try:
  1. download source
  2. use yum to get all of your prerequisites such as an old gcc compiler for C++ and Fortran
  3. source the includes/ccp4.setup file after editing some of its variables
  4. set the CC and CXX environment variables to /usr/bin/gcc34
  5. ./configure linux
  6. make
That is guaranteed to fail. It blows up at the point that it is compiling the MMDB for the Clipper subcomponent. Interestingly, telling the configure script to disable clipper has no effect.

One requirement of CCP4 is that it needs Tcl, Tk, and BLT. Tcl and Tk are not well supported in RHEL5 and BLT is non-existent. By "not well supported" I mean that the tk-devel and tcl-devel rpm packages are not available from the main Red Hat repositories.

To get around this show-stopper, I turned to my CentOS 5.2 box. CentOS is a clone of RHEL. It has a few packages that RHEL does not, notably the Tcl and Tk libraries and header files.

Here's the recipe of what actually worked:

Build Tk/Tcl/BLT
  1. on a CentOS 5 box, use yum to install tk-devel and tcl-devel
  2. download the Tk/Tcl/BLT tarball from the CCP4 site linked above
  3. unpack the tarball and cd into it
  4. run ./configure (perhaps with --prefix=_____)
  5. note carefully the directories that it says that it will write to
  6. make; make install
  7. create a tar file with the files and locations listed in the configure output
  8. copy the resultant tar file to the RHEL box
  9. extract the files from the tar file into the same directory locations and overwrite, if necessary
Install CCP4
  1. The trick is not to compile from source but, instead, to download the Linux binaries. They claim to be known to compile under the ancient RH 8/9. Grab them anyway.
  2. Unpack the binaries, edit and source the includes/ccp4.setup file, and run the BINARY.setup file.
That's it! CCP4 ought to be running on your RHEL 5 machine by this point.
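
Steps 7 through 9 of the Tk/Tcl/BLT build (bundling the installed files so they land in the same locations on the RHEL box) can be sketched in Python with the standard tarfile module. I did this by hand with tar; the function name below is hypothetical, and the paths to pass in are whatever your own configure output lists.

```python
import os
import tarfile


def bundle_paths(paths, archive_path):
    """Write files or directories into a gzip-compressed tar file.

    Each entry is stored under its absolute path minus the leading
    separator, so extracting at / restores the original locations.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in paths:
            full = os.path.abspath(path)
            tar.add(full, arcname=full.lstrip(os.sep))
    return archive_path
```

On the RHEL box, extracting the archive with tar -xzf and -C / puts everything back in the same directory locations.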

01 August 2008

Backups Reorganization pt. 6: sysadmin slip-up

Earlier in this project, I described how I condensed the many Logical Volumes into one big one for each server performing remote backups. As part of that process I had to not only carefully rework the LVs but I also had to alter the /etc/fstab file for each partition that was removed. I knew that if I left a partition listed in the /etc/fstab file that didn't actually exist then the machine probably would not come back up after a reboot.

Well, that's exactly what happened with one of the four machines. I decided to reboot each machine since each had over 510 days of uptime. One did not come back online. I had to travel on-site to the secured location to gain access. (Gaining physical access was tricky, as it should be. You don't want to let just anyone walk into your data center!)

Once I got a monitor and keyboard on the target server I saw that it had indeed tried to run fsck against a partition which was not mounted because it no longer existed. It was waiting for user input to go into maintenance mode. The fix was easy -- simply remove the offending line from /etc/fstab.
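
A quick check before rebooting would have saved me the trip. The sketch below is something I could have run first, not something I actually used: it scans fstab text and reports /dev/ entries whose device node is missing. The function name is made up, and entries given as LABEL= or UUID= are deliberately skipped.

```python
import os


def missing_fstab_devices(fstab_text, exists=os.path.exists):
    """Return /dev/... devices listed in fstab text that do not exist."""
    missing = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        device = line.split()[0]
        if device.startswith("/dev/") and not exists(device):
            missing.append(device)
    return missing
```

Calling it on the contents of /etc/fstab and getting back anything other than an empty list is a sign to fix the file before the reboot.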