15 October 2009

Shibboleth2 on Ubuntu 9.04

Here are my notes on how I got Shibboleth2 compiled from source on Ubuntu 9.04. YMMV.

Adapted from https://spaces.internet2.edu/display/SHIB2/NativeSPLinuxSourceBuild
  1. apt-get install wget
  2. apt-get install build-essential
  3. apt-get install apache2
  4. apt-get install libxerces-c28 libxerces-c2-dev
  5. apt-get install libxml-security-c14 libxml-security-c-dev
  6. apt-get install libcurl4-openssl-dev
  7. apt-get install libxmltooling1 libxmltooling-dev
  8. apt-get install libsaml2 libsaml2-dev
  9. download log4shib source from http://shibboleth.internet2.edu/downloads/log4shib/latest/
  10. ./configure --disable-static --disable-doxygen --prefix=/opt/shibboleth-sp
  11. make
  12. make install
  13. download XMLTooling-C source from http://shibboleth.internet2.edu/downloads/opensaml/cpp/latest/
  14. ./configure --with-log4shib=/opt/shibboleth-sp --prefix=/opt/shibboleth-sp -C
  15. make
  16. make install
  17. download OpenSAML-C source from http://shibboleth.internet2.edu/downloads/opensaml/cpp/latest/
  18. ./configure --with-log4shib=/opt/shibboleth-sp --prefix=/opt/shibboleth-sp -C
  19. make
  20. make install
  21. download shibboleth2 source from http://shibboleth.internet2.edu/downloads/shibboleth/cppsp/latest/
  22. ./configure --with-log4shib=/opt/shibboleth-sp --prefix=/opt/shibboleth-sp
  23. make
  24. make install

04 August 2009

Installing ESX 4 from USB Flash

The target server had only a CD drive, and all I had was a DVD .iso file of about 835 MB. I discovered that the server's BIOS would let me boot from a USB flash drive, so I used unetbootin to write the .iso file to my USB drive. Next, and this is the important part, I booted from the USB drive and pressed the Tab key when presented with the boot options for ESX. That brings up the standard line of options passed to the kernel, and I appended askmedia to that line. The askmedia option lets you specify that the install media is hosted elsewhere, for example on HTTP, FTP, NFS, or USB. I selected USB when prompted, and the rest of the install proceeded automatically with no problems.

03 August 2009

stackoverflow.com is the best thing since sliced bread

I love stackoverflow.com. If you do any programming at all, you must check it out. I had a really tough programming bug last Friday and posted a question on stackoverflow, tagged so that it would be classified properly. Within an hour I had two excellent responses, one of which was from Python luminary Alex Martelli. (His answer turned out to be correct; no surprise there.)

A feature that I particularly like about stackoverflow is that I can use my OpenID credentials to authenticate. Beyond that, I can link my accounts to their sister site at serverfault.com.

These sites are now the first places I look for answers, even before Google! Well done!

13 July 2009

IPTables protection against brute SSH attacks

One annoying thing I see in the logs of my servers whose SSH port is not restricted is a steady stream of brute-force attacks, every single day. For various reasons, some of the servers that I administer have to keep their SSH ports wide open. I found two sites today that show how to use the iptables "recent" module to slow down those brute-force attacks. It works great!
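For reference, here is a rough sketch of the kind of rule pair that the "recent" module approach usually involves. It is not taken from the two sites mentioned above; the port, the 60-second window, the hit count of 4, and the list name SSH are all illustrative assumptions. The Python helper (a hypothetical name) only builds and prints the iptables commands, it does not run them.

```python
def ssh_rate_limit_rules(port=22, seconds=60, hitcount=4, list_name="SSH"):
    """Build two iptables commands (as argument lists) that rate-limit new SSH connections.

    All default values are illustrative assumptions, not recommendations.
    """
    # Both rules match new TCP connections to the SSH port and use the same "recent" list.
    common = [
        "iptables", "-A", "INPUT", "-p", "tcp", "--dport", str(port),
        "-m", "state", "--state", "NEW",
        "-m", "recent", "--name", list_name,
    ]
    # Rule 1: record the source address of every new connection attempt.
    record = common + ["--set"]
    # Rule 2: drop the packet if that address has made `hitcount` or more
    # attempts within the last `seconds` seconds.
    drop = common + [
        "--update", "--seconds", str(seconds), "--hitcount", str(hitcount),
        "-j", "DROP",
    ]
    return [record, drop]


for rule in ssh_rate_limit_rules():
    print(" ".join(rule))
```

The printed commands would need to be run as root, and the rule order matters: the address has to be recorded before the hit count is checked.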

11 February 2009

Python multiprocessing vs. threading performance

I recently wrote an application using the threading module in the Python standard library. The application itself was basically attempting to discover Open Reading Frames (ORFs) in a DNA sequence. The application appeared to be mostly CPU bound.

Running the application in a single thread took about 6 seconds for my test data. Running it continuously over 3 threads took about 30 seconds per run! The more threads added, the slower it ran on average. That is actually what I expected because of Python's Global Interpreter Lock (GIL). I decided to look at the multiprocessing module to see if I could get the average run time back down to 6 seconds.
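A minimal sketch of this kind of comparison is below. It is not the ORF application itself: cpu_bound is a hypothetical stand-in that just does pure-Python arithmetic, and the worker count and loop size are arbitrary. Because the GIL lets only one thread execute Python bytecode at a time, the threaded version should not be expected to run faster than doing the same work serially, while separate processes can each use their own core.

```python
import multiprocessing
import threading
import time


def cpu_bound(n):
    """Hypothetical stand-in for the ORF search: pure-Python arithmetic."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_threads(workers, n):
    """Run cpu_bound in `workers` threads; return elapsed wall-clock seconds."""
    threads = [threading.Thread(target=cpu_bound, args=(n,)) for _ in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def run_processes(workers, n):
    """Run cpu_bound in `workers` processes; return elapsed wall-clock seconds."""
    procs = [multiprocessing.Process(target=cpu_bound, args=(n,)) for _ in range(workers)]
    start = time.perf_counter()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 1_000_000  # arbitrary workload size
    print("3 threads:   %.2f s" % run_threads(3, n))
    print("3 processes: %.2f s" % run_processes(3, n))
```

The actual numbers will depend on the machine, the Python version, and the number of cores available.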

Here are the results of running the same data over 3 threads using the threading module versus the same setup with 3 processes using the multiprocessing module.

Threading Data

Thread-3 took 22.3910000324 seconds
Thread-1 took 23.2190001011 seconds
Thread-2 took 38.8129999638 seconds
Thread-3 took 24.7969999313 seconds
Thread-1 took 26.375 seconds
Thread-2 took 35.2030000687 seconds
Thread-3 took 30.1089999676 seconds
Thread-1 took 29.375 seconds
Thread-1 took 24.109000206 seconds
Thread-3 took 26.5 seconds
Thread-2 took 36.0160000324 seconds
Thread-1 took 29.390999794 seconds
Thread-3 took 30.6720001698 seconds
Thread-2 took 32.5779998302 seconds
Thread-1 took 31.25 seconds
Thread-3 took 30.8439998627 seconds
Thread-2 took 32.0150001049 seconds
Thread-1 took 30.9220001698 seconds
Thread-3 took 30.6089999676 seconds
Thread-2 took 23.125 seconds

AVERAGE = 29.4 seconds


Multiprocessing Data

OrfDetection-2 took 6.65599989891 seconds
OrfDetection-1 took 12.4379999638 seconds
OrfDetection-3 took 12.4530000687 seconds
OrfDetection-2 took 6.43799996376 seconds
OrfDetection-2 took 6.375 seconds
OrfDetection-1 took 12.3589999676 seconds
OrfDetection-3 took 12.4070000648 seconds
OrfDetection-2 took 6.39099979401 seconds
OrfDetection-2 took 6.35900020599 seconds
OrfDetection-1 took 12.3280000687 seconds
OrfDetection-3 took 12.4059998989 seconds
OrfDetection-2 took 6.45399999619 seconds
OrfDetection-2 took 6.3900001049 seconds
OrfDetection-1 took 12.25 seconds
OrfDetection-3 took 12.2660000324 seconds
OrfDetection-2 took 6.43799996376 seconds
OrfDetection-2 took 6.42199993134 seconds
OrfDetection-1 took 12.3439998627 seconds
OrfDetection-3 took 12.2650001049 seconds
OrfDetection-2 took 6.15600013733 seconds

AVERAGE = 9.4 seconds

Besides the faster average run times, one other difference between the two implementations was that the application using threading tended to run at about 40% of CPU, whereas the one using multiprocessing ran at 100% of CPU (each Python process took about 33%).

I just now noticed that in the multiprocessing implementation, OrfDetection-2 always took around 6 seconds whereas OrfDetection-1 and OrfDetection-3 always took around 12 seconds or twice as long. Hmmmm. Wonder what that means. I'll have to investigate that further. I expected each to run in around 6 seconds.
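As a starting point for that investigation, here is a small sketch that groups the multiprocessing timings above by process name and reports the run count and mean for each. The per_worker_stats helper is a hypothetical name; the log lines are copied from the output above.

```python
import re
from collections import defaultdict

LOG = """\
OrfDetection-2 took 6.65599989891 seconds
OrfDetection-1 took 12.4379999638 seconds
OrfDetection-3 took 12.4530000687 seconds
OrfDetection-2 took 6.43799996376 seconds
OrfDetection-2 took 6.375 seconds
OrfDetection-1 took 12.3589999676 seconds
OrfDetection-3 took 12.4070000648 seconds
OrfDetection-2 took 6.39099979401 seconds
OrfDetection-2 took 6.35900020599 seconds
OrfDetection-1 took 12.3280000687 seconds
OrfDetection-3 took 12.4059998989 seconds
OrfDetection-2 took 6.45399999619 seconds
OrfDetection-2 took 6.3900001049 seconds
OrfDetection-1 took 12.25 seconds
OrfDetection-3 took 12.2660000324 seconds
OrfDetection-2 took 6.43799996376 seconds
OrfDetection-2 took 6.42199993134 seconds
OrfDetection-1 took 12.3439998627 seconds
OrfDetection-3 took 12.2650001049 seconds
OrfDetection-2 took 6.15600013733 seconds
"""


def per_worker_stats(log):
    """Return {worker_name: (run_count, mean_seconds)} parsed from the log text."""
    times = defaultdict(list)
    pattern = r"^(\S+) took ([\d.]+) seconds$"
    for match in re.finditer(pattern, log, re.MULTILINE):
        times[match.group(1)].append(float(match.group(2)))
    return {name: (len(vals), sum(vals) / len(vals)) for name, vals in times.items()}


for name, (count, mean) in sorted(per_worker_stats(LOG).items()):
    print("%s: %d runs, mean %.2f s" % (name, count, mean))
```

Grouped this way, OrfDetection-2 completed 10 runs averaging about 6.4 seconds, while OrfDetection-1 and OrfDetection-3 completed 5 runs each averaging about 12.3 seconds.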

09 January 2009

Database Trigger vs. ActiveRecord Callback

Out of curiosity, I wanted to compare the performance of an ActiveRecord "after_save" callback versus a PostgreSQL "AFTER INSERT" trigger.

I prototyped the functionality in Rails. I got the logic clean and simple. Next, I ported that logic to PL/pgSQL. I wrote a few time-related functions to keep the code clean but it was identical in flow and logic to the Rails code.

Lastly, I ran the callback and trigger forms of the business logic. The measurement here is the time it took for the POST action to complete. I know that it is imperfect but, in this case, it's a good proxy because, ultimately, the point of this is to enhance responsiveness to the end-user. All other things are equal except for how this one chunk of business logic is implemented.
  • ActiveRecord callback - 181750 ms
  • Database trigger - 93729 ms
So, that works out to roughly a 48% decrease in execution time of the Ruby on Rails action when implemented with a database trigger, or about 1.9 times as fast. Honestly, for no particular reason I expected the trigger form of the logic to blow away the ActiveRecord form. Nearly halving the time is a big win, but I was unrealistically expecting something much faster (illogical, I know).
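The arithmetic behind the comparison can be checked with a short sketch. It computes both the relative decrease in elapsed time and the speedup ratio from the two measurements listed above; the function names are just illustrative.

```python
def percent_decrease(before, after):
    """Relative reduction in elapsed time, as a percentage of the original time."""
    return (before - after) / before * 100.0


def speedup(before, after):
    """How many times faster the new version is than the old one."""
    return before / after


callback_ms = 181750  # ActiveRecord callback measurement
trigger_ms = 93729    # database trigger measurement

print("decrease: %.1f%%" % percent_decrease(callback_ms, trigger_ms))  # decrease: 48.4%
print("speedup:  %.2fx" % speedup(callback_ms, trigger_ms))            # speedup:  1.94x
```

The two figures describe the same result from different angles: a run that takes a bit more than half the time is a bit less than twice as fast.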