Tuesday, April 29, 2008

In Which I Reconnect With Ruby Just in Time to Finish My Master's Degree

The fates have conspired to force me to end my graduate school career with a bang. Fortunately, the bang is not the sound of my head exploding, as I feared it would be at the beginning of this semester. I'm taking two classes -- Theory of Complexity (about P vs. NP, BPP, #P, complexity of quantum computing...you know the drill; if you don't know the drill, try The Complexity Petting Zoo for an introduction) and Applied Graph Theory. Each is fascinating in its own right, but the graph theory class has really made an impression on me. So much so that I think if I'd taken it earlier on, I probably would have been tempted to do a thesis in that area. As it is, I'm stuck with my lame default choice of the "coursework-only" option for the Master's degree I'll be finishing come mid-May (the 17th, just in case anyone wants to send a present).

However, to my credit, I did manage to complete a significant semester-long project in graph theory, using Ruby as the implementation language. Why Ruby, and not Java (which I use professionally right now) or Scala, or any of the other possibilities I blather about on this blog?

Well, if you're a seasoned Ruby user you'll know why -- there's not much else (that I'm accustomed to) that's suited to the kind of rapid development/iterability I needed to get this thing done on time. My primary concerns throughout were a) to be able to complete the implementation with enough time left over to write up my results and prepare my in-class presentation, and 2) see concern a) above. Sure, Java's static typing or Scala's functional orientation might have given me some (nominally) safer code, but here, speed of development was of the essence. Advantage: Ruby.

The actual project? It's an implementation of an algorithm due to Hermann Stamm-Wilbrandt of the University of Bonn (Google is your friend...) to draw a maximal planar graph in linear time. Actually, the intent of the core algorithm is to "embed", not "draw" the graph; although I also implemented an optional enhancement to assign x-y coordinates to the vertices of the embedded graph for drawing purposes. Using Ruby, I was able to implement the entire project in less than 500 lines -- including blank lines, comments, debug, and other cruft -- of gloriously unoptimized, un-statically-typed, un-refactored, un-IDE'd-to-within-an-inch-of-its-life Ruby code.

I shudder to think what the LOC count would have been for Java.

Even in Ruby, the bulk of the code was either utilitarian in nature (e.g., a Line class and a Triangle class to help with the geometry of deciding where vertices should lie in the plane; some vector math for seeing if two points are on the same side of a line), or was wrapped up in simple one-to-two-liner convenience methods to support the main objective. This stuff accounted for about 120 lines. And the <500 lines of code also includes my from-scratch graph data structure (typically of the adjacency list variety), along with about 35 lines of code to emit the output in displayable format. The guts of the actual implementation of Stamm-Wilbrandt's embedding algorithm took about 130 lines of Ruby code (including blanks and comments), plus another 70 or so to handle a complicated special case of the optional x-y coordinate assignment.

My only regret is that I don't know Python well enough to have done the entire implementation in that language. I say that primarily because I used the totally awesome NodeBox as my drawing solution. NodeBox is written using PyObjC (Mac only -- sorry Windows/*n[i|u]x users!), and uses Python as its scripting language. My Ruby implementation actually emits a Python script which I then open with NodeBox to draw the final graph. If it were implemented in Python, then it could just be a NodeBox library.

The Good News: Now I have an excellent excuse to learn Python...