OOPSLA 2007 – Languages Gone Wild

It has come to my attention that my writing here has been a bit boring. Dry. Stuffy. Well, Kevin Bourrillion has shown me the way. Time to liven it up a bit. Just a bit, mind you. Juuuuuuuuust a bit.

And please don’t give me trouble over my inability to post more than once a week. Really all I can say is ARRRRGH! Work, upcoming conferences, and raising two very young kids = JEEZ WHEN DO PEOPLE EVER GET TIME TO BLOG? And while we’re on that topic, I don’t see you blogging enough either, do I? Don’t be throwing any stones, Mr. Glass House.

OK, where were we? Ah yes, OOPSLA 2007. What a beautiful conference. Not that I attended or anything, but who needs to attend a conference when you have the Internet?

The dynamic vs. static languages flame war has bogged down the language community over the last decade. It’s great to see that, judging by some recent papers from the latest OOPSLA, the logjam has broken with a vengeance. There are all kinds of brain-bending new languages in the works, and it’s frankly exhilarating. (Sorry that some of these are only available through the ACM’s Digital Library… but at $99 per year, how can you afford NOT to be a member???)

First we have a great paper on RPython, a project which creates a bimodal version of Python. You can run your Python program in a fully interpreted way in a startup metaprogramming phase, and then the RPython compiler kicks in, infers static types for your entire metaprogrammed Python code base, and compiles to the CLR or to the JVM with a modest performance increase of EIGHT HUNDRED TIMES. Yes, folks, they’ve run benchmarks of Python apps that run 800x faster with their system than with IronPython (which is no slouch of a Python implementation AFAIK). If that isn’t a great example of how you can have your dynamic cake and eat it statically, I don’t know what is.

Another lovely system is OMeta, a pattern language for describing grammars. You can write everything from a tokenizer up to a full language in a really nice executable grammar structure, with productions that map directly to some underlying base language. They also have a good modularity story worked out, and support for stateful parsing. Their JavaScript parser is 177 lines long, and that’s not much!
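To give a flavor of what “executable grammar” means, here is a minimal sketch in plain Python. It is NOT OMeta’s syntax or implementation: the helper names (token, seq, choice, many, action) and the tiny arithmetic grammar are made up for illustration. The only point is that each production is directly runnable and carries an action that maps the match to a value in the host language.

```python
import operator
import re

# A rule is a function (text, pos) -> (value, new_pos), or None on failure.
# All helper names here are hypothetical; none of this is OMeta's API.

def token(pattern):
    """Match a regex at the current position, skipping leading whitespace."""
    rx = re.compile(r"\s*(" + pattern + r")")
    def rule(text, pos):
        m = rx.match(text, pos)
        return (m.group(1), m.end()) if m else None
    return rule

def seq(*rules):
    """Match every rule in order, collecting the values into a list."""
    def rule(text, pos):
        values = []
        for r in rules:
            result = r(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return rule

def choice(*rules):
    """Ordered choice: the first alternative that matches wins."""
    def rule(text, pos):
        for r in rules:
            result = r(text, pos)
            if result is not None:
                return result
        return None
    return rule

def many(r):
    """Match r zero or more times."""
    def rule(text, pos):
        values = []
        while True:
            result = r(text, pos)
            if result is None:
                return values, pos
            value, pos = result
            values.append(value)
    return rule

def action(r, fn):
    """Attach a semantic action: transform the matched value with fn."""
    def rule(text, pos):
        result = r(text, pos)
        if result is None:
            return None
        value, pos = result
        return fn(value), pos
    return rule

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(value):
    """Evaluate [first, [[op, rhs], ...]] left to right."""
    first, rest = value
    for op, rhs in rest:
        first = OPS[op](first, rhs)
    return first

def expr(text, pos):
    # Forward reference so that atom can recurse into expr.
    return _expr(text, pos)

# The grammar itself: three productions, each with its action.
number = action(token(r"\d+"), int)
atom = choice(number,
              action(seq(token(r"\("), expr, token(r"\)")), lambda v: v[1]))
term = action(seq(atom, many(seq(token(r"[*]"), atom))), fold)
_expr = action(seq(term, many(seq(token(r"[+-]"), term))), fold)

def parse(text):
    result = expr(text, 0)
    if result is None or text[result[1]:].strip():
        raise SyntaxError("cannot parse: " + repr(text))
    return result[0]

print(parse("1 + 2 * 3"))    # 7
print(parse("(1 + 2) * 3"))  # 9
```

The same rule shape handles the tokenizer level (token) and the language level (term, expr), which is the property the paragraph above is excited about. Modularity and stateful parsing are not attempted here.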

Then there’s an equally great paper on JastAdd, an extensible compiler for Java. The JastAdd compiler is built around an extensible abstract syntax tree. The abstract syntax tree is the ONLY data structure in the compiler — there are no separate symbol tables or binding tables; everything is implemented as extensions to the abstract syntax tree. The extensions are expressed with a declarative language that lets you define dataflow equations relating different elements in the tree — inherited elements (for referring to names bound in a parent construct, for example), or reference elements (for referring to remote type declarations, for example).

The compiler has an equation analysis engine that can process all these equations until it reaches a fixpoint, which completely avoids all the usual multi-phase scheduling hassles in compilers around interleaving type analysis with type inference, etc. It seems like The Right Thing on a number of levels, and it makes me want to hack around with building a compiler along similar declarative lines. They give examples of extending Java with non-null types and of implementing Java 5 generics purely as a declarative compiler extension. That, to me, pretty much proves their point. Bodacious! I had been thinking that executable grammars were a nice way to go, but seeing their declarative framework’s power is seriously making me reconsider that idea. What would you get if you combined OMeta and JastAdd? Something beautiful. I’m not sure how you’d combine the statefulness of OMeta with the declarativeness of JastAdd, but we must ponder it deeply, because the One True AST is a goal worth seeking.
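Here is a rough sketch of the “equations on the AST, solved to a fixpoint” idea, in plain Python. To be loud about the assumptions: this is not JastAdd’s specification language or its evaluation strategy. The node classes, the toy “let” language, and the naive re-evaluate-everything loop in solve() are all invented for illustration. What it does show is an inherited attribute (lookup), a reference attribute (decl), and type information living on the tree itself with no separate symbol table.

```python
UNKNOWN = "?"  # starting value: nothing is known about this node yet

class Node:
    def __init__(self, *children):
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self
        self.type = UNKNOWN  # attribute stored directly on the AST node

    def lookup(self, name):
        """Inherited attribute: ask the enclosing construct to resolve a name."""
        return self.parent.lookup(name) if self.parent else None

    def type_equation(self):
        """Equation defining this node's type in terms of other attributes."""
        return UNKNOWN

    def walk(self):
        yield self
        for child in self.children:
            yield from child.walk()

class Lit(Node):
    def __init__(self, value):
        super().__init__()
        self.value = value

    def type_equation(self):
        return type(self.value).__name__  # "int" or "str"

class Ref(Node):
    def __init__(self, name):
        super().__init__()
        self.name = name

    def decl(self):
        """Reference attribute: the declaring node, elsewhere in the tree."""
        return self.lookup(self.name)

    def type_equation(self):
        decl = self.decl()
        return decl.type if decl is not None else "error"

class Add(Node):
    def type_equation(self):
        left, right = (c.type for c in self.children)
        if UNKNOWN in (left, right):
            return UNKNOWN
        return left if left == right else "error"

class Let(Node):
    def __init__(self, name, init):
        super().__init__(init)
        self.name = name

    def type_equation(self):
        return self.children[0].type

class Block(Node):
    def lookup(self, name):
        # Declarations are visible anywhere in the block, in any order.
        for child in self.children:
            if isinstance(child, Let) and child.name == name:
                return child
        return super().lookup(name)

def solve(root):
    """Re-evaluate every equation until no attribute changes (a fixpoint)."""
    changed = True
    while changed:
        changed = False
        for node in root.walk():
            new = node.type_equation()
            if new != node.type:
                node.type, changed = new, True

program = Block(
    Let("total", Add(Ref("price"), Ref("tax"))),  # uses names declared below
    Let("price", Lit(10)),
    Let("tax", Ref("price")),
    Let("label", Lit("sum")),
)
solve(program)
types = {let.name: let.type for let in program.children}
print(types)  # {'total': 'int', 'price': 'int', 'tax': 'int', 'label': 'str'}
```

Notice that “total” refers to names declared after it, and nobody had to schedule a “collect declarations” pass before a “check types” pass. The loop just keeps going until the answers stop changing. A real engine would presumably evaluate on demand rather than sweeping the whole tree each round.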

A truly mind-bending paper discusses breaking the two-level barrier. What’s the two-level barrier? Simple: it’s the class/object distinction. They point out that many kinds of modeling can’t be done with a class hierarchy. What you really want is a programmer-accessible metaclass hierarchy. (And not a weenie one like Smalltalk’s, either.) For example, consider an online store. You thought you knew everything about online stores? THINK AGAIN, JACKSON. Let’s say you have a DVD for sale, such as Titanic. That Titanic DVD is an instance of the DVD product class. The DVD product class is conceptually a further instance of the DigitalMedia product class. I meant exactly what I said there — in their framework, one class can be an instance of another class.

You can then state that the DigitalMedia metaclass defines a “categoryName” and a “net price”, requiring that “categoryName” be defined by instances of DigitalMedia, and that “net price” be defined by instances of instances of DigitalMedia. The DVD class then defines “categoryName” to be “DVD”, so ALL DVDs have the same category name. And then particular instances of DVD define their actual net prices individually. In this way, you can take the same kinds of “must provide a value for this field” constraints that exist in the class-object hierarchy, and extend them to multiple levels, where grandparent metaclasses can express requirements of their meta-grandchild instances.
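You can approximate this particular example with Python metaclasses, and a small sketch may help make the levels concrete. The caveats: this is not the paper’s framework, just ordinary Python; the attribute names category_name and net_price are Python-style stand-ins for the fields described above; and the 9.99 price is made up.

```python
class DigitalMedia(type):
    """Metaclass: its instances are product classes such as DVD."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Requirement on instances of DigitalMedia (which are classes):
        # each one must define category_name.
        if not isinstance(getattr(cls, "category_name", None), str):
            raise TypeError(f"{name} must define category_name")

    def __call__(cls, *args, **kwargs):
        # Requirement on instances of instances of DigitalMedia:
        # each object must end up with a net_price.
        obj = super().__call__(*args, **kwargs)
        if not hasattr(obj, "net_price"):
            raise TypeError(f"{cls.__name__} instances must define net_price")
        return obj


class DVD(metaclass=DigitalMedia):
    category_name = "DVD"  # shared by ALL DVDs

    def __init__(self, title, net_price):
        self.title = title
        self.net_price = net_price  # set per individual DVD


titanic = DVD("Titanic", 9.99)  # illustrative price

print(isinstance(titanic, DVD))       # True
print(isinstance(DVD, DigitalMedia))  # True
print(titanic.category_name)          # DVD
```

Defining a DigitalMedia class without category_name, or constructing an object that never sets net_price, raises TypeError. So the metaclass is stating requirements two levels down, which is the effect described above, although Python makes you spell out each level by hand.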

(They use the abysmal word “clabject” — ClAss obJECT — to refer to entities that can be used to instantiate objects (like classes), but that ARE themselves instantiated (like objects). I think “clobject” would have been better, or maybe “obclass” or something. “Clabject” just sounds… I don’t know… disturbing. Like some kind of hapless crab that’s filled with techno-malice. But the concept itself is very interesting. I think that having two orthogonal hierarchies — the metaclass hierarchy and the subclass hierarchy — is potentially too confusing for most programmers, including me, but it’s nonetheless really thought-provoking.)

Those are just four of the highlights (I’m only about a third of the way through reading the OOPSLA papers this year), but I think those are the top four so far when it comes to language design. It’s going to be a great next decade, as the whole static vs. dynamic war gives way to a myriad of bizarre hybrids and mutants, greatly enhancing (and confounding) the lives of hackers everywhere!

Written by robjellinghaus

2007/10/30 at 03:39

Posted in programming languages, research

robjsoftware.info