robjsoftware.info

A blog about software – researching it, developing it, and contemplating its future.

Archive for 2008

The Five Stages of Programming

Programming is an interesting job, because it goes in continual cycles. Each part of the cycle has its own emotion that goes with it, too.

When starting off a new project, there’s a learning curve that goes with it. You’re spinning up, reading code, reading technical papers, trying to figure out what the hell you’re going to do. The main emotion here is puzzlement — how is this thing going to work? What’s the interface? What’s the feature set? What the heck is going ON?

After that comes early implementation. In this phase, the main emotion is nervousness. You think you know how it’s going to work, but there’s nothing really there yet. So you’re hacking madly away, trying to get enough of a skeleton in place that you can start to make it dance. Forget about getting flesh on the bones, you’re just trying to come up with something that can stand up! Since you don’t really know what you’re doing yet, it could all still fall apart on you.

Once you’re out of those woods, you’re into late implementation. Here, the main emotion is adrenalin. You’re charging on all cylinders, driving at full throttle. The bones are rapidly becoming enfleshinated, and you’re in the zone. This is in some ways the most satisfying part of the whole cycle, because now you start to see some real results from what you’ve been working on.

The last phase is debugging. Here, the emotion swings wildly between frustration and relief. You’re almost done… except you’re not! There’s a bug! Fix it, fast! OK… and on to the next test… and WHAMO, another weird bug! Grrrr. OK, got that one done… YES! IT WORKS!!! Ship it!

And then the whole cycle starts over again.

So that’s my job: puzzlement, nervousness, adrenalin, frustration, and relief. Of course sometimes you take a few steps back. For example, right now I made it all the way to relief, but I’m about to backslide into nervousness. The best-case scenario, though, is when you make it to relief and then you can keep building on the code you just finished… then you have a kind of secure happy foundation under you, reassuring you that even if your current layer falls to bits in a welter of recrimination, at least you know the relief — that fantastic sense of accomplishment that comes with writing a software machine from thin air, that has real value and usefulness — is still out there, in the future, waiting for you.

That’s what software is, to me: the promise of progress, of building on what’s come before, making it better. And this emotional cycle is what it takes to make that happen. So I’ll close with a word that sums it all up for me:

Onwards!

Written by robjellinghaus

2008/12/30 at 05:03

Posted in Uncategorized

Yow! MGrammar ahoy

I’ve ranted about grammarware here in the past, but now I’m actually using some, and it’s rocking. I’m talking about MGrammar, part of the impending Oslo.

MGrammar is a toolset for writing grammars and working with them. The MGrammar grammar description language lets you write what is basically an LALR(1) grammar, which means it’s about as powerful as YACC. Or it would be, if it weren’t also a GLR parser, which means that when your language is ambiguous you get all the possible alternatives. This is a nice way of sidestepping the fact that detecting ambiguity in a context-free grammar is undecidable in general.
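To make "you get all the possible alternatives" concrete, here is a minimal Python sketch. It is not GLR and not MGrammar; it is a brute-force enumeration over token spans for one deliberately ambiguous toy grammar, `E -> E '-' E | NUM`. The function name `all_parses` and the tuple-based tree shape are assumptions made up for this illustration.

```python
from functools import lru_cache


def all_parses(tokens):
    """Return every parse tree of `tokens` under E -> E '-' E | NUM."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def parse(lo, hi):
        # Collect all trees that span exactly tokens[lo:hi].
        trees = []
        if hi - lo == 1 and tokens[lo].isdigit():
            trees.append(tokens[lo])
        # Try every '-' in the span as the top-level operator.
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] == "-":
                for left in parse(lo, mid):
                    for right in parse(mid + 1, hi):
                        trees.append((left, "-", right))
        return tuple(trees)

    return list(parse(0, len(tokens)))


for tree in all_parses("1 - 2 - 3".split()):
    print(tree)
```

The input `1 - 2 - 3` comes back with two trees, one grouping as `1 - (2 - 3)` and one as `(1 - 2) - 3`. A parser that reports both lets you see the ambiguity on a concrete sample instead of having to prove things about the grammar as a whole.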

MGrammar also separates syntax (LALR(1)) from tokenizing (regular). This is a win, since sometimes tokenizing is all you need.
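As a rough illustration of the "tokenizing is all you need" case, here is a small regular-expression tokenizer in plain Python. This is not MGrammar syntax; the token names and the `tokenize` helper are hypothetical, chosen only for this sketch.

```python
import re

# Each token class is a regular expression; no grammar rules are involved.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Split `text` into (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x = 40 + 2"))
```

For jobs like syntax coloring or simple log scraping, a flat token stream like this is often enough, and a separate syntax layer can be stacked on top only when structure is needed.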

The coolest thing about MGrammar is the grammar development tool. It’s a three-paned window, with sample text on the left, the grammar in the middle, and the parse tree on the right. You can change the sample or the grammar at will, and it reparses with every keystroke. When your grammar finds a parse error or ambiguity in the sample text, the sample text gets a red underline with Intellisense. When your grammar itself has an error in it, you also get a red underline with Intellisense, which is reasonable because there is in fact an MGrammar description of MGrammar itself, which drives the Intellisense.

It works very well in practice and makes grammar writing so much more productive it’s not even funny. Using a lot of negation can make it go nonlinear, but it’s manageable.

On a sad note, Gilad and his merry band have lost their funding. This really sucks, as Newspeak is one of the most interesting language efforts around. I very much hope a community takes root around it and drives it. (If it can happen to Factor, of all things, it can happen to Newspeak.)

And finally, yes, http://robjsoftware.org was broken for the last month or two. I had one email filtering rule for all of GoDaddy, and when we moved to Seattle this spring, the address change broke GoDaddy’s monthly charging, but I didn’t see the warning because it was in the same bucket with all of GoDaddy’s spam^H^H^H^Hinformative emails. Which I never read. Sigh. I clearly need a fully redundant alerting and monitoring system for my blog.

Written by robjellinghaus

2008/11/25 at 04:05

Posted in Uncategorized

Life, it is the greatest

Well, except for Blogger eating my profile picture, and GoDaddy eating my redirect from robjsoftware.org to blog.robjsoftware.org — why can’t things just Keep On Working? Entropy, I hates it.

Shortly after I griped last month about a lack of working FLINQ samples in the latest F# CTP, Don Syme himself came through nicely with just what I asked for. So yay Don! And yay F#! And boo me, because I have not done thing one on the personal-hacking front in the last month. In fact, that aspect of this blog is going to go quite dark, if current evidence is anything to go by.

Not to say I’m not still deeply digging geeky things — I’m currently reading my way through the extremely excellent Parsing Techniques, The Second Edition. My upcoming task at work is to do a whole lot of parsing stuff, and this is exactly the book I need. It’s amazing. I’ve been reading scads of parsing papers (one-stop shop for me: Bryan Ford’s Parsing Expression Grammars page), but I lacked the basic background — what exactly is LALR? How do shift-reduce parsers work? How do you convert a left-recursive grammar to a non-left-recursive grammar, and what does it do to your attributes? Well, the Parsing Techniques book is absolutely the best imaginable text for me. It’s the second edition, just published this year; the first edition was from 1990. How beautifully synchronistic that it should come out just when I absolutely vitally need it! I LOVE it when that happens.
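For the left-recursion question, the textbook rewrite for the immediate case turns `A -> A α | β` into `A -> β A'` and `A' -> α A' | ε`. Below is a minimal Python sketch of just that step; the function name and the tuple-of-symbols grammar encoding are assumptions for this example, it does not handle indirect left recursion, and it assumes there is no degenerate production `A -> A`.

```python
def eliminate_immediate_left_recursion(nonterminal, productions):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon.

    `productions` is a list of tuples of symbols; () stands for epsilon.
    """
    # The alpha parts: what follows the leading A in left-recursive rules.
    recursive = [p[1:] for p in productions if p and p[0] == nonterminal]
    # The beta parts: every rule that does not start with A.
    others = [p for p in productions if not p or p[0] != nonterminal]
    if not recursive:
        return {nonterminal: list(productions)}
    tail = nonterminal + "'"
    return {
        nonterminal: [beta + (tail,) for beta in others],
        tail: [alpha + (tail,) for alpha in recursive] + [()],
    }


# E -> E + T | T   becomes   E -> T E',  E' -> + T E' | epsilon
print(eliminate_immediate_left_recursion("E", [("E", "+", "T"), ("T",)]))
```

As for attributes: the original rule builds left-nested trees and the rewritten one builds right-nested ones, so a left-associative computation has to be threaded through `E'` (for example by passing the running value down as an inherited attribute) rather than read straight off the tree shape.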

And honestly, there are two other reasons I’m not getting much solo hacking done. One is that I’m climbing about 20 learning curves at once in my day job, and it’s saturating my technical novelty bandwidth. There’s not a lot of extra juice right now for doing yet further explorations in the evening. The other reason, and this is something I have yet to blog about here, is that it’s the fall season, and that means GAMES.

Yes, the truth is out: I’m a fairly inveterate computer/video gamer. It’s been a hobby of mine ever since I first laid eyes on a computer — literally; the first computer I ever saw was a PLATO timesharing system at my best friend’s doctor father’s medical school in Connecticut. And what was it running? Spacewar. I still remember it vividly.

Ever since then I’ve been happily gaming away, and in many ways it’s the perfect hobby for a compulsive hacker — video games push technology in a lot of ways, and modern games use cutting-edge 3D graphics, physics simulation, distributed virtual space technology, and generally a whole lot of hardcore computer science in doing what they do. So not only do the games themselves get more immersive as games, but they also get more technically interesting and intriguing to learn about. Right now I’m playing Crysis, one of the most hardware-intensive games ever made (though people debate whether that’s because it’s not well optimized or just super ambitious). I finally got my self-built PC to run two graphics cards (through NVidia SLI), and man, this game is freaking stunning on my 1920×1200 26″ monitor. (Which cost only $600! Damn, wasn’t it just two years ago that this sort of thing was $2000+?)

So I’m giving myself permission to slack off, personal-hacking-wise, for the rest of the year. Unfortunately it looks like it will still be a good long time before I can post in depth about what I’m actually working on at Microsoft, but suffice to say that I really do look forward to that, and it will happen sooner or later, and the longer it takes the more I’ll have to say when the veil finally drops. But rest assured, it’s freaking cool and you will love it when you see it 🙂

Written by robjellinghaus

2008/10/09 at 04:49

Posted in Uncategorized

Best laid plans, mice, men, etc.

Well, not a lot of F# hacking got done last month. I did download the then-current F# build, and tried out some of the FLinq samples, and they didn’t work. I posted about it on the F# mailing list (Microsoft, your mailing list server needs some serious kicks in the pants region), and there was no helpfulness forthcoming. So, onto the (cold, dark) back burner it went.

Which was fine, because ordinary life (summer vacation for the kids, my birthday, etc.) was plenty busy. And my work has exploded into a drinking-from-the-firehose geek frenzy — I wrote a monadic push parser the other day, and got paid for it. Not clear whether it’ll ever ship, but it was definitely relevant, which rocks. I can now say that I have lost my monadic virginity. Whether I should say it is another question. (My wife says not….)

However, on the bright side, the F# team did ship their CTP (Community Tech Preview, or something like that, for those outside the MAZ (Microsoft Acronym Zone)). So I’ll take another run at it later this month. One cool thing is they have support for dimensional quantities now, a la Fortress. Only F# is basically here now, and usable for production software, whereas Fortress is still N years away from having any kind of realistic compiler. So IN YOUR FACE, Guy Steele! (Seriously, Fortress looks great. It’s just that F# is here now, and is pretty great itself.)

OK, time to get back to watching a truly monster build crunch away. Work is creeping out of normal working hours — getting assigned a fairly major team-wide task tends to have that effect. It’s also cutting into my discretionary hacking and gaming time considerably. We’ll see what the next month holds….

Written by robjellinghaus

2008/09/04 at 04:33

Posted in Uncategorized

Not Dark, Just Busy

It’s amazing what a difference not commuting can make. Up until our move to Seattle in April, I’d been commuting an hour each way every day into San Francisco. That time on BART — without net access — was where all my blogging got done. Well, now I’m driving 15 minutes each way to work (modulo time to take my daughter to her school), and still coming down off the stress of moving, so I’ve been putting my energy elsewhere in the evenings. In any case, consider this a summer vacation for your RSS feed 🙂

Not that I’ve personally been vacationing — not at all! I’m finally getting up to some kind of reasonable cruising speed at work. It’s been a colossal re-education in .NET, C#, low-level Windows debugging, SQL data stack design, and about 50 different interesting research projects which I can’t really discuss — at least not yet. It’s very educational being on the list for all the different talks that happen inside Microsoft Research; there are a lot of different pies that those folks have their fingers in. The internal Windows developer talks are also very intriguing.

Now, this isn’t to say that everything is on the hush-hush. Some of the better publicly visible tidbits I’ve run across lately involve LINQ, the techniques for basically doing data-oriented metaprogramming in the .NET framework. I mentioned LINQ in my last post, but suffice to say that it’s a combination of a set of data-oriented query operators (that form a composable, monadic algebra), some syntactic extensions to allow more idiomatic SQL-like query expressions, and an internal syntax-tree representation that can be transformed into native query languages (including but not limited to SQL) and partially evaluated into high-performance serializers. Overall it’s a very well-thought-out structure with a lot of room for growth. To wit:

  • DryadLINQ extends the LINQ operators to support MapReduce-like distributed cluster computation at scale.
  • PLINQ and the ParallelFX framework are building up the .NET infrastructure for fine-grained parallelism and efficient use of multi-core for data-parallel problems.
  • Matt Warren’s series on implementing a LINQ query provider is a great example of metaprogramming in practice — taking a C# abstract syntax tree describing a query, and transforming it into efficient SQL. This is the technique that is at the heart of LINQ. I’ve heard tell of F# projects that are using LINQ-like expressions to describe linear programming constraints declaratively — same metaprogramming paradigm, completely different domain.
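The "take a syntax tree describing a query and turn it into SQL" idea can be sketched in a few lines of Python. To be clear about assumptions: real LINQ gets its expression trees from the C# compiler, whereas this toy fakes them with operator overloading, and every name here (`Column`, `Const`, `BinOp`, `to_sql`, `select`) is hypothetical rather than part of any LINQ API.

```python
class Expr:
    """Base class: Python operators build a tree instead of computing a value."""

    def _binary(self, op, other):
        right = other if isinstance(other, Expr) else Const(other)
        return BinOp(op, self, right)

    def __eq__(self, other):
        return self._binary("=", other)

    def __gt__(self, other):
        return self._binary(">", other)

    def __lt__(self, other):
        return self._binary("<", other)

    def __and__(self, other):
        return self._binary("AND", other)

    def __or__(self, other):
        return self._binary("OR", other)


class Column(Expr):
    def __init__(self, name):
        self.name = name


class Const(Expr):
    def __init__(self, value):
        self.value = value


class BinOp(Expr):
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right


def to_sql(expr, params):
    """Walk the tree, emitting SQL text and collecting bound parameters."""
    if isinstance(expr, Column):
        return expr.name
    if isinstance(expr, Const):
        params.append(expr.value)
        return "?"
    if isinstance(expr, BinOp):
        return f"({to_sql(expr.left, params)} {expr.op} {to_sql(expr.right, params)})"
    raise TypeError(f"cannot translate {type(expr).__name__}")


def select(table, where):
    params = []
    sql = f"SELECT * FROM {table} WHERE {to_sql(where, params)}"
    return sql, params


query = (Column("city") == "Seattle") & (Column("age") > 30)
print(select("customers", query))
# ('SELECT * FROM customers WHERE ((city = ?) AND (age > ?))', ['Seattle', 30])
```

The point is that the predicate is never executed as Python; it is only data, so the same tree could just as well be handed to a different translator (another query language, or a constraint solver) without changing how the query was written.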

All of this is exciting because you noble long-term readers (e.g. since December 2007) will know how interesting metaprogramming is to me. Microsoft really gets it on multiple levels, and has put it into production with LINQ. There’s a lot more on the horizon, too, and I’m eagerly waiting for each bit of it to reach RTM in some form so I can happily blog away about it!

Not only that, but I have a personal hacking project again. Matt Warren’s blogs (mentioned above) are a great example of how to implement metaprogramming with an imperative C# visitor-style API. But I find the code hard to read — it’s taking this lovely declarative representation and then transforming it with all of this intensely imperative, explicitly scheduled transformation code. It reminds me of the JastAdd project, which has evidently now reached public release. JastAdd creates a declarative framework for implementing language analyses and transformations. I want to hack on an F# implementation of the JastAdd paradigm, applied to LINQ to SQL transformations. It would be very interesting to see if it could lead to something significantly easier to maintain and reason about.

This is something that arguably is potentially relevant to work. So I am going to blog about it internally first, and then repost publicly a week or so later (on a per-post basis). If it gets to where it is interesting enough for Microsoft to want to privatize it, I’ll just have to find something else to blog publicly about! In any case, it’ll be fun to post about here for as long as it lasts 🙂

Written by robjellinghaus

2008/08/03 at 21:04

There’s Nothing Micro about Microsoft

Hello everyone. Ah me. We live in Washington now! (State, that is.) We’re happily ensconced in a nice rental home in Kirkland, and so far just about everything we hoped for from the move has happened — we have less commute, cooler weather, a better school for our daughter, and my wife no longer has to work. And my new job at Microsoft is going very well so far, though I’ve actually been there less than a month — I took May off to help our family settle in, and THANK GOODNESS, because moving is a LOT OF WORK even once you actually arrive!

But so far so good up here… I’m sitting in our living room looking out at the evening sky and the pine trees across the street, through the full-length windows and French door that front our house. It’s beautiful up here.

I can’t say too much in detail about what I’m working on, because Microsoft (like Google) is mighty touchy about confidential projects. But I can make some general observations after being back on the inside of the Borg for a few weeks. (I was amused to discover that they remembered me! I interned at Microsoft in 1988 and 1989, and I guess they gave me employee number 40775, because when I signed back up they gave me the same number back again. People were like, why are you 40775 rather than 263484? Microsoft’s gotten BIG over the years….)

It’s very curious how deep the not-invented-here goes at Microsoft. It seems to be partly historical — Microsoft was such a winner-take-all company for all of the eighties and nineties, it sank deep into the marrow of the company. And it is partly reactionary — the rest of the industry reacted so negatively to that aggressive attitude (antitrust suits, pitched legal battles with Apple and Sun, etc.) that it drove Microsoft even further into its own corner. I’ve got kids now, and I think a fair bit about sibling rivalry, and how kids (and adults, and nations) tend to define themselves in opposition to one another… sometimes your identity emerges through your interactions with your peers. That’s definitely happened to Microsoft, and as a more than thirty-year-old company, it’s going to change only slowly if at all.

Regardless, Microsoft really is its own technological world now. And this has its good points and bad points. Coming from the Java world, and from California where lots of my friends are vehemently anti-Microsoft, it’s a bit bemusing to see it all with an outsider’s eyes… plenty of my new colleagues have been here for decades, which is almost unimaginable to me. It’s certainly part of my value here, that I’ve got recent experience with how things are on the Outside.

I can’t be too specific, but there are quite a few areas where I feel like Microsoft’s internal technical isolation is hindering them… particular tools that seem like a step backwards, or particular design problems where it seems like there just aren’t quite enough people providing fresh ideas. The relative isolation of the Microsoft software stack can seem a bit… I don’t know… lonely? The “us vs. them” thinking is hard to escape in the blogosphere, and it’s such a polar choice — either you’re on Windows / .NET, or you’re not. And if you are, you’ve got to pay to play — at my last startup, we were a Linux and Java shop, partly because it got the job done and partly because it was free. (Though as one of my new cronies says, the people who won’t pay aren’t customers anyway, because how can you make a business out of non-paying customers? That’s a very deep-seated belief in Microsoft-land, and you know, there’s some truth to it.)

But on the flip side, there are some real advantages to owning all the code you could possibly need to run an entire industry of PCs. I’ve spent the last two months spinning up on LINQ, one of the coolest new features in C# 3.0. It stands for Language Integrated Query, and on the face of it, it seems like syntactic sugar that lets you write SQL-like code in C#. But it turns out that under the hood there’s a lot more to it — it’s implemented by compiling language statements into expression data structures that can then be rewritten, reparsed, and used to generate entirely different kinds of language output. It is very cool technology, very useful for creating domain-specific languages — in fact, it’s rather along the lines of my extensible language rant from a few months ago.

And it would not have been possible if Microsoft didn’t completely own the C# and Visual Basic languages, and have the resources to come out with a new iteration of the language spec, and all the compilers and tools to support it, simply because they thought it was a good idea. Compared to the slowness of Java’s evolution (how long has the closure spec been rattling around?), Microsoft’s ownership is yielding real benefits to .NET programmers. (OK, so the closure spec is deeper and wider-reaching than C#’s lambda expressions, but nonetheless there are several intersecting features in C# 3.0 that are all needed to make LINQ work, and I don’t see Java catching up very quickly.)

It’s also pretty amazing to see the breadth of the expertise here — my team happens to be pretty closely connected to Microsoft Research, which is teeming with world-class experts. If you look at the roster (Simon Peyton Jones, Don Syme, Erik Meijer, Galen Hunt, Nick Benton, Martin Abadi, Luca Cardelli… heck, search ’em yourself!), you’ll see a whole lot of people who’ve driven the world of software forwards. Microsoft has a deep commitment to that goal, even if their not-invented-here, no-open-source mentality gets in the way sometimes. So it’s exhilarating to be part of that mission.

Microsoft is a colossal company, and I’m fortunate that I’ve landed in a very ambitious and solidly supported team — in fact, I can’t think of any job I’d rather have in the industry. I’m feeling very lucky indeed, and I’m doing my best to get productive quickly — this opportunity isn’t going to come along again anytime soon!

And, that said, I’m obviously not doing very well on keeping to my blog schedule. Realistically this blog is going to slow down a bit, probably to more like once per month. Eventually — once our team’s incubation project goes public (knock on wood!) — I’ll hopefully have another blog on msdn.com where I’ll blog semi-officially about our technology. But this blog will be my personal property into the indefinite future. Stay tuned!

Written by robjellinghaus

2008/06/24 at 04:26

Posted in Uncategorized

Only 20,000 Lines

A while back I posted a big ol’ post titled A Growable Language Manifesto which argued strongly for extensible languages.

Well, I just ran across the one-year progress report from Alan Kay‘s current research group, and it’s some extremely exciting work that is all about extensible languages!

The group is the Viewpoints Research Institute, and the progress report lays out their plan to implement a complete software stack — everything from the graphics driver, to the language interpreter / compiler, to the TCP/IP stack, to the windowing and rendering system, to the IDE and programming environment — in 20,000 lines of code. Total.

As they point out, a TCP/IP stack alone in many conventional languages is more than 20,000 lines. So how can they possibly pull it off?

The answer, it turns out, is extensible languages. Specifically, they have an extensible parsing system — OMeta, cited heavily in my manifesto — which allows them to easily and quickly extend their languages. They also have a “meta-meta-language runtime and parametric compiler” named IS, which is how they actually get their languages and metalanguages into executable form.

One especially cool example is their TCP/IP stack. The TCP/IP specification has text diagrams of packet formats. So they wrote a grammar to parse those specifications directly as ASCII. And lo and behold, they could use the TCP/IP RFCs themselves to generate their source code. They also can use their parsing framework to analyze the structure of TCP/IP messages — they basically define a grammar for parsing TCP/IP messages, and action rules for handling the various cases. (OMeta lets executable code be attached to matching productions in a grammar.)
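Here is a rough Python sketch of the "parse the RFC's ASCII diagram, then use it to decode packets" trick. It is not OMeta and not the Viewpoints code; it assumes the box-diagram convention used in the TCP RFC (two characters per bit, fields separated by `|`), covers only the first two rows of the TCP header, and the names `parse_layout` and `make_decoder` are made up for this example.

```python
# First two rows of the TCP header, in the RFC's box-diagram style.
TCP_HEADER_START = """\
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
"""


def parse_layout(diagram):
    """Return [(field_name, bit_width), ...] read off the diagram."""
    fields = []
    for line in diagram.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # separator lines carry no field names
        for cell in line.strip("|").split("|"):
            # Each bit is two characters wide, minus one shared border.
            fields.append((cell.strip(), (len(cell) + 1) // 2))
    return fields


def make_decoder(fields):
    """Build a function that slices big-endian bytes into the named fields."""
    total_bits = sum(width for _, width in fields)

    def decode(data):
        value = int.from_bytes(data[: total_bits // 8], "big")
        result = {}
        remaining = total_bits
        for name, width in fields:
            remaining -= width
            result[name] = (value >> remaining) & ((1 << width) - 1)
        return result

    return decode


decode = make_decoder(parse_layout(TCP_HEADER_START))
print(decode(bytes.fromhex("01bbc73800000001")))
# {'Source Port': 443, 'Destination Port': 51000, 'Sequence Number': 1}
```

The field names and widths live only in the diagram, so if the diagram changes, the decoder changes with it, which is the appeal of treating the specification text as the source.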

They also wrote domain-specific languages for just about every area. One example is low-level pixel compositing, basically giving them the functionality, and most of the efficiency, of a generative 2D pixel processing library such as Adobe’s Generic Image Library, cited in Bourdev and Järvi’s LCSD 2006 paper. Another example is polygon rendering (450 lines of code that implements anti-aliased rasterization, alpha compositing, line and Bezier curve rendering, coordinate transformations, culling, and clipping). Though evidently they have yet to fully define a custom language for polygon rendering, and they hope to cut those 450 lines by “an order of magnitude”.

Basically, they take almost all parsing and optimization problems and express them directly in their extensible language, which gives them almost optimal flexibility for building every part of the system in the most “language-centric” way possible.

They have no static typing at all, which doesn’t work for me (though their line counts make a compelling argument), but there’s no reason in principle that these techniques couldn’t also apply to static type system construction.

In fact, there is work right now on intuitive language environments for creating written formal definitions of programming languages. A system like Ott lets you write semantics declarations that look like they came straight out of a POPL paper, and convert them into TeX (for printing) or Coq/Isabelle/HOL (for automated verification). I don’t know how much the implementation of Ott or Coq/Isabelle/HOL would change if Viewpoints’ techniques were aggressively applied, but I look forward to finding out!

I think this kind of programming is the wave of the future. Reducing lines of code has always been an excellent way to improve your software, and the expressiveness of a language has always shaped how succinctly you can write your code. From that perspective, it seems obvious that an extensible language would provide the maximum potential increase in expressiveness, since you can tune your language to the details of your specific problem. It’s a force multiplier, or possibly a force exponentiator.

Object-oriented class hierarchies, query languages, XML schemas, document structures, network protocols, display lists, parse trees… they all share a common meta-structure that the “extensible languages” concept subsumes, and the Viewpoints project is the clearest evidence I’ve seen of that yet. It’s going to be a dizzyingly exciting next few years in the programming language world, and programming might just get a lot more interesting soon — for everybody.

There’s more to say on that topic, but I’ll save it for another post!

Written by robjellinghaus

2008/04/15 at 04:37