Can Inexpressive Tools Keep Us Honest?

04 Dec 2016

Programmers often lament the tower of babel that a modern software stack has become. It’s been a long time since an individual has had any hope of understanding all of the software that is running on their computer. For folks who value simplicity of implementation, this can be a hard thing to come to grips with.

From a pragmatic perspective, we deal with the situation in a number of ways:

  1. Ideally, don’t make it worse, at least any more than we have to to get stuff done.
  2. Keep things modular, so we can get stuff done without having to think about the whole system.
  3. Build ever more powerful tools that let us manage the complexity, and do harder things successfully.

(3) has the serious drawback of exacerbating the issue, but at the same time, good tools can help us build more reliable software in less time. That’s a hard thing to argue with when you’re working on a project with goals other than simplicity.

When discussing the C programming language, I’ve often heard programmers say it “keeps them honest.” C isn’t terribly great for building abstractions, so working with it successfully requires straightforward design and implementation, or so the story goes.

Part of the straighforwardness they discuss is not just of the code itself, but of the runtime and all of the dependencies needed to express it – garbage collection may make it easier to write programs whose source is “simple,” but arguably the whole edifice ends up being much more complex.

Donald Knuth famously uses an assembly language in TAOCP, and part of his rationale is essentially this last point:

One of the principal goals of my books is to show how high-level constructions are actually implemented in machines, not simply to show how they are applied. I explain coroutine linkage, tree structures, random number generation, high-precision arithmetic, radix conversion, packing of data, combinatorial searching, recursion, etc., from the ground up.

In other words, to understand the algorithms you need to understand what they actually do, and so much complexity is implicit in higher level languages.

There’s a lot to dig into there, but in this post I wanted to share an experience I’m having. As I work on the layout DSL that I mentioned last time, I’m trying to be careful to keep the grammar straightforward. I want alternate implementations to be realistic, and this isn’t the case with some languages.

I’m using Haskell to prototype the language. Haskell is one of the nicest languages you could ask for when it comes to writing parsers. The Parsec library is one of the things people relatively new to Haskell tend to rave about. It makes writing parsers a cinch, often even for complicated context-sensitive grammars.

So I have to be careful.

I don’t want the language to require such a powerful tool to parse. Part of how I’ve been achieving this is by writing a spec alongside the tool, and forcing myself to specify the grammar in EBNF, which is much more modest in it’s power and flexibility. Writing the spec at the same time is a good way to avoid having the language definition just be “whatever the reference implementation does,” which is the case with too many things.

This isn’t a general purpose programming language, so all of my bad experiences may not translate exactly. But writing something that’s too complicated to write again is something I’m wary of, and I’m finding that there’s at least some truth to the notion that less powerful tools can keep us honest.