This post is a set of observations and reflections on the activity of composing programming languages, as well as other systems with certain key properties. I don’t have much of a point with this post; I’m just musing. It may be a bit meandering in places as a result; I apologize in advance.
Programming languages are the largest and most obvious category of a class of systems which provide a few key facilities:
- A means of abstraction, i.e. wrapping something complex inside of a simpler interface.
- A means of composition, i.e. combining different components to form more complex components.
- A set of primitives; low-level building blocks upon which the rest of the system is built.
One can argue that providing these facilities is precisely what it means to be a programming language, but making that argument is not my aim here. There are other properties one can argue are also necessary, but which don’t disqualify a system from being in the category discussed here; one obvious example is Turing completeness.
It is quite common for a software developer to find themself in a situation where they need to get two of these systems to talk to one another, i.e. to compose them. What, then, is the means of doing this? There are a few common approaches.
Foreign Function Interfaces (FFIs)
Particularly when interfacing with C and C++ libraries from another language, a common approach is to use a mechanism (usually supplied by the other language) for calling functions, methods, and so on directly. This involves linking the two languages into a common address space.
The biggest advantage of this is that it’s usually fast. It’s relatively easy to avoid making whole copies of data, and the two languages can call each other without having to make comparatively expensive system calls. You also end up with a single executable, so you don’t need to worry about starting multiple programs, for example.
There are a few downsides however.
One is robustness: unlike most languages, C and C++ aren’t memory safe, and if the C/C++ code has a memory corruption bug, it can introduce faults in the other language that would normally be impossible.
Another is complexity. If the two languages’ programming models are very different, making direct calls can be clumsy, as you have to deal with mismatches in how the two languages expect to deal with a particular problem. Here are some concrete examples:
- Manual memory management vs. garbage collection. Things an implementer needs to worry about here:
- How do you deal with references held by C to collectable objects? You need some way of ensuring that the collector knows they can’t be freed. If the language has a copying garbage collector, you also need to worry about the fact that C won’t react nicely to the references being moved out from under it.
- How do you deal with references to C objects held by the garbage-collected language? Some of the time this is easy – many (most?) runtime environments provide a way to register a handler to be run when an object is collected, and you can put free() or its equivalent there. However, you can hit cases where C is doing some fancier bookkeeping to make the memory management easier to deal with, and getting the interaction right can be finicky. A good example of this is my Go binding to notmuch mail; this was by far the hardest part of that project.
- Conceptual models. Even if the gluing process itself is straightforward, if you’re trying to use an object-oriented library or something that makes heavy use of side effects from a functional programming language, use of the library can get really finicky. A higher-level abstraction over the binding is commonly desirable, but this means more work for the implementer.
- Even if you’re not dealing with entirely different programming paradigms, at a fundamental level you’re going to be dealing with an interface that was designed to be composed and abstracted within a different system. In the best case, you may be stuck with APIs that are clumsy to use, or highly non-idiomatic. pyjnius allows calling Java from Python, but you’re still stuck calling a Java API from a language that’s far more flexible, and you’re still stuck with unreasonablyLongNamesThatDontEvenFollowPythonCaseConventionsOrFitOnOneLineInAnEightyColumnTerminal. Asking a Python programmer to deal with this, if not actually unreasonable, is at least somewhat cruel.
- Tight coupling. The bridge you’ve built is very aware of both programming models, and often is non-trivial to leverage in building interfaces between anything else.
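To make the memory-management point concrete, here’s a minimal sketch of the FFI style in Python using the stdlib ctypes module. It assumes a Unix-like system where libc can be located, and the CBuffer wrapper is hypothetical, invented purely for illustration:

```python
import ctypes
import ctypes.util
import weakref

# Load the C library; name resolution varies by platform, so fall
# back to the common glibc name if find_library comes up empty.
libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6")
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]

class CBuffer:
    """Hypothetical wrapper holding a reference to a C allocation."""
    def __init__(self, size):
        self.ptr = libc.malloc(size)
        # Register free() to run when the Python wrapper is collected,
        # so the C side doesn't leak; this is the "handler run on
        # collection" idea mentioned above.
        weakref.finalize(self, libc.free, self.ptr)

buf = CBuffer(64)  # freed automatically once buf is unreachable
```

The reverse direction (C holding references to objects owned by the garbage collector) is where things get harder; ctypes leaves keeping those objects alive entirely up to you.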
Just Say No
Sometimes, leveraging existing systems is just not worth the trouble. If you’re working in a language that allows for respectable implementations of the functionality you need, it can actually be easier to just rewrite it than to deal with FFI headaches. This is especially common in the Go community: when you start using C libraries, you can’t cross-compile without setting up another toolchain; depending on the library, static binaries can be off the table entirely; you have to write glue code anyway; and you have to deal with everything in the previous section.
Remote Procedure Calls
Another approach is for two different programs (written in different languages), to communicate over a (usually network) protocol which is designed to transport function calls and the like over a network. Advantages:
- Don’t have to worry about memory model issues. The two programs don’t (generally) share memory so this just doesn’t come up.
- The programs don’t have to be on the same machine. For that reason, this approach is sometimes used even when the two programs are written in the same language.
- It’s possible for the programs to be mutually untrusting.
- Unlike FFIs, these are more loosely coupled; it’s usually not too difficult to swap out the implementation on one end of the connection (even with something in a different language) without any fuss on the other end.
Disadvantages:

- Can be slow; (typically) you’re at a minimum making a copy of everything you transfer to the other program, and often it actually has to go across a network, so latency can be very high in comparison to a local function call. Having remote procedure calls be synchronous is often not practical. This leaves you with a bit of a mismatch in the programming model, and if one of your languages isn’t great at concurrency it can be a pain.
- Can result in non-idiomatic interfaces, especially in the context of different programming paradigms.
- The basic support libraries and tooling for the protocol can be quite complex.
Some example protocols: Cap'n Proto, Apache Thrift, and CORBA.
I don’t have as much personal experience with this approach, so I may be missing some important points.
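To give a feel for the shape of the approach, here’s a self-contained sketch using Python’s stdlib xmlrpc modules rather than any of the heavier protocols; the add function is just a stand-in:

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Server side: expose an ordinary function over the wire.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the proxy makes the remote call look like a local one,
# but every argument and result is copied across the connection.
host, port = server.server_address
proxy = ServerProxy("http://%s:%d/" % (host, port))
result = proxy.add(2, 3)
server.shutdown()
print(result)  # 5
```

Note that the two ends share nothing but the socket, which is exactly why the memory-model issues above don’t come up.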
(Other) Network Protocols
Another option is a network protocol that doesn’t try as hard to make the interface look like a programming language. Examples:
- HTTP (REST APIs, yada yada)
- Message buses (AMQP, STOMP…)
- Custom application specific protocols
Advantages:

- Less language-specific bias. This can help alleviate some of the issues around different programming paradigms that we’ve discussed.
- Can be simpler to use. HTTP and STOMP are simple enough that you can just pop open telnet or such and play with things manually. IRC (at least the client end of it) is simple enough that it’s not totally nuts to use telnet as an IRC client. I’ve done it when in some of my stranger moods.
- Can be simpler to implement. For example, Cap'n Proto leaves you with a lot of state to manage, and it can be tough to get around that with a protocol that general. With a stateless protocol like HTTP, you often don’t have to worry as much about single connections tying up large amounts of resources.
Disadvantages:

- Can be clumsy for more complex tasks.
- Easy to get into a situation where the protocol is being stretched beyond what it was designed for. This is the case with HTTP right now; web applications would often benefit from a richer protocol. Websockets and the like are a common solution to this, but that basically amounts to bypassing the protocol entirely. Not being able to get away from a protocol when it becomes inappropriate can result in some seriously messy systems.
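The "pop open telnet" point is easy to demonstrate: an HTTP/1.0 exchange is just a few lines of text over a socket. Here’s a self-contained sketch in Python, with a throwaway local server (the Hello handler is invented for the demo) standing in for the remote end:

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# A throwaway local server standing in for the remote end.
class Hello(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Hello)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client side is exactly what you'd type into telnet: a request
# line, a header, a blank line.
sock = socket.create_connection(server.server_address)
sock.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")
response = b""
while chunk := sock.recv(4096):
    response += chunk
sock.close()
server.shutdown()
print(response.split(b"\r\n")[0])  # b'HTTP/1.0 200 OK'
```

Nothing here knows or cares what language the other end is written in, which is the point.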
Joe Armstrong recently wrote a post in which (among other things) he advocates for more restrictive protocols. Armstrong invented Erlang, which has an impressive capacity for implementing network protocols; when you’ve got that at your disposal, one-size-fits-all protocols can be more trouble than they’re worth.
He also strongly recommends binary protocols; he argues that textual formats like JSON are comparatively hard to parse, and I think he’s probably right.
It’s also worth noting that systems that utilize protocols like this lend themselves well to the “Just Say No” approach. With more complex protocols, the cost of implementation can mean it’s easier to use an implementation from another language via an FFI. Now you’re back where you started, and your system has more stuff in it to boot.
Programs As Subroutines
This is the Unix/Plan 9 approach: use pipes and the like to hook programs together, and write programs that are composable. There’s one really big key idea here: whole programs aren’t special; you can make calls out to them just like you’d make calls to functions. Unix also has custom languages (shells) specifically designed for gluing programs together, and most languages are capable of composing this sort of program without much trouble.
Also: do input and output with formats that are trivial to parse. Arrays of lines, general lack of nested data structures (have another look at Joe Armstrong’s post; he advocates for this in network protocols) and so on. If you can fit your application into this, you often don’t need much glue code at all; a one-liner at an interactive interpreter is often good enough.
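As an illustration of how little glue this takes, here’s the classic pipeline sort | uniq -c driven from Python, assuming a Unix system with those tools on the PATH:

```python
import subprocess

# Whole programs composed like functions: feed lines to sort, pipe
# its output straight into uniq, read the result back. The only
# interface is line-oriented text.
text = "pear\napple\npear\nbanana\napple\n"

sort = subprocess.Popen(["sort"], stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE, text=True)
uniq = subprocess.Popen(["uniq", "-c"], stdin=sort.stdout,
                        stdout=subprocess.PIPE, text=True)
sort.stdout.close()  # let uniq see EOF when sort finishes
sort.stdin.write(text)
sort.stdin.close()
output = uniq.communicate()[0]
sort.wait()
print(output)  # one counted line per distinct input line
```

The programs on the other side of the pipe could be written in anything; the shell one-liner equivalent is just sort | uniq -c.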
A big disadvantage of this is that it can mean a lot of overhead; spawning processes all the time probably isn’t reasonable for some performance sensitive applications.
Everything Is A File
Again with the Unix/Plan 9. This has some things in common with both “(Other) Network Protocols” and “Programs As Subroutines.” Importantly, it’s designed to mesh well with the latter. It provides an interface that’s fairly language agnostic, and has a lot of the advantages of HTTP. Its semantics are a bit richer, which can be a double-edged sword.
Plan 9 in particular allows for all sorts of services to be exposed as filesystems. This is something that’s often clumsy on Unix, since the flow control that’s typical of long-running services fits poorly with the shell pipeline model. It also has a standard network protocol for the filesystem interface (called 9P), so (1) in a sense it also fits into the network protocol category, and (2) it’s (relatively) easy to write programs that provide this interface. The protocol isn’t as trivial to implement as doing things by spawning processes; you’ll definitely want a support library in place, but it isn’t all that bad. If your program is actually running on Plan 9, you don’t need anything special on the client side, since the system call interface does the work for you.
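Linux’s /proc is a familiar small-scale version of the same idea: kernel state exposed through the filesystem, so plain file I/O is the entire interface and any language’s standard library can speak it. A sketch (Linux-specific, since it relies on the /proc layout):

```python
# Read this process's status straight out of the filesystem; no
# binding, protocol library, or glue code is involved.
fields = {}
with open("/proc/self/status") as f:
    for line in f:
        if ":\t" in line:
            key, value = line.split(":\t", 1)
            fields[key] = value.strip()
print(fields["Name"])  # the process's command name
```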
9P suffers from the latency problem that Cap'n Proto is careful to avoid. The issue isn’t so much with the protocol itself as with the fact that programs tend to use it via the system call interface, which is synchronous.
Some things that keep cropping up:
- Systems that send messages back and forth tend to be easier to work with than systems that use shared resources.
- There’s a big tradeoff between flexible interfaces that allow for rich control flow and interactions, and ones that are simple to implement and don’t require much work to get a new system speaking.
There are a lot of different approaches to this problem. There are techniques I didn’t talk about, and the design space for connecting these systems is as big as it is for building the systems themselves. Hopefully my reflections have been useful and/or interesting to you. Cheers.