First Class Functions in C - Because That's Totally a Good Idea.

04 Sep 2013

Recently, for my own amusement, I wrote I small library in C that allows you to partially apply a function. The code is here, read on for an explanation.

First, a quick clarification of what the library actually does. Suppose you have a function such as write(2), which has a signature like:

    ssize_t write(int fd, const void *buf, size_t count);

The library provides a function curry, which can be used like so:

    write_stdout = curry(write, 1);
    write_stdout(buffer, bufsize);

That is, give it a pointer to a function a and an argument arg, and it will return a pointer to a function b, which takes the same arguments as a except for the first, and which when invoked calls a with arg as its first argument, and passes along the remaining arguments from the caller.

Most languages that support this do so by not having functions be raw pointers to code; Instead they also include some additional data, e.g. extra arguments.

To pull this off in C, without breaking the ABI, we’ll need to do something else. We’ll also need to abandon the notion of portability across machine architectures.

The main idea is actually quite simple: allocate a buffer, and stuff a few instructions in there which adjust the arguments being passed in, adding the first argument in the right place, then jump to the real function. And this is exactly what we do.

The implementation is 32-bit x86 only right now, and it can’t handle arguments that don’t fit inside 4 bytes. The second limitation could be removed, but I haven’t done so yet. Given our implementation strategy, if we want this to work on multiple architectures, we’re pretty much stuck reimplementing it on each machine. We’re also dependant on gcc’s calling convention, and some non-ansi functions for dealing with memory.

Also, it’s very memory hungry, as I was a bit lazy dealing with the memory allocation issues, and just claimed the entire page for the generated function (which actually uses 16 bytes). Also I haven’t provided a way to free the heap allocated function pointer.

In case you hadn’t figured it out, the second half of the title was somewhat sarcastic. Never use this code - If you need partial function application, C is the wrong language for your project.

Anyhow. With gcc on the x86, function arguments are passed on the stack, pushed in revere order. The call instruction itself pushes the return address onto the stack, which is needed to pick up where we left off in the calling function.

So, if we do:

    g = curry(f,a)

Upon executing g, which is our buffer full of instructions, the stack will look like:

| return address |
| b              |
| c              |
| ...            |

Remember, because the arguments are pushed onto the stack in reverse order, the first argument ends up at the top, with the rest below it. The return address ends up on top of the arguments, since it was pushed last, by the call instruction itself. The picture we want when we jump to f is this:

| return address |
| a              |
| b              |
| c              |
| ...            |

This is simple enough to accomplish in assembly; we pop the return address into a register, push the extra argument on to the stack, and then push the return address back on top of it:

    popl %eax
    pushl $<a>
    pushl %eax
    jmp <f>

It’s fine to use eax like this, since the caller is expected to save the contents of the register if needed - the callee is prefectly entitled to trash it.

There are a few housekeeping issues to take care of:

    struct Code {
        uint8_t opcode;
        uint32_t arg;

And decide (correctly) that having arg not aligned on a 4-byte boundary within the struct will make accesses slower. It will “remedy” this by inserting extra space between opcode and arg, which of course will cause problems when we jump to it. We declare our struct with gcc’s packed attributed, which says that no extra space may be introduced:

    struct CurryBuffer {
        uint8_t popeax_opcode;
        uint8_t pushl_opcode;
        uint32_t pushl_arg;
        uint8_t pusheax_opcode;
        uint8_t jmp_opcode;
        uint32_t jmp_arg;
    } __attribute__((packed));

There isn’t much more to it. C’s type system isn’t very expressive, so you’ll get lots of warnings without a bunch of casts, and we have to leave out the argument list in the return type of curry because it simply can’t be specified. Leaving the argument list out has the effect of disabling type checking on the function’s arguments.

There are a lot of reasons you shouldn’t ever use this, but I had fun making it nonetheless.