Github Code Review Workflow

27 Sep 2015

Last week I started writing a go language binding for notmuch mail. I got some structure in place and wrapped enough of the api to implement a simple demo, then put it down for a few days. This week, someone stumbled across it and started submitting patches. Having left my job, it’s been a few weeks since I’ve spent much time doing code review, and I was forgetting how lousy Github’s pull requests are for that sort of thing.

My biggest gripe is that if the submitter rebases, or otherwise destructively modifies history, the old versions of the code, including in-line comments, are gone. This sounds almost tautological (destroying history destroys history), but it’s a problem other code review workflows don’t have, and a serious one. When the submitter makes a change to the pull request, there’s no way for me to easily see what the change was. At my old job, we dealt with this by just not rebasing, and putting up with a sometimes messy history.

Gerrit will show old versions of the patch when a new one is updated. If you’re using a workflow built around git-format-patch and git-am, submitting a patch is essentially just sending an email with a patch attached, so access to previous submissions comes for free. As much as I like the pure email based workflow, using pull requests makes for a lower barrier of entry for potential contributors, which is usually worth it. I’ve come up with a workflow for dealing with these problems in the context of Github pull requests which I’m fairly happy with, so I thought I’d share. Here’s how it works:

First, set up a remote for each new contributor’s fork. So far on this project there’s just one:

$ cd go.notmuch
$ git remote add kalbasit git://github.com/kalbasit/go.notmuch

Now, each time this contributor submits a pull request, fetch the changes from their fork:

$ git fetch kalbasit

Then, create a new branch based on the branch used by the pull request. Call it prN.1, where N is the pr/issue number on github:

$ git checkout -b pr6.1 kalbasit/implement_all_database_functions

From here you can pick through the pr as usual, making comments and asking for adjustments. Once the submitter has addressed your comments, and it’s time for you to have another look, fetch the latest stuff from their fork, and then create a new branch based on the current state of their branch:

$ git fetch kalbasit
$ git checkout -b pr6.2 kalbasit/implement_all_database_functions

This cycle repeats until the code review is done. If your submitter rebased, you can easily see what changed by popping open a terminal and doing:

$ git diff pr6.3 pr6.4

You have the full history of each version of the pull request, which makes things much easier.

This doesn’t solve everything of course. These branches are now only on your local machine. If you only have one machine, or you’re more organized than I am, this might be acceptable. You could also just push them to the main fork, which has some nice properties (especially when you have multiple potential reviewers), but means anything anyone submits is going to be in the mainline repo. What I’ve ended up doing is pushing to a private mirror on Gitlab, which I sync with each of my local machines. It works well enough.

It also leaves unaddressed some of my other gripes. For example, you pretty much have to go to the website to make comments; You can reply to email, but it doesn’t give you much context and if there isn’t already an in-line comment at a particular spot you’re out of luck.

All the same, this has proven a much nicer way to deal with code review than what I’ve done with Github in the past. It doesn’t address some of my other gripes, but it’s enough of an improvement that I thought it was worth documenting.