Dokumi (English) - クックパッド開発者ブログ

Let's talk about Dokumi, a tool I have been recently working on.

Introduction

I am part of Cookpad's "Technical Department". One of the department's goals is to make life easier for other engineers. I am in charge of iOS, so I wrote Dokumi to decrease the time that mobile engineers spend on code reviews.

In Japanese, Dokumi means "food tasting", or more literally "poison tasting" (for example, tasting food before a king to make sure the food is not poisoned). I named this tool "Dokumi" because it "tastes" the code to check if any "poison" (bug, strange code) is in it. In other words, it's an advanced lint tool intended to be run by your CI server. Every time a pull request is created or updated, Jenkins runs Dokumi on it. Currently, Dokumi only supports iOS apps, but we are thinking about adding Android support.

Dokumi currently does the following:

Runs a static analysis of the code (available in Xcode as "Analyze").
Runs your automatic tests.
Points out the XIB/Storyboard that are changed by the pull request but for which only the Xcode version information changed.
Points out if the milestone of the pull request has not been set yet.
To run the static analysis or automatic tests, Dokumi has to build the application, so it also picks up the warnings and errors that occurred during the compilation.

As you can see in the screenshots above, Dokumi posts the issues found as comments to the GitHub Enterprise request. Issues on lines modified by the pull request are added as comments to those specific lines. Issues that are not on a line modified by the pull request but seem to be related are regrouped in one normal comment. Issues that do not seem to be related to the pull request (for example warnings found on a file not modified by the pull request) are ignored.

Merits

Any developer can run Test and Analyze in Xcode by himself. However, it takes time and it is easy to forget to run them. Also it's easy to forget to add some change to git. That is why Dokumi is run on all pull requests. By the way, thanks to static analysis, we did find problems in code that had already be reviewed by humans. So if you never used it, it might be worth a try.

I also think adding remarks directly to GitHub Enterprise's pull requests saves a non-negligible amount of time. Running automatic tests on each pull request was something we had been doing for a while. However, when tests were failing, you had to spend time finding the failure reason in the xcodebuild output inside Jenkins's log.

In fact, at first, I was thinking about putting all the errors in a nice web page, but I could not think of a design I liked, and I did not really want to bother a designer for that. That is where I realized that I could simply do it the same way humans review code.

Obstacles

When I tried implementing that idea using GitHub's API, I realized the current (as of June 2015) version of API makes it difficult to comment on a specific line.

Looking at the documentation for the endpoint to comment on a specific line, you have to pass it "The line index in the diff to comment on.". You first need to map file lines to diff lines. And to do that, first you have to solve an other problem: there is no way to get the diff you need via the GitHub API.

If you add .diff or .patch to a GitHub pull request URL, you can download its diff. However, that diff, and the one used for the pull request's comments, handle file renames differently. In the diff you get by adding .diff/.patch to the pull request URL, a rename becomes:

old file name deletion
new file name addition

In diffs used for pull request comments, a rename is:

file rename itself
addition/deletion of only the lines that changed

When using git from the command line, it is the same difference as passing --find-renames (shortened to -M) or not to git diff.

The GitHub API does not let you get the diff it uses for pull request comments, so what can you do? Dokumi currently clones the repository locally, and using the Rugged library, it generates its own version of the diff for the pull request. Of course if the way the diff is generated is different from the way GitHub does it, the line the comment ends up on might be incorrect, but even it this happens it should not be a big problem.

If a pull request is to merge new-feature into master, the way to display its diff from the command line is the following:

$ git diff --find-renames `git merge-base new-feature master`..new-feature

Q&A

In what language was Dokumi implemented in? Ruby. I have been programing in Ruby for a while, and at Cookpad we have many Ruby engineers, making it a good fit.

How do you detect warnings and errors? Mainly using regular expressions on the output of xcodebuild. For static analysis, the plist files generated by Xcode Analyze are also used.

Do you have any plan to open-source it? Not decided yet. It would be costly to make it work on most environments, and I do not feel it is really useful to open-source something that most people could not use easily. But I would be happy to answer any question about its inner workings. The source code has been made public on GitHub.