Distributing CLI Tools via Docker

Throughout my career, I’ve seen a couple of recurring patterns in the tools I write: I write a lot of small CLI tools, and I like to share them with my coworkers (and, whenever possible, the rest of the world).

This has led to several iterations of solving the same problem: “How do I make this tool easy to run?” I don’t want to burden people with understanding the intricacies of all my tools’ dependencies, which tend to be Ruby, some number of gems, and possibly some other common Unix utilities. The solutions I’ve come up with have included a lengthy README with detailed instructions, Bundler with Rake tasks to do all the heavy lifting for non-Ruby things, fpm, and even “curl bash piping” (yes, I’m horrible).

Recently I decided to use Docker to solve this problem, since I’m using it so much anyway. Using Docker has some huge benefits for sharing applications of all types: the dependency list gets whittled down to just Docker, things work on more platforms, testing gets simpler, and it is the new hotness, which makes people say “whoa”, and that’s fun. That said, the downsides can be frustrating: working with files on your machine gets messy, the extra Docker-related preamble means more typing, things are less straightforward and clear, simple mistakes can leave lots of images and containers to clean up, and the executable gets significantly larger (since the Docker image carries a whole, albeit lightweight, OS userland to run the app). After weighing these pros and cons, I’ve found that telling a coworker to docker pull registry.url/my/app and run it with --help is so much more convenient than the alternatives.

So what’s the big deal?

Well, Docker allows the developer to specify exactly how their app’s dependencies should be installed, how they’re configured, and precisely which versions are in place. It also means that if the image works on one Docker host, it should work on any other. Complete control over the environment in which my tool runs means I can do it right and focus less on writing install instructions and more on ease of use.

How is it done?

I only really need to add a few pretty simple components to my codebase:

  • Create a Dockerfile that installs, sets up, and runs my tool.
    • If the tool works with files, I highly recommend creating a VOLUME to serve as a standard location for getting data into the tool. I tend to use VOLUME /src and then set WORKDIR /src. More on this later.
    • The ENTRYPOINT should be just a bare run of my tool, say ENTRYPOINT ["app"] where my tool is app.
    • The CMD should be some sane defaults for command-line flags or arguments, say CMD ["--source", "."] where I’d pass a flag called --source to my tool with an argument of . for the current directory.
    • Bake in good practices, like creating and using a non-root user rather than running as root.
  • Set up some CI goodness to poll my repo (because a tool worth writing is worth version-controlling and testing) and make it publish to a Docker registry on success.
    • Mostly just needs to run my RSpec tests and, if they pass, run docker build -t registry.url/my/app . to build my image.
  • Write README instructions that explain how to get Docker and how to run my app.
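The CI step in the list above can be sketched as a small shell function. This is only an illustration under a few assumptions: the function name ci_build_and_publish is made up, registry.url/my/app is the placeholder image name used in this post, the specs are run through Bundler, and publishing is done with docker push once the build succeeds.

```shell
# Placeholder image name from this post; replace with a real registry path.
IMAGE="registry.url/my/app"

# Hypothetical CI entry point: each step runs only if the previous one succeeded.
ci_build_and_publish() {
  bundle exec rspec &&                 # run the RSpec suite
    docker build -t "$IMAGE" . &&      # build the image from the repo root
    docker push "$IMAGE"               # publish to the registry
}
```

A CI job would call ci_build_and_publish from the repository root; a failing spec stops the chain before anything is built or pushed.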

If the ENTRYPOINT and CMD are configured as described above and I’m using /src for sharing local data with the container, then it can be pretty easy to share and run the tool.

Part of the README should include a line encouraging users to do something like this:
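A minimal sketch of such a line, assuming the placeholder image name registry.url/my/app and the /src mount point used in this post (the exact flags in a real README may differ):

```shell
# Define an "app" command that runs the containerized tool.
#   --rm               remove the container once it exits
#   -v "${PWD}:/src"   mount the caller's current directory at /src in the container
# Single quotes defer ${PWD} expansion until the alias is actually used.
alias app='docker run --rm -v "${PWD}:/src" registry.url/my/app'
```

Anything typed after app (for example, app --help) is passed to the container after the image name, where it replaces the default CMD.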

If the image is at registry.url/my/app, and I want people to run my tool by calling the app command, a shell alias that wraps docker run keeps this pretty clean for a user. If I write an app that requires a source directory to be passed in, just running app mounts my present working directory ${PWD} on my machine as /src inside the Docker container. Since I specified WORKDIR /src in the Dockerfile, the tool starts in /src, so the default --source . resolves to the mounted directory. In other words, . on my machine becomes . in the container! Also note the --rm in the command. This tells Docker to remove the container after it exits, which prevents extra containers from piling up after each run of my tool.

Putting it all together

I think a quick example is in order. Let’s pretend I’m writing an app to find all the HTML files in a directory that include capital letters in the name (eww). Say I have my app in a directory, and it consists of just an executable and a Dockerfile for building the tool. Here’s a tree of such a directory:
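One way to recreate that layout from scratch, going only by the two paths named in this example (app/bin/app and app/Dockerfile); the files start out empty here:

```shell
# Create the example project skeleton.
mkdir -p app/bin
touch app/Dockerfile app/bin/app
chmod +x app/bin/app   # the tool must be executable

# List everything that was created.
find app
```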

Here are the contents of my app’s executable (app/bin/app):
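A minimal sketch of what such an executable could look like, written out with a shell heredoc so it lands at app/bin/app. It accepts a --source flag and prints every .html file under that directory whose file name contains a capital letter. The use of OptionParser and Dir.glob is an assumption about how a tool like this might be written, not a reproduction of the original script.

```shell
mkdir -p app/bin

# Quoted 'RUBY' delimiter: the shell copies the script verbatim, no expansion.
cat > app/bin/app <<'RUBY'
#!/usr/bin/env ruby
# Hypothetical sketch: list HTML files whose names contain capital letters.
require 'optparse'

source = '.'
OptionParser.new do |opts|
  opts.banner = 'Usage: app [--source DIR]'
  opts.on('--source DIR', 'Directory to search (default: .)') { |dir| source = dir }
end.parse!

# Search recursively and check only the file name, not the directories above it.
Dir.glob(File.join(source, '**', '*.html')).sort.each do |path|
  puts path if File.basename(path) =~ /[A-Z]/
end
RUBY

chmod +x app/bin/app
```

OptionParser answers --help on its own, so the tool gets a usage message without extra code.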

Here’s my Dockerfile (app/Dockerfile):
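A minimal sketch of a Dockerfile that follows the checklist from earlier, written out with a shell heredoc so it lands at app/Dockerfile. The ruby:alpine base image, the appuser account name, and the /usr/local/bin install path are assumptions; the VOLUME, WORKDIR, ENTRYPOINT, and CMD lines come straight from the recommendations above.

```shell
mkdir -p app

# Quoted delimiter: the Dockerfile is written exactly as shown.
cat > app/Dockerfile <<'DOCKERFILE'
# Assumed base image: the official Ruby image on Alpine, to keep the size down.
FROM ruby:alpine

# Put the tool on the PATH so ENTRYPOINT can call it by name.
COPY bin/app /usr/local/bin/app
RUN chmod +x /usr/local/bin/app

# Run as an unprivileged user (the name is arbitrary) instead of root.
RUN adduser -D appuser
USER appuser

# Standard location for getting data into the tool.
VOLUME /src
WORKDIR /src

# Bare run of the tool, with sane default arguments that callers can override.
ENTRYPOINT ["app"]
CMD ["--source", "."]
DOCKERFILE
```

Because the build context is the app directory, the COPY path is relative to it (bin/app rather than app/bin/app).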

Given the above files, build the Docker image:
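Assuming the app directory described above and the placeholder tag registry.url/my/app, the build step might look like this. It is wrapped in a function (the name build_image is made up) so the tag lives in one place; call build_image from the directory that contains app/.

```shell
# Placeholder tag from this post; replace with a real registry path.
IMAGE="registry.url/my/app"

# Build the image, using app/ as the build context (that is where the Dockerfile lives).
build_image() {
  docker build -t "$IMAGE" app/
}
```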

Then run it:

You should get this output:

Now try making an alias and running it:

You should get the same output. Now navigate to a directory that has some HTML files in it (or in one of its subdirectories). Try running the command:

If none of your HTML files have capital letters, you should just get returned to your prompt. If there are files with capital letters in their names, you’ll get the relative paths to those files.

As you can see, the application looks like a regular system command, even though it’s buried inside a 150 MB Docker image. Hopefully it is clear why, from an end-user perspective, this is helpful.