August 8th, 2015

A simple approach to deploying with git without clutter

Today, I created git-create-deploy-branch after kicking some of the ideas around for a couple of years.

Git at first seems to be an ideal tool for deploying web sites and other things that don’t have object code. However, it’s never been that simple: where there’s programming, there’s automating the tedious bits and generating derived pieces from more humane sources.

With the addition of receive.denyCurrentBranch = updateInstead in git 2.3.0, really reliable, simple workflows became possible: a push to the checked-out branch of a non-bare repository updates its working tree, provided that tree has no uncommitted changes. This has since been refined with a push-to-checkout hook, which allows built objects to be created on the receiving server, but I want a more verifiable, local approach.
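
To make the updateInstead behavior concrete, here is a minimal local sketch, assuming git 2.3.0 or later. The repository names (client, server), the branch name (live), and the file page.txt are made up for illustration, and a local path stands in for a real remote machine.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so nothing here touches a real repository.
work=$(mktemp -d)
cd "$work"

# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# "client": where commits are made.
git init -q client
cd client
git checkout -q -b live
echo 'hello' > page.txt
git add page.txt
git commit -q -m 'first page'
cd ..

# "server": a non-bare clone with the live branch checked out,
# configured to accept pushes to that checked-out branch.
git clone -q client server
git -C server config receive.denyCurrentBranch updateInstead

# Commit a change and push it; the server's working tree follows.
cd client
echo 'hello again' > page.txt
git commit -q -am 'second page'
git push -q ../server live
cat ../server/page.txt
```

Without the config line, git would refuse the push, because it targets a branch that is checked out on the receiving side.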

There are two main strategies for dealing with this in git, and before git 2.3.0, they were really the only options available. In the first, git holds only the source material, and any built products are managed outside of git, whether as a directory of numbered tarballs or in a service meant for such things. Some services, like the npm registry, bring a lot of value, with public access, hosting, and replication available; others are little more than object storage, like Amazon S3. In the second, built products are committed back, and git becomes a dumb content tracker: conflicts in built files are resolved by regenerating them from merged source material, and the build process becomes integral to every operation on the tree of files.

I’ve long wanted a third way that uses git’s fast, stable branching infrastructure while keeping source material and built material strictly separate. I want to inspect what will be deployed and the differences between successive deploys, and separately analyze the changes to the source material, yet still be able to relate those changes to the deployed, built objects. To that end, this tool can be considered a first attempt at building tools that understand the idea of a branch derived from another.

The design is simple enough: given a branch (say master) checked out in your repository, a build process for whatever objects need to exist in the final form, and a .gitignore file that ignores those products, like so:

source.txt:

aGVsbG8sIHdvcmxkCg==

and a build script:

build.sh:

#!/bin/sh

base64 --decode < source.txt > built.txt

and an ignore file, with both the built object and other things like editor cruft:

.gitignore:

built.txt
*.swp
*~

we create a file listing the ignored files that should nevertheless be included when creating the derived branch, like so:

.gitdeploy:

built.txt

The initial version of the tool is very simple, and doesn’t support wildcards or any other features of any complexity in the .gitdeploy file. This is not out of a strong opinion, but as a matter of implementation simplicity, given that my prototype is written using bash.
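
To illustrate what "no wildcards" means in practice, here is a rough sketch of reading such a file, with every non-empty line treated as one literal path. This is not the tool's code: check_gitdeploy is a hypothetical helper, and missing.txt is added to the sample only to show the failure case.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Sample inputs mirroring the example above, plus one path that does not exist.
printf 'hello, world\n' > built.txt
printf 'built.txt\nmissing.txt\n' > .gitdeploy

# Hypothetical helper (not part of git-create-deploy-branch):
# report whether each literal path listed in .gitdeploy exists.
check_gitdeploy() {
    status=0
    while IFS= read -r path || [ -n "$path" ]; do
        [ -n "$path" ] || continue      # skip blank lines
        if [ -e "$path" ]; then
            echo "ok: $path"
        else
            echo "missing: $path"
            status=1
        fi
    done < .gitdeploy
    return "$status"
}

result=$(check_gitdeploy) || true
echo "$result"
```

A line such as *.txt would be looked up as a file literally named *.txt, which is the limitation described above.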

You can install it with npm:

npm install -g git-create-deploy-branch

To create the deploy branch, we’ll run the build, then create the deploy branch with those objects present in our working directory:

./build.sh && git create-deploy-branch

Our first run gives output like so:

[new branch] 8acba8787306 deploy/master

and a branch deploy/master is created, in this case with commit ID 8acba8787306. We can show that it includes the built files:

:; git show deploy/master
commit 8acba87873062dd8b4fc516bab581a450bf9e077
Author: Aria Stewart <aredridel@nbtsc.org>
Date:   Sat Aug 8 22:30:05 2015

    deploy master

diff --git built.txt built.txt
new file mode 100644
index 000000000000..4b5fa63702dd
--- /dev/null
+++ built.txt
@@ -0,0 +1 @@
+hello, world

The commit also has the parent commit set to the current commit on master, so we can track the divergence between master and deploy/master, both expected (with the built objects) and unexpected (errant commits made on the deploy branch).
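
One way a commit of this shape can be produced is with git's plumbing commands and a temporary index. The following is a sketch under that assumption, not a description of how git-create-deploy-branch is implemented; the scratch repository and the deploy-index file name are made up for illustration.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# A tiny repository shaped like the example: source tracked, product ignored.
git init -q repo
cd repo
git checkout -q -b master
echo 'aGVsbG8sIHdvcmxkCg==' > source.txt
echo 'built.txt' > .gitignore
echo 'built.txt' > .gitdeploy
git add source.txt .gitignore .gitdeploy
git commit -q -m 'initial commit'

# Stand-in for running ./build.sh.
echo 'hello, world' > built.txt

# Assemble the deploy tree in a temporary index so the real index is untouched.
GIT_INDEX_FILE="$work/deploy-index"
export GIT_INDEX_FILE
git read-tree master          # start from master's tree
git add -f built.txt          # force-add the ignored build product
tree=$(git write-tree)
unset GIT_INDEX_FILE

# Commit that tree with master as its parent and point the branch at it.
commit=$(git commit-tree -p master -m 'deploy master' "$tree")
git update-ref refs/heads/deploy/master "$commit"

# Files the deploy branch has that master does not.
added=$(git diff --name-only master deploy/master)
echo "$added"
```

Because master is the parent, git log master..deploy/master lists only the commits that exist on the deploy side, which makes an errant commit there easy to spot.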

Let’s update our source, and commit that:

source.txt:

aGVsbG8sIHdvcmxkOiB3ZSBoYXZlIGNhbmR5Cg==

The repository now looks something like this:

:; git graph master deploy/master
* e391cd8deb5e - (HEAD -> master) New source (8 seconds ago) <Aria Stewart>
| * 8acba8787306 - (deploy/master) deploy master (5 minutes ago) <Aria Stewart>
|/
* 8045ecf53520 - Add .gitdeploy (5 minutes ago) <Aria Stewart>
* 0a347a1892a6 - initial commit (5 minutes ago) <Aria Stewart>

And if we run the build and deploy again:

./build.sh && git create-deploy-branch

We get output like so:

8acba8787306..16663a3ae945 deploy/master

And our repository now includes a new merge commit, showing the origin of the deployed objects, and the prior deploy:

*   16663a3ae945 - (deploy/master) deploy master (68 seconds ago) <Aria Stewart>
|\
* | e391cd8deb5e - (HEAD -> master) New source (3 minutes ago) <Aria Stewart>
| * 8acba8787306 - deploy master (7 minutes ago) <Aria Stewart>
|/
* 8045ecf53520 - Add .gitdeploy (7 minutes ago) <Aria Stewart>
* 0a347a1892a6 - initial commit (8 minutes ago) <Aria Stewart>
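
The two parents of such a merge commit are what make deploys comparable. The sketch below rebuilds a similar history in a scratch repository and then diffs against each parent. It assumes the source commit is the first parent and the prior deploy is the second, as the graph suggests. The snapshot function is a hypothetical stand-in for the tool, not its implementation.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q repo
cd repo
git checkout -q -b master
echo 'built.txt' > .gitignore

# Hypothetical stand-in for the tool: commit master's tree plus built.txt,
# with master as first parent and any earlier deploy as second parent.
snapshot() {
    GIT_INDEX_FILE="$work/deploy-index"
    export GIT_INDEX_FILE
    rm -f "$GIT_INDEX_FILE"
    git read-tree master
    git add -f built.txt
    tree=$(git write-tree)
    unset GIT_INDEX_FILE
    parents="-p master"
    if git rev-parse -q --verify refs/heads/deploy/master >/dev/null; then
        parents="$parents -p deploy/master"
    fi
    # $parents is deliberately unquoted so it splits into separate arguments.
    git update-ref refs/heads/deploy/master \
        "$(git commit-tree $parents -m 'deploy master' "$tree")"
}

# First source revision and deploy.
echo 'v1' > source.txt
git add .gitignore source.txt
git commit -q -m 'initial commit'
echo 'hello, world' > built.txt
snapshot

# Second source revision and deploy.
echo 'v2' > source.txt
git commit -q -am 'New source'
echo 'hello, world: we have candy' > built.txt
snapshot

# Built output that changed since the previous deploy (second parent).
changed=$(git diff --name-only deploy/master^2 deploy/master -- built.txt)
echo "$changed"
# What the deploy adds on top of the source it was built from (first parent).
extra=$(git diff --name-only deploy/master^1 deploy/master)
echo "$extra"
```

Dropping --name-only from either diff shows the actual content changes, which is the inspection step this workflow is meant to enable.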

On a remote machine, let’s create a deploy repository, set it up to receive our deploys, and add it as a remote in our local repository.

ssh remotemachine 'git init show-off-build && cd show-off-build && git config receive.denyCurrentBranch updateInstead && git checkout -b deploy/master'

git remote add remotemachine ssh://remotemachine/~/show-off-build

Now we can deploy this with a simple command:

git push remotemachine deploy/master

So in total, deploying a new derivative of our source code consists of making our changes and committing them, then running the build and the command to create the deploy branch, then pushing:

git commit -am changes
./build.sh && git create-deploy-branch && git push remotemachine deploy/master

Stable, traceable, reliable, reproducible builds and deploys, stored in git but not cluttering the source branch.

Let’s see our handiwork:

ssh remotemachine cat show-off-build/built.txt

And the response?

hello, world: we have candy