Creating your own tiny static publishing platform

I’ve been using static publishing platforms for a while now. The output is enduring, easily archived, reliable and robust. As an author, I also find a lot of truth in the unreasonable effectiveness of GitHub browsability, however much I disagree with the philosophy of committing build products alongside the sources. I’ve used Jekyll; I’ve used Hexo, which is what I use to write this blog; and long ago I used Movable Type.

However, all these systems are more complex than I’d like, and they bit-rot far faster than the content they generate. Runtimes change. Dependencies rot as maintainers move on and can no longer account for those runtime changes. Development moves on to new major versions, or to a rewrite around a newer fad in software design. Hexo has treated me better than most, but it is large, and its configuration is rather arbitrary in places. Plugins have to be written specifically for Hexo, so there’s a balkanized ecosystem that doesn’t flourish as well as other parts do.

All these static publishing tools tend to have things in common. Builds have to happen as quickly as they can, and usually that is still a bit too slow. The author will want to preview their work in context, so serving up the rendered pages is important. Live rebuilds by file monitoring reduce friction in the workflow for some people, though I personally don’t care much for them, preferring to run a build when I’m ready.

It turns out that building derived things from a list of inputs with dependencies is a thing that computers have been told to do for a long time. Nearly all compiled software is built this way. We have tools like make(1) and a host of other, more complex and less general tools for various programming languages. I’ve always wondered why we didn’t use those to build sites as well. People have, it turns out, but make(1) in particular is a bit messier for the task than one would hope. There are other tools, and I settled on building with one called tup.

This weekend I built a small static publishing platform, and you can too. I wanted to build a site using Tufte CSS, and the minimalism of the presentation is a great fit for a super tiny static publishing platform.

A site like this needs to output:

  • Each post as an HTML file
  • An index page listing posts
  • Its CSS and any assets needed to render

This really isn’t a huge list.

First, let’s reach for a tool that can take a list of files and build all the derived things. make(1) is annoying here, because you have to tell it what to build, and it backtracks and figures out how to make it. We don’t actually have that information easily encoded, but we will have a list of sources, and can make a list of what to do with them. If you’re writing, you probably have a reason for it, right? Or an asset, it’s going to get used, why else would it be there? Starting at the source makes a lot more sense, and as it turns out, it makes incremental builds a lot faster. Enter our first player: tup.

$ brew cask install osxfuse
$ brew install tup

I’m not sure why tup now depends on FUSE, but that’s a task for another day.

Let’s start a directory for our project.

$ mkdir my-static-site
$ cd my-static-site
$ npm init
$ mkdir posts

Make a sample markdown file in the posts directory.

Next we create a Tupfile to describe how we’re going to build this site. Then we can just type tup to build the site, or tup monitor on Linux for that live building mode. First, let’s handle each post as HTML. We can use an off the shelf markdown renderer at first.

$ npm install marked

Here’s a Tupfile:

: foreach posts/*.md |> npx marked %f -o %o |> public/%B.html

This means that for each post in the posts directory, we’ll make an equivalent HTML file.
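To make the substitutions concrete: in a rule like this, tup fills in %f with the input path, %B with the input’s base name minus its extension, and %o with the resulting output path. Here is a rough imitation of that mapping in plain JavaScript. It is only an illustration; expandRule is a made-up helper and has nothing to do with how tup works internally.

```javascript
const path = require('path')

// Hypothetical helper: mimics how the rule maps one input file to one output file.
function expandRule(input) {
  const base = path.basename(input, path.extname(input)) // what the Tupfile calls %B
  const output = 'public/' + base + '.html'               // what the Tupfile calls %o
  return { input, output }
}

console.log(expandRule('posts/hello.md'))
// { input: 'posts/hello.md', output: 'public/hello.html' }
```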

Let’s take a look at some of these rendered files. We’ll need to serve this directory by HTTP if we want to see it as we will on the web.

$ tup
$ npx serve public/

We can now open the site preview at the URL it spits out (usually http://localhost:5000).

Just a directory full of HTML, and ‘full’ is just our one test post, but we should be able to navigate to one. We have a static site, if a lousy one! That HTML is pretty spartan, so let’s add some assets.

Copy the et-book directory of fonts from the Tufte CSS package into the root of the project, and the tufte.css file.

Let’s add a few rules to publish those as part of the site, too. Added to the Tupfile:

: foreach et-book/et-book-bold-line-figures/* |> cp %f %o |> public/%f
: foreach et-book/et-book-display-italic-old-style-figures/* |> cp %f %o |> public/%f
: foreach et-book/et-book-roman-line-figures/* |> cp %f %o |> public/%f
: foreach et-book/et-book-roman-old-style-figures/* |> cp %f %o |> public/%f
: foreach et-book/et-book-semi-bold-old-style-figures/* |> cp %f %o |> public/%f
: foreach *.css |> cp -r %f %o |> public/%b

Run tup again.

The assets got copied in. Now we have to actually put them in the HTML. That’s going to mean templates.

ejs is simple enough and behaves tidily and doesn’t have a lot of dependencies, so let’s use that for output templates.

$ npm install ejs

We’re going to have to create a script to render our markdown and template the file.

Let’s call this render.js:

const marked = require('marked')
const ejs = require('ejs')
const { promisify } = require('util')
const { readFile, writeFile } = require('fs')
const readFileAsync = promisify(readFile)
const writeFileAsync = promisify(writeFile)

// Arguments after the script name: layout, post template, markdown source, output path.
main.apply(null, process.argv.slice(2)).catch(err => {
  console.warn(err)
  process.exit(1)
})

async function main(layoutFile, templateFile, postFile, outputFile) {
  // Start all three reads at once, then await each one where it is needed.
  const layoutP = readFileAsync(layoutFile, 'utf-8')
  const templateP = readFileAsync(templateFile, 'utf-8')
  const contentP = readFileAsync(postFile, 'utf-8')

  const content = marked(await contentP)
  const layout = ejs.compile(await layoutP)
  const template = ejs.compile(await templateP)

  // Render the post body first, then wrap it in the page layout.
  const body = template({ content })
  const rendered = layout({ content: body })

  await writeFileAsync(outputFile, rendered)
}

It expects two templates: a layout (the skeleton and boilerplate of the page) and a template (the post template). Let’s create those now.

layout.ejs:

<!doctype html>
<html>
<head>
<meta charset='utf-8'>
<link rel='stylesheet' href='tufte.css'>
</head>

<body>
<%- content %>
</body>
</html>

and post.ejs:

<section>
<%- content %>
</section>
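The nesting is easier to see with the templating library taken out. In this sketch, plain functions stand in for the two compiled templates; postTemplate and layout are illustrative names here and not part of ejs.

```javascript
// Stand-ins for the compiled templates: each takes locals and returns a string.
const postTemplate = ({ content }) => `<section>\n${content}\n</section>`
const layout = ({ content }) =>
  `<!doctype html>\n<html>\n<body>\n${content}\n</body>\n</html>`

// Render inside-out: the post body first, then the page around it.
const body = postTemplate({ content: '<p>Hello</p>' })
const page = layout({ content: body })

console.log(page)
```

The same `content` name is used at both levels, which is why the layout never needs to know whether it is wrapping a post or something else.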

And in the Tupfile, let’s replace the marked render with our own. Additionally, let’s tell tup that the HTML depends on the templates, so if those change, we update all the HTML.

: foreach posts/*.md | layout.ejs post.ejs |> node render layout.ejs post.ejs %f %o |> public/%B.html

Let’s run tup again and see the output. Much prettier, right?

Now about that index! The index needs to know the post’s title, and really, posts don’t even have titles yet. Let’s add some to our test post as YAML front matter. Add this at the top of the markdown file.

---
title: My Post
date: 2017-12-04 01:51:43
---

Every post gets a title and the date.
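If you’re curious what a front matter parser is doing with that block, here is a deliberately simplified stand-in. It only understands flat `key: value` lines between `---` delimiters and leaves every value as a string, whereas a real YAML-based parser handles nesting and types such as dates. splitFrontMatter is a hypothetical name for this sketch.

```javascript
// Simplified stand-in for a front matter parser: flat "key: value" pairs only.
function splitFrontMatter(text) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(text)
  if (!match) return { attributes: {}, body: text }

  const attributes = {}
  for (const line of match[1].split(/\r?\n/)) {
    const colon = line.indexOf(':') // split on the first colon only
    if (colon === -1) continue
    attributes[line.slice(0, colon).trim()] = line.slice(colon + 1).trim()
  }
  return { attributes, body: match[2] }
}

const post = splitFrontMatter(
  '---\ntitle: My Post\ndate: 2017-12-04 01:51:43\n---\nHello *world*\n')

console.log(post.attributes) // { title: 'My Post', date: '2017-12-04 01:51:43' }
console.log(post.body)       // 'Hello *world*\n'
```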

Let’s change our renderer to put the title on the page so we don’t have to duplicate it.

Install front-matter

$ npm install front-matter

And update render.js

const marked = require('marked')
const ejs = require('ejs')
const { promisify } = require('util')
const { readFile, writeFile } = require('fs')
const readFileAsync = promisify(readFile)
const writeFileAsync = promisify(writeFile)
const frontMatter = require('front-matter')

main.apply(null, process.argv.slice(2)).catch(err => {
  console.warn(err)
  process.exit(1)
})

async function main(layoutFile, templateFile, postFile, outputFile) {
  const layoutP = readFileAsync(layoutFile, 'utf-8')
  const templateP = readFileAsync(templateFile, 'utf-8')
  const contentP = readFileAsync(postFile, 'utf-8')

  // Split the file into front matter attributes and the markdown body.
  const post = frontMatter(await contentP)
  const content = marked(post.body)
  const layout = ejs.compile(await layoutP)
  const template = ejs.compile(await templateP)

  // Both templates see the front matter attributes (title, date) as locals.
  const body = template(Object.assign({ }, post.attributes, { content }))
  const rendered = layout(Object.assign({ }, post.attributes, { content: body }))

  await writeFileAsync(outputFile, rendered)
}

And to post.ejs, the title.

<h1><%= title %></h1>

And in layout.ejs, let’s add a title tag too.

<title><%= title %> — My Blog</title>

Run tup again and let’s check our work.

Now a little harder part. Let’s make the index page.

We’ll need a script to generate it, index.js:

const ejs = require('ejs')
const fm = require('front-matter')
const path = require('path')
const { promisify } = require('util')
const { readFile, writeFile } = require('fs')
const readFileAsync = promisify(readFile)
const writeFileAsync = promisify(writeFile)

main.apply(null, process.argv.slice(2)).catch(err => {
  console.warn(err)
  process.exit(1)
})

async function main(outputFile, layoutFile, templateFile, ...metadataFiles) {
  const tP = readFileAsync(templateFile, 'utf-8')
  const lP = readFileAsync(layoutFile, 'utf-8')

  // Read every post's front matter and note the HTML file it will be rendered to.
  const metadata = await Promise.all(
    metadataFiles.map(f =>
      readFileAsync(f, 'utf-8')
        .then(fm)
        .then(e => Object.assign(e.attributes, {
          dest: path.basename(f).replace(/\.md$/, '.html')
        }))))

  // Newest posts first.
  metadata.sort((a, b) => {
    a = new Date(a.date)
    b = new Date(b.date)
    return a>b ? -1 : a<b ? 1 : 0
  })

  const layout = ejs.compile(await lP)
  const template = ejs.compile(await tP)

  const rendered = layout({
    title: 'Posts',
    content: template({ metadata })
  })

  await writeFileAsync(outputFile, rendered)
}
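The sort is the only part of that script with any real logic in it, so here it is in isolation. This sketch uses made-up post metadata and a shorter comparator that behaves the same way for valid dates: subtracting two Date objects compares their timestamps.

```javascript
// Hypothetical metadata, shaped like the front matter attributes collected above.
const posts = [
  { title: 'Older', date: '2017-11-01T09:00:00' },
  { title: 'Newest', date: '2017-12-04T01:51:43' },
  { title: 'Middle', date: '2017-11-20T18:30:00' }
]

// b - a is positive when b is newer, which sorts newer posts toward the front.
posts.sort((a, b) => new Date(b.date) - new Date(a.date))

console.log(posts.map(p => p.title)) // [ 'Newest', 'Middle', 'Older' ]
```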

And an index.ejs:

<h1>My Blog</h1>
<section>
<% metadata.forEach(entry => { %>
<p>
<a href='<%= entry.dest %>'><%= entry.title %></a>
</p>
<% }) %>
</section>

And in our Tupfile:

: layout.ejs index.ejs posts/*.md |> node index %o %f |> public/index.html

Run tup once more and we should have a bare-bones site.

Let’s add one more thing before we go, some dates to the posts.

To the template calls in both render.js and index.js, let’s add the require function, so that templates can require their own stuff. In index.js, where there’s template({ metadata }), let’s change that to template({ metadata, require }). In render.js, add require to the object passed to template() in the same way.
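Why does require need passing in at all? A compiled template is a function built from a string, so it doesn’t share the scope of the script that compiled it; by default EJS looks names up in the locals object. The sketch below imitates that with a tiny stand-in, compileExpression, which is hypothetical and far simpler than what EJS really does.

```javascript
// Stand-in for a compiled template: evaluates one expression against the locals.
// Functions made with `new Function` cannot see this module's own `require`.
function compileExpression(source) {
  return new Function('locals', `with (locals) { return (${source}) }`)
}

const render = compileExpression("require('path').basename(file, '.md')")

// Handing `require` over in the locals is what makes the call inside work.
console.log(render({ require, file: 'posts/hello.md' })) // 'hello'
```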

Then, let’s install fast-strftime.

$ npm install fast-strftime

An expanded index.ejs:

<% const strftime = require('fast-strftime') %>
<h1>My Blog</h1>
<section>
  <% metadata.forEach(entry => { %>
    <p>
      <a href='<%= entry.dest %>'><%= entry.title %></a> <%= entry.date ? strftime('%Y-%m-%d', entry.date) : '' %>
    </p>
  <% }) %>
</section>

And the page template, post.ejs:

<% const strftime = require('fast-strftime') %>

<h1><%= title %></h1>

<% if (date) { %>
<p>posted <%= strftime('%Y-%m-%d', date) %></p>
<% } %>
<section>
<%- content %>
</section>

Run tup once more, and you’ve got a static site, being generated by some simple code.
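If even one more dependency feels like too much for a single date format, the same YYYY-MM-DD string can come from built-ins. This is an alternative I’m sketching here, not what the templates above use, and formatDay is a made-up name. The catch is that toISOString() always reports UTC, so a post dated late in the evening local time may show the next day.

```javascript
// Formats a Date as YYYY-MM-DD using only built-ins. Always UTC.
function formatDay(date) {
  return date.toISOString().slice(0, 10)
}

console.log(formatDay(new Date(Date.UTC(2017, 11, 4, 1, 51, 43)))) // '2017-12-04'
```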

Why not Babel?

People always get really enthusiastic about Babel.

I get it. Using all of ES6 plus whatever stuff you want to throw at it is cool.

However, consider this:

:; npm i string-tokenize
+ string-tokenize@0.0.6
added 61 packages in 5.212s

:; npm ls
t@1.0.0 /Users/aredridel/Projects/t
└─┬ string-tokenize@0.0.6
  ├─┬ babel-plugin-transform-object-rest-spread@6.26.0
  │ ├── babel-plugin-syntax-object-rest-spread@6.13.0
  │ └─┬ babel-runtime@6.26.0
  │   ├── core-js@2.5.1 deduped
  │   └── regenerator-runtime@0.11.0
  ├─┬ babel-polyfill@6.26.0
  │ ├── babel-runtime@6.26.0 deduped
  │ ├── core-js@2.5.1
  │ └── regenerator-runtime@0.10.5
  ├─┬ babel-register@6.26.0
  │ ├─┬ babel-core@6.26.0
  │ │ ├─┬ babel-code-frame@6.26.0
  │ │ │ ├─┬ chalk@1.1.3
  │ │ │ │ ├── ansi-styles@2.2.1
  │ │ │ │ ├── escape-string-regexp@1.0.5
  │ │ │ │ ├─┬ has-ansi@2.0.0
  │ │ │ │ │ └── ansi-regex@2.1.1
  │ │ │ │ ├─┬ strip-ansi@3.0.1
  │ │ │ │ │ └── ansi-regex@2.1.1 deduped
  │ │ │ │ └── supports-color@2.0.0
  │ │ │ ├── esutils@2.0.2
  │ │ │ └── js-tokens@3.0.2
  │ │ ├─┬ babel-generator@6.26.0
  │ │ │ ├── babel-messages@6.23.0 deduped
  │ │ │ ├── babel-runtime@6.26.0 deduped
  │ │ │ ├── babel-types@6.26.0 deduped
  │ │ │ ├─┬ detect-indent@4.0.0
  │ │ │ │ └─┬ repeating@2.0.1
  │ │ │ │   └─┬ is-finite@1.0.2
  │ │ │ │     └── number-is-nan@1.0.1
  │ │ │ ├── jsesc@1.3.0
  │ │ │ ├── lodash@4.17.4 deduped
  │ │ │ ├── source-map@0.5.7 deduped
  │ │ │ └── trim-right@1.0.1
  │ │ ├─┬ babel-helpers@6.24.1
  │ │ │ ├── babel-runtime@6.26.0 deduped
  │ │ │ └── babel-template@6.26.0 deduped
  │ │ ├─┬ babel-messages@6.23.0
  │ │ │ └── babel-runtime@6.26.0 deduped
  │ │ ├── babel-register@6.26.0 deduped
  │ │ ├── babel-runtime@6.26.0 deduped
  │ │ ├─┬ babel-template@6.26.0
  │ │ │ ├── babel-runtime@6.26.0 deduped
  │ │ │ ├── babel-traverse@6.26.0 deduped
  │ │ │ ├── babel-types@6.26.0 deduped
  │ │ │ ├── babylon@6.18.0 deduped
  │ │ │ └── lodash@4.17.4 deduped
  │ │ ├─┬ babel-traverse@6.26.0
  │ │ │ ├── babel-code-frame@6.26.0 deduped
  │ │ │ ├── babel-messages@6.23.0 deduped
  │ │ │ ├── babel-runtime@6.26.0 deduped
  │ │ │ ├── babel-types@6.26.0 deduped
  │ │ │ ├── babylon@6.18.0 deduped
  │ │ │ ├── debug@2.6.9 deduped
  │ │ │ ├── globals@9.18.0
  │ │ │ ├─┬ invariant@2.2.2
  │ │ │ │ └─┬ loose-envify@1.3.1
  │ │ │ │   └── js-tokens@3.0.2 deduped
  │ │ │ └── lodash@4.17.4 deduped
  │ │ ├─┬ babel-types@6.26.0
  │ │ │ ├── babel-runtime@6.26.0 deduped
  │ │ │ ├── esutils@2.0.2 deduped
  │ │ │ ├── lodash@4.17.4 deduped
  │ │ │ └── to-fast-properties@1.0.3
  │ │ ├── babylon@6.18.0
  │ │ ├── convert-source-map@1.5.1
  │ │ ├─┬ debug@2.6.9
  │ │ │ └── ms@2.0.0
  │ │ ├── json5@0.5.1
  │ │ ├── lodash@4.17.4 deduped
  │ │ ├─┬ minimatch@3.0.4
  │ │ │ └─┬ brace-expansion@1.1.8
  │ │ │   ├── balanced-match@1.0.0
  │ │ │   └── concat-map@0.0.1
  │ │ ├── path-is-absolute@1.0.1
  │ │ ├── private@0.1.8
  │ │ ├── slash@1.0.0
  │ │ └── source-map@0.5.7 deduped
  │ ├── babel-runtime@6.26.0 deduped
  │ ├── core-js@2.5.1 deduped
  │ ├─┬ home-or-tmp@2.0.0
  │ │ ├── os-homedir@1.0.2
  │ │ └── os-tmpdir@1.0.2
  │ ├── lodash@4.17.4
  │ ├─┬ mkdirp@0.5.1
  │ │ └── minimist@0.0.8
  │ └── source-map-support@0.4.18 deduped
  ├─┬ chai@3.5.0
  │ ├── assertion-error@1.0.2
  │ ├─┬ deep-eql@0.1.3
  │ │ └── type-detect@0.1.1
  │ └── type-detect@1.0.0
  └─┬ source-map-support@0.4.18
    └── source-map@0.5.7

:; npm rm string-tokenize
removed 61 packages in 1.356s

:; npm i @aredridel/string-tokenize
+ @aredridel/string-tokenize@1.0.0
added 1 package in 1.317s

:; npm ls
t@1.0.0 /Users/aredridel/Projects/t
└── @aredridel/string-tokenize@1.0.0

This is roughly the same code. I ported it away from ES6 modules, used node’s core assert instead of chai (the tests only used functionality that assert already has!), and removed the Flow type annotations. It works in node 8 easily, and should work in node 4.

I work in constrained environments: page load time is very important to me. If I’m loading even a fraction of this in a browser, I’ve blown my budget. I run a bunch of hobby projects on a very inexpensive server. RAM is at a premium. All of these things have costs.

A small failure, tracing complex causes, and the ethics of software design

Today I had an interview; not the super intense kind, but grab coffee with a recruiter, chat about goals and desires and see if companies she represents are a match for my skillset.

I missed the appointment. There are myriad reasons, including my being bad with dates and time in general, but I’ve got a system that usually works for me. I delegate carefully to my computer and set every appointment to vibrate my phone. However, this particular event was vexed from the beginning by several failures, each individually insufficient to make me miss the appointment, but together they did the job nicely.

We chatted briefly the other day to set up the appointment. I put it in my calendar, and she sent me a calendar invite via Google Calendar to my personal email address (which is not a Gmail account, though it is the email I sign in to Google with). Failure number one: There’s no way not to have a Google calendar, so Google auto-added it to my calendar there, and I strongly suspect events are considered somewhat confirmed (at least receipt of invite shown) at this point.

Her invite had more information (like location) than my hand-entered entry, so I opened the .ics file that Google emails to my address when someone sends me a Google calendar invite. I use Apple’s Calendar app, since it’s got a much faster user interface than Google calendar, and syncs with iCloud quite nicely. It’s a tiny bit more in my control than Google is. When I opened the .ics file, it added it to my calendar on the screen, and I deleted my copy of the event.

Failure number two: events added from an .ics file sent by Google can’t be edited, and that includes setting up a notification.

Failure the third: immediately after, I get a message from the Calendar app that it couldn’t sync the event, error “403” (HTTP for “Forbidden”, which in this case tells me about as much as the word “potato”). Apple has chosen a protocol called CalDAV for its calendars, and has not put effort into making sure all the error messages are meaningful. It then presents me with three opaque options: “Retry”, “Ignore” and “Revert to Server”. The first fails with 403 again. The second will leave the entry on my computer, but not sync it to iCloud, and I only know this from a little experimentation and knowledge of how these systems work under the hood. The third removes the entry from my calendar. Failure the fourth: none of these options are useful. I eventually ignore the error and set about making it work right.

I copy and paste the event to another calendar in the Calendar app. This time it works, and I copy it back to the correct calendar, the one I have set up to sync to iCloud and my phone. It works. Or so it seems. I move on with my day. I have an event in Calendar, that hasn’t given me a sync error, that has a notification, and the time, date and location of my meeting. It does, however, try to send an invite to the recruiter who invited me, making a second meeting at the exact same time and place. I decline to do so. Points to Apple for giving me the option.

This morning I wake up, glance at my phone’s calendar, see I have no events until afternoon, and sleep late. I miss my appointment.

Failure the fifth: It turns out that the appointment didn’t sync to the phone. I had checked the original, hand-entered appointment, since I’m insecure about calendars, but that one got deleted way up at the start of this fiasco. The app I use to synchronize an Android phone with an iCloud calendar, while a little ugly in the user interface, is a normally robust piece of software that had not betrayed me until today. There was no error, so I don’t know whether this event didn’t sync fully in some way, or whether the sync program is broken even though it shows my event later in the day. The event does show on my husband’s phone, which subscribes directly via iCloud since it’s an Apple device. It made it to Apple’s servers.

Failure the sixth: My computer froze last night, and so, it also did not show any hint that I might have an event today.

All in all, I missed a relatively trivial event. However, if this had been a later interview, it might well have cost me a job. This is where the ethics of software design come in. These are all failures of engineering, and some of them quite foreseeable. Software must plan to have bugs, and to fail gracefully. The failure case here was silent, and may well be costly to users who experience it. However, at the end of the day, there is no accountability: aside from the chance they read this blog post, engineers at Apple and Google will never know about this failure. I have no options for managing this data that do not involve third parties, short of hand-entering calendar entries into multiple devices.

There were also a number of preventable failures, mostly in the design of these pieces of software.

  • Why can I not have a Google Calendar, and interact with Google Calendar users entirely by email? They seem awfully certain I’ve received invites when I have not, though in this case that part worked out.
  • Apple’s engineers did not account for getting error messages to humans, and so we end up with opaque, low-level errors like “403” with no meaning and no way to correct whatever condition caused them. We just guess at what might be wrong and try to act accordingly. I may well have guessed wrong.
  • Apple’s calendar program is not designed as a distributed system. It assumes networks are reliable, bugs do not exist, and that errors are transient. The reality is that none of these things are true. Its design does not expose details of what it’s doing, does not expose the state of sync clearly, and does not let you inspect what’s going on. It sweeps its design flaws under a very pleasant user interface rug.
  • Google’s dominance of the industry has left users with few working alternatives, and its products do the bare minimum to interoperate, if at all, and usually only when Google owns the server portion. Their calendar application on my phone does not speak the standard protocols used by Apple.
  • Apple’s extensions to CalDAV with push notifications for added events are also private, and third-party applications cannot use those features.
  • None of these applications center the user’s agency and let them make a fallback plan when these services fail, and these services do fail, often silently.

My needs are modest: enter events in calendar on whichever device I’m using, particularly the ones with good keyboards. Have my phone tell me where I need to be.

Modes of analysis that surface these kinds of design and user experience issues are central to designing good applications. It’s highly technical work, requiring the expertise of engineers and designers, especially as evaluating potential solutions to these design problems is part of the task.

Centering ethics in the design would have changed the approach most of these engineers took in the design of these applications. Error messages would have been a focus. A mode for working when the network is down or server is misbehaving may have been created. A trail of accountability to diagnose the failure would have been built. Buzzwords like ‘user agency’ aren’t just words in UX design textbooks (though they should be), but the core of the reason software exists. Engineering that centers its users, analyzes their needs, and evaluates the ways potential solutions fail and solves those problems is what engineering should be.

My apologies to the recruiter I stood up today. I hope you enjoyed a latte without me, and I’ll talk to you Monday.

Radical Modularity

Here’s a question: What if everything were a module?

This post is derived from a talk I gave at Web Rebels 2016.

What is a module?

I’m actually going to spend some time on this one because while it’s an everyday word in our industry, it’s one we don’t often hear defined. I want you to think about what it means to make software modular.

A module is a bit of software that has an interface defined between it and the rest of the system.

This is one of the simplest definitions I could come up with. There are some implications here: There’s a separation between the module and the rest of the system. I’m not saying how far, but it’s actually a separate entity. I will get into what “interface” can mean later in this post. The bit I think is really interesting is the word defined. This means we’ve made decisions in making a module. Extracting something blindly into a separate file probably only counts on a technicality. Defining something is intentional.

I’m a programmer working for a package manager company, I think of code as art, and I’ve been making open source my entire professional life, so I’ve got a particular bunch of things I also mean when I say module.

A piece of software with a defined interface and a name that can be shared with or exposed to others.

I won’t advocate sharing everything, since I’m talking about radical modularity and not radical transparency here, but I want the option. The rest, though, are where things get interesting. In particular, I want to talk about names.

When we name something, it takes on a life of its own. It’s now an object in its own right. This happens when we name a file, it happens when we name a package. A name is the handle we can grab onto something with mentally and start treating it independently.

A defined interface is the first step of independence. It’s the boundary that gives a thing a separate internal life and external life. Things outside a module get a relationship with the boundary, and inside the module, anything not exposed by the boundary can be re-arranged and edited without changing those relationships.
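In code, that boundary can be as small as the object a function returns. This is only a toy sketch with made-up names (createTally, counts), but it shows the two lives: callers have a relationship with add and count, while the storage behind them can be rearranged freely.

```javascript
// The returned object is the defined interface; `counts` is internal.
function createTally() {
  const counts = new Map() // could become a plain object later without breaking callers

  return {
    add(name) {
      counts.set(name, (counts.get(name) || 0) + 1)
    },
    count(name) {
      return counts.get(name) || 0
    }
  }
}

const tally = createTally()
tally.add('module')
tally.add('module')

console.log(tally.count('module')) // 2
console.log(tally.counts)          // undefined: internals are not part of the interface
```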

I named her. The power of a name. That’s old magic. —Tenth Doctor, “The Shakespeare Code”

Not every module even gets published or becomes a package on a package registry like npm or crates. We usually push things to GitHub early, but source control isn’t quite the same thing as publishing things for others to use. Just separating things into a separate file — there’s the naming — and choosing what constitutes the interface to the rest of the system is modularizing.

We can commit to names more firmly by publishing and giving version numbers, and breathe life into something as a fully separate entity, but that’s not required, and that alone isn’t often enough to make a whole project.

Self-sustaining open source projects have to be bigger than tiny modules, and so you can either enlarge modules until they become self-sustaining, or your project is a group of related modules, like Hoodie, where there are a bunch of small, named parts.

There’s another option, which is to make modules so small they trend toward finished, done, names for a now-static thing. Maybe they are bestowed upon someone who tidies them up, finishes a few pieces that we left ragged, maybe just left in a little library box for someone else to discover. Maybe they’re published, maybe they’re widely used and loved, maybe not. Maybe they end up in a scrap heap for our later selves or others to build something new from.

Art does not reproduce what is visible; it makes things visible. —Paul Klee

Something the open source movement did that isn’t all that widely acknowledged is make a huge ecosystem for the performance of software as a social art. Since then, the explosion of social openness in the creation of software has created a new, orthogonal movement, concerned less with copyright law and open engineering than with the open sharing of knowledge and techniques. As a side effect of that and the rise of the app, software engineering now includes the practice of software as art and craft.

I practice code as an art.

A good portion of that is making concepts visible, and ultimately that often means giving them names. With art, though, there’s some tension with engineering: sometimes we do things to show instability, to test a limit, or to reveal the tensions within our culture or the systems we build. We can create a module of code only to abandon it once it’s served its cultural purpose, be it connecting two things together, mashup style, or just moving on because there’s a better way to do things.

One of the interesting differences between software artifacts as a medium of art and other fine arts is that, despite our working in a very definite (though abstract) medium, much of what we make is never finished. It exists in our culture (yes, software creation is a reflection of our culture, and a culture of its own), and as web workers, especially working as artists, a lot of what we create straddles the lines of engineering, fine art, performance art, and craft.

Sometimes, too, a destructive act can be an artistic act: in the unpublishing of left-pad, Azer Koçulu revealed that some of us have been choosing dependencies with little thought, and revealed just how interdependent we are with each other when we work in the open and rely on each other’s software.

So it goes that even the biggest pieces of software are made up of smaller parts. It’s the natural way to approach problems and make them solvable. A large module is nothing more than a collective name for little modules that may not have their full names and final forms.

I really like small modules as a norm, because I think of things in terms of named objects. I’m happy to abandon a thing I think no longer suits, and it’s easier to abandon a small module than a big one. They approach done, so I’m happy to use a three year old tiny module, but a big project that’s three years unmaintained is likely to be bug-ridden and poorly integrated.

Back to the thesis here:

What if we make everything a module?

What happens when we break off pieces and name them well? What happens when we do and when we can, publish them, share those names and let others wield that power over them? What does this do to our culture as programmers?

Practical approach to building modularity

I talk about npm a lot, but this can be extended: open your mind and projects and think about making interfaces around new things.

It’s quite possible to take the module system that node uses and extend it to new contexts, and as we’ve seen with projects like browserify, it’s possible to keep the same abstract interface but package things up in new ways for contexts they were not designed for.

Modularizing CSS

When I started at npm we had a monolith of old CSS built up, the kind most web projects accrete: a lot of styles we couldn’t be sure were unused, and a lot of pieces and parts that depended on each other. Since then, and with a huge shout-out to Nicole Sullivan and her huge body of work on this (in particular, go watch her talk or read “Our best practices are killing us”), we’ve started tearing apart the site and rebuilding it with, get this, modules of CSS, with defined interfaces between them and the rest of the system.

They all have names (package names) and versions. So we’ll have a module like @npmcorp/pui-css-tables (PUI is because we forked this system from a component system used at Pivotal Labs).

In this case we’re using a tool called dr-frankenstyle. It’s pretty simple. It looks at all the node modules installed in our web project and then concatenates any CSS they export with a style property in the package metadata, in dependency order.
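The idea of “concatenate in dependency order” fits in a few lines. This is a sketch of the technique under my own assumptions, not dr-frankenstyle’s actual code: the package metadata is invented and held in memory, and stylesInDependencyOrder is a hypothetical name. A real tool would read each package.json from node_modules.

```javascript
// Hypothetical package metadata, keyed by name, shaped like a trimmed package.json.
const packages = {
  'app-css': { style: 'app.css', dependencies: { 'tables-css': '*', 'base-css': '*', 'no-styles': '*' } },
  'tables-css': { style: 'tables.css', dependencies: { 'base-css': '*' } },
  'base-css': { style: 'base.css', dependencies: {} },
  'no-styles': { dependencies: {} } // no `style` property, so it contributes nothing
}

// Depth-first walk: a package's dependencies are emitted before the package itself.
function stylesInDependencyOrder(pkgs, root) {
  const seen = new Set()
  const ordered = []

  function visit(name) {
    if (seen.has(name)) return
    seen.add(name)
    const pkg = pkgs[name]
    if (!pkg) return
    Object.keys(pkg.dependencies || {}).forEach(visit)
    if (pkg.style) ordered.push(`${name}/${pkg.style}`)
  }

  visit(root)
  return ordered
}

console.log(stylesInDependencyOrder(packages, 'app-css'))
// [ 'base-css/base.css', 'tables-css/tables.css', 'app-css/app.css' ]
```

Because base styles always land before the things that build on them, the cascade behaves the way each package author expected.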

This means our CSS actually has dependencies annotated into it, in the package metadata, and it’s in pieces and parts. Because of this, and because it’s named, we can start grappling with these things individually, and start making sense of what otherwise becomes a huge mess.

There’s another project called atomify-css that can do similar things, and both of these systems will do one set of fixups as they build a final CSS stylesheet: they identify assets that those stylesheets refer to, and copy those over and adjust the path to work in the new context. Atomify in particular has modules for several languages that bring this style of name resolution.
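That path fixup is worth seeing in miniature. The sketch below rewrites relative url(...) references so they point into a per-package asset directory; it is my own illustration of the technique rather than code from either tool, and rebaseUrls and the `assets/type-css` directory are made-up names.

```javascript
// Rewrites relative url(...) references to point into `assetDir`.
// `assetDir` is a hypothetical output location for one package's copied assets.
function rebaseUrls(css, assetDir) {
  return css.replace(/url\(\s*(['"]?)([^'")]+)\1\s*\)/g, (whole, quote, ref) => {
    // Leave absolute URLs, root-relative paths and data URIs alone.
    if (/^(?:[a-z][a-z0-9+.-]*:|\/)/i.test(ref)) return whole
    const cleaned = ref.replace(/^\.\//, '')
    return `url(${quote}${assetDir}/${cleaned}${quote})`
  })
}

const input = [
  "@font-face { src: url('./fonts/et-book.woff'); }",
  '.logo { background: url(https://example.com/logo.png); }'
].join('\n')

console.log(rebaseUrls(input, 'assets/type-css'))
```

A real build step would also copy the referenced files into that directory; this only shows the rewriting half.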

This turns out to be super powerful, because now it leads us into wanting to modularize and make explicit the dependencies between all the things.

Now, CSS has some pitfalls: browsers still see all of a page’s CSS as a single namespace, a single heap of rules to apply. This isn’t a clean interface, so modularizing everything doesn’t automatically solve all your problems. It can give us some new tools though.

Modularizing SVG

<svg>
<use xlink:href="./wonky.svg#camera-lens"/>
</svg>

What happens if we put SVG files into packages? What’s the interface to an SVG? The text? Parsed XML? Just the file name?

<svg>
<use xlink:href="@stoppard/hound/props/chocolates.svg#chateu-neuf-de-pape"/>
<use xlink:href="./wonky.svg#camera-lens"/>
</svg>

We’ve got dependencies. SVG files can load other SVG files. xlink attributes could be followed, and postprocessing tools could inline those, making production-ready and browser-ready SVGs from more modular ones.
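A first step for any such tool is just finding those references. Here is a small sketch that pulls them out of SVG source and splits each into a file part and a fragment. It uses a regular expression to stay short, where a real tool would want a proper XML parser, and findUseReferences is a hypothetical name.

```javascript
// Collects <use> references from SVG source and splits each into file and fragment.
function findUseReferences(svg) {
  const refs = []
  const pattern = /<use\b[^>]*?\bxlink:href="([^"#]*)#?([^"]*)"/g
  let match
  while ((match = pattern.exec(svg)) !== null) {
    refs.push({ file: match[1], fragment: match[2] })
  }
  return refs
}

const svg = `<svg>
<use xlink:href="@stoppard/hound/props/chocolates.svg#chateu-neuf-de-pape"/>
<use xlink:href="./wonky.svg#camera-lens"/>
</svg>`

console.log(findUseReferences(svg))
```

The file part is what a resolver would look up, either relative to the current file or inside a package, and the fragment picks the element within it.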

Now that we have HTML5 support in browsers, we can embed SVG directly into HTML, too.

That brings me to…

Modularizing Templates & Helpers

We don’t just build raw HTML anymore, but we use templating systems to break those apart. What if we published those as packages for others to consume, what if we made them modules?

In the process of reworking npm’s website, we had components of CSS whose interface is the HTML required to invoke them: A media block has a left or right bit of image, and then some text alongside. A more sophisticated component might string several of those together, add a heading banner and some icons. The HTML required was getting complex and fragile, so that any change to the component would require the consumer to update the HTML to match. Icons were inlined, so changing an icon would mean editing a large blob of SVG.

In semantic versioning terms, every change became a major. While integers are free, time to check what’s needed to update isn’t, so this wasn’t going to be a scalable approach.

<div class="a-component"> 
<div class="media-block">
<div class="media-left">
<svg> ... icon here </svg> ...
</div>
</div>
</div>

becomes

{{fromPackage "@a-team/pity-the-media" .}}

We started moving the handlebars templates we use that have the HTML to invoke a component on our website into the modules. This moved the interface boundary into something more stable. Now we can change what that component needs and does as the design evolves without having to go propagate those changes to an unknown number of callers.

You’ll remember I mentioned SVG icons. It turns out that inlining small icons is one of the most efficient ways to use them, but it doesn’t scale very well in the development process. The alternatives, icon fonts, require a lot of infrastructure and are brittle enough that it stifles the act of moving things into a module. Icons have to be in large groups with that approach, and that trends toward very large modules, and probably to less efficient ways to do things.

What I ended up doing was making a small handlebars helper like the fromPackage helper I just showed the call to, and made a couple helpers for loading SVG from packages. Called from our handlebars templates, a single helper invocation can load and parse SVG from a package, do simple optimizations, and cache the result, and inline it. SVGs, too, then, became modules we can publish separately or in small groups.
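A sketch of the shape such a helper might take, not the helper npm actually shipped. `makeSvgHelper` and `svgFromPackage` are hypothetical names; the lookup function is passed in, and the "simple optimizations" here are assumed to be nothing more than stripping the XML prolog and comments.

```javascript
const fs = require('fs');

// Build a helper that loads an SVG by package-style name, does a little
// cleanup, and caches the result so each file is read only once.
// `resolveFile` maps a name like "@scope/pkg/icon.svg" to a path on disk.
function makeSvgHelper(resolveFile, readFile = (p) => fs.readFileSync(p, 'utf8')) {
  const cache = new Map();
  return function svgFromPackage(specifier) {
    if (cache.has(specifier)) return cache.get(specifier);
    const raw = readFile(resolveFile(specifier));
    const cleaned = raw
      .replace(/<\?xml[^>]*\?>/g, '') // drop the XML prolog
      .replace(/<!--[\s\S]*?-->/g, '') // drop comments
      .trim();
    cache.set(specifier, cleaned);
    return cleaned;
  };
}
```

With Handlebars, a function like this could be registered through `Handlebars.registerHelper` and its result wrapped in `Handlebars.SafeString` so the markup is not escaped.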

A bit of an aside:

React Changed Everything

There is a reason that React talks have been so popular for the last couple years. It really did make a radical shift in how we design frameworks, and more importantly to me, it helped give components better interfaces. Stateless components have well defined inputs and outputs. Side effects are reduced or eliminated. This means modules can more easily declare their dependencies and give simple interfaces.

This also means that React components fit into packages really neatly, and automatically give an interface that’s like a function call. If you’re a React programmer, you’ll probably recognize my fromPackage helper as very similar to node’s require, which is how most of us use React these days, as webpacked or browserified modules.

What can we steal from React?

That modularity and clear boundary on interfaces changed so much. Let’s re-think how we integrate things to have interfaces that simple and clean. There’s been a lot of experiments, too, on having react components automatically namespace CSS they require, and then emitting HTML that uses the namespaced version. By moving the module boundary from the raw CSS to something that gets called, an active process, CSS namespacing woes can be solved by separating what the humans type from what the browser interprets a little bit.

How radically can we change the complexity of an API by changing what kind of thing we export?

What else can be a module?

At PayPal, I did work to make translations be separately loadable things, which leads rapidly into separating those pieces into entirely separate packages with their own maintenance lifecycle. When you have a separate team working on something, having a clean boundary can be a great way to let work progress at a more independent pace. What else can we modularize?

That last one is really interesting. Kyle Mitchell is a lawyer who uses a lot of software in his work to draft legal texts. In so doing, he’s published a lot of tiny modules of interesting stuff. Mostly they’re JSON files, cleanly licensed and versioned, or small tools for assembling legal language out of smaller pieces, re-using tested and tried phrasings of things. Sounds familiar, right?

Text itself can be a module with an interface, even if that interface is concatenation of a bit of text.

We can even make modules that are nothing but known good configurations of other modules, combined and tested.

Making Modularity

Hands on!

A lot of this is going to be specific to node, but I like node not just because it’s JavaScript and I think JavaScript is a lot of fun, but because its dependency model is actually pretty unique. That’s actually a lot of what drove me to node in the first place.

The underappreciated feature of node modules is that they nest (this really bugs Windows users, since their tools can’t deal with deep paths). This means that a module can get a known version of something, defined entirely on its own terms, so what a package depends on is either a less important part of a module’s interface or not part of it at all. We spend a lot less time wrangling which versions of things all have to be available at once, and we can start putting dependencies behind the boundary that a module defines.

Most of what I build is built on @substack’s resolve module

resolve.sync('@mad-science/luncheon-meats/baloney.svg')

give me the file baloney.svg from the @mad-science/luncheon-meats package

This is a really simple module that implements node’s module lookup strategy for files that aren’t JavaScript. You can say “give me the file baloney.svg from the @mad-science/luncheon-meats package”, and it will find it, no matter where it got installed into the tree (remember, node modules let you share implementations if two things require compatible versions of a module). We name the file, and in this case the actual interface of this hypothetical module is just to read the file once you figure out where it is.

That’s our primitive building block. I like this one because it matches how everything else in the runtime I use most works.
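To make the lookup strategy concrete, here is a much-simplified sketch of it using only node’s fs and path modules. It is not the resolve module’s implementation: `resolveFileSync` is a hypothetical name, and it skips everything about main fields and file extensions, because we are naming an exact file.

```javascript
const fs = require('fs');
const path = require('path');

// Walk up from `fromDir`, checking the node_modules directory at each level
// for the requested file, and return the first match.
function resolveFileSync(specifier, fromDir) {
  let dir = path.resolve(fromDir);
  while (true) {
    const candidate = path.join(dir, 'node_modules', specifier);
    if (fs.existsSync(candidate)) return candidate;
    const parent = path.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  throw new Error("Cannot find '" + specifier + "' from " + fromDir);
}
```

Because the search starts at the caller’s directory and moves outward, a nested copy of a package wins over one installed higher in the tree, which is what lets each module see its own version of a dependency.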

There’s another thing that’s common to do: Add a new field to package.json. Dr. Frankenstyle uses the property style rather than main to say which file is the entry point to the package. This means that modules can do dual-duty: grouping different aspects of a thing together into a single component, rather than making the caller assemble the pieces when the pieces all go together anyway.
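A tiny sketch of reading such a field from parsed package metadata. `entryPoint` is a hypothetical helper, not part of any of the tools mentioned; the only behavior borrowed from node itself is that a missing main defaults to index.js.

```javascript
// Pick the entry file for one aspect of a package ("main" for JavaScript,
// "style" for CSS) from its parsed package.json.
function entryPoint(pkg, field) {
  if (typeof pkg[field] === 'string') return pkg[field];
  if (field === 'main') return 'index.js'; // node's default when main is absent
  return null; // the package does not export this aspect
}
```

A package whose metadata has both `"main": "index.js"` and `"style": "tables.css"` then answers both questions, and a package with no style field simply contributes no CSS.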

One of the things I ask when building interfaces in general, and module systems in particular is “how many guarantees can we make?”

  • Dependencies isolated
  • Deduplication
  • Local paths are relative to the file
  • Single entry point

One of the guarantees I love most is that local paths in node modules are relative to the file. This is one of the ways that make it possible to break things into modules without breaking down the interface they had as a monolithic unit. It really makes me sad that most templating languages don’t maintain filenames deep enough into their internals to implement this. It’s good for source mapping and it’s good for modularity.

A lot of people fought this – they keep fighting this in node development, but I think it exposes how people think about modularity: This is a symptom of making the whole project a single module and giving the components just enough name to navigate but not enough that they can live on their own.

I keep building similar models. I often make path rewrites for resources, so that references relative to the file where a thing is actually stored still work when it’s loaded into the new context where it’s used. Sometimes that’s inlining. Sometimes that’s copying modules into a destination and making sure their assets come along for the ride.
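One way such a rewrite can look for CSS, as a sketch under assumptions: `rebaseUrls` is a hypothetical function, it only understands `url(...)` references, and it treats all paths as forward-slash paths.

```javascript
const path = require('path');

// Rewrite relative url(...) references in CSS that lived at `fromFile` so
// they still point at the same assets when the CSS is emitted at `toFile`.
function rebaseUrls(css, fromFile, toFile) {
  const fromDir = path.posix.dirname(fromFile);
  const toDir = path.posix.dirname(toFile);
  return css.replace(/url\((['"]?)([^'")]+)\1\)/g, (whole, quote, ref) => {
    // Leave absolute paths, data: URIs, and full URLs alone.
    if (/^(\/|[a-z][a-z0-9+.-]*:)/i.test(ref)) return whole;
    const target = path.posix.join(fromDir, ref);
    const rebased = path.posix.relative(toDir, target);
    return 'url(' + quote + rebased + quote + ')';
  });
}
```

The same idea applies whether the asset stays where it is and the reference changes, or the asset is copied next to the output and the reference is pointed at the copy.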

This is replicating the guarantees of node’s module system, because they give me some flexibility and durability in what I make. If things have their own namespaces, their own dependencies, then I can break them less often or not at all.

Going Meta

If everything’s a module, what can we do with that?

Have we simplified things enough to start giving our programs the vocabulary to start extending themselves? Can we start talking about constructing programs out of larger building blocks, even if they’re sometimes special purpose?

Can modules or remotely loaded packages be first-class objects in our programs?

What about generating new modules in the course of using our programs, and letting our users share them?

What other kinds of interfaces can we give? Web services? Data sets with guarantees about how they’ll develop in the future?

How radically can we simplify the interface of something?

One of the most influential concepts in my career was that phrase a lot of us have heard about UNIX: everything is a file. Now, that’s a damn lie. There’s a lot of things in unix that aren’t files at all. IPC. System calls. Locks. Lots of things can be file descriptors, like sockets, but if you want to see more things shoehorned into that interface, you could go install Plan 9, but there’s not very much software out there for Plan 9.

Even so, UNIX took off in a huge way thanks to a bunch of factors, and even Plan 9 and Inferno and systems derived from it have this really outsized longevity in our minds because of one thing: They simplified their interfaces. Radically.

They defined their interfaces so simply that you can sum them up in a few words.

“Text file. Delimited by colons.”

“Line delimited log entries.”

“Just a plain file, a sequence of bytes”

These are super durable primitives. They had all their edge cases shaved off. No record sizing to write a file, no special knowledge of what bytes were allowed or not. Very few things imbued with special meaning.

This means these systems last because they give us building blocks to build better things out of.

I love to pick on unix because for all its ancient cruft there’s an elegant system inside. It’s not the only super simple interface that really took off either.

Chris Neukirchen made the rack library for ruby, and little did they know but it suddenly got adopted by all the frameworks and all the servers because at the core of it, a web request got simplified down to a single function call: environment in, response code, headers, and body out. It’s adapted from the Python WSGI but it was a great distillation of the concepts.

node modules also have this ridiculously simple interface. They get wrapped in a function with five parameters, and are provided a place to put their exports and a require function for their own use. It turned out to be a great thing to build even more complicated module systems out of.
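A sketch of that wrapping, using node’s vm module. This illustrates the idea and is not node’s internal loader; `runAsModule` is a hypothetical name, and the `require` function handed to the module is whatever the caller supplies.

```javascript
const vm = require('vm');
const path = require('path');

// Run `source` the way node runs a module: wrapped in a function that is
// handed exports, require, module, __filename and __dirname.
function runAsModule(source, filename, requireFn) {
  const wrapped =
    '(function (exports, require, module, __filename, __dirname) {' +
    source +
    '\n})';
  const fn = vm.runInThisContext(wrapped, { filename });
  const mod = { exports: {} };
  fn.call(mod.exports, mod.exports, requireFn, mod, filename, path.dirname(filename));
  return mod.exports;
}
```

Everything a module sees of the outside world arrives through those five parameters, which is why swapping in a different `require` is enough to build a different module system on top.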

node streams, too. By making them pretty generic, it turns out that thousands and thousands of packages all use the shared interface and all work together.

It’s really worthwhile asking yourself if there’s a radically simplified, generic interface that your module is begging for.

Thank you.

git commit messages

My current thoughts on commit messages.

First, we had change annotations, as descriptions of what changed:

fixed bug in display code

or

improved caching behavior for edge case

My first objection to these is that commits are not always past tense. In a world of CVS and Subversion, they are: reworking and recommitting things is far too much work, but this is git. They are not just a record of what we did, but they are actual objects that we are going to talk about, they are proposals and often they are speculative. git is an editor.

It doesn’t feel particularly natural to be more descriptive here because we’re basically adding labels to a timeline. If we do get descriptive here, it’ll be as sentence fragments awkwardly broken up into bullet lists at best, and talking more about what we did than why we did it. Let’s talk about them in the present tense:

fixes bug in display code

case where display list is null

or

improves caching behavior for edge case

sometimes we write the empty entry first

A step in the right direction. Those start looking like objects we are going to talk about. However, they don’t make a lot of sense without context. Commits come with only two pieces of context: their parent commit, and the tree state they refer to.

These messages assume context that later spelunking in the history will not necessarily find. fixes bug implies there was a bug to fix, but not much about it. We still are talking more about history than about what we changed. One has to compare the states before and after, and there’s not a lot of incentive in this format to continue and describe the bug. The context is assumed. In talking about these commits, we’d say things like “this commit deadbeef was the problem”. We don’t really refer to the commit so much as the state it brings, and even then only weakly, in the form of what’s different about that state from previous, not what it is.

We can describe a little more but we’re still describing what we’re doing and not the state of the world.

In a world where we may rebase them, move them around and combine them, something a little more durable needs to happen. Let’s treat commit titles as names.

fix for bug in display code

a replacement handler for case with empty display list causing corruption
of the viewport

or

improvement in caching behavior for edge case

a check to skip writing empty entries in the cache, preventing the case
where empty entries would be returned instead of a cache miss.

Now the description we’ve left out starts feeling obvious. Now I want to know more about this bug, I want to know more about the fix, and I want to know about this improvement. These are nouns, and we have a lot of language for describing nouns.

These make sense even if rebased, and if we were to read the source code associated with this change, we would find that this describes the code added and removed, not the change from some unknown previous state. We know almost everything about the contents of this commit without having to infer it from context, and discussing it as the actual code becomes much easier. Code reviews can be improved, and we can refer to these commit hashes (or URLs) as objects and refer to them meaningfully later. “This improvement was very good”, or “this improvement introduced a bug”

Now we have objects to talk about, and detail about the state that differentiates it from other states, even without being directly attached to the history. With the need for context reduced, we can now use these commit messages in new contexts without rewording them. We add some tags with some machine-readable semantics: Tools like conventional-changelog-cli can generate change logs for summary to a user and semantic-release can bump version numbers in meaningful ways, dependent on the changes being released. We’ve pushed that decision out to the edges of the system, where all the context for doing it right lives. The result:

fix: bug in display code

a replacement handler for case with empty display list causing corruption
of the viewport.

and

fix: improvement in caching behavior for edge case

a check to skip writing empty entries in the cache, preventing the case
where empty entries would be returned instead of a cache miss.

BREAKING CHANGE: empty cache entries are not saved so negative caching
must be handled in another layer.

And in changelog format:

v2.0.0 (2016-04-16)

  • fix: bug in display code 886a50c
  • fix: improvement in caching behavior for edge case 9bce4c5

BREAKING CHANGE

  • empty cache entries are not saved so negative caching must be handled in another layer.

This is super useful, but I think the context-reducing style of commit message is a good prerequisite for actually getting good change logs that make sense.
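To show how little machinery the tags need, here is a sketch of reading a tagged message and deciding what kind of release it implies. It is not how conventional-changelog-cli or semantic-release are implemented; `parseCommit` is a hypothetical function, and the mapping (fix to patch, feat to minor, BREAKING CHANGE to major) is the common convention assumed here, with `feat` being a tag the examples above never use.

```javascript
// Parse a commit message in the tagged style and report the release it implies.
function parseCommit(message) {
  const lines = message.split('\n');
  const header = /^(\w+): (.+)$/.exec(lines[0]);
  const marker = 'BREAKING CHANGE: ';
  const breakingAt = lines.findIndex((line) => line.startsWith(marker));
  // Everything from the marker to the end of the message is the breaking note.
  const breaking = breakingAt === -1
    ? null
    : lines.slice(breakingAt).join(' ').slice(marker.length).trim();

  let release = null;
  if (breaking) release = 'major';
  else if (header && header[1] === 'feat') release = 'minor';
  else if (header && header[1] === 'fix') release = 'patch';

  return {
    type: header ? header[1] : null,
    subject: header ? header[2] : lines[0],
    breaking,
    release
  };
}
```

Because the title is a name rather than a narration, the `subject` can be dropped into a change log line as it stands.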

A side note. I think github’s new squash and merge feature is going to be the perfect place for this style: individual commits are often not quite the right granularity for tagging. The style notes here apply otherwise, but tags I think are most useful on a merge-by-merge basis.

In the absence of squashing, a change to conventional-changelog that only looked at merge commits would be excellent, leaving the small state changes visible for code review, but the merges visible as external changes in the log.

Why MVC doesn't fit the browser

In part one of this series I talk about Why MVC doesn’t fit the web from the point of view of writing web services, in the vein of Ruby on Rails and Express. This time I’m continuing that rant aimed at the modern GUI: The Browser.

MVC originated from the same systems research that gave rise to Smalltalk, which then had ideas imported into Ruby and Objective C that we use today. The first mention of an MVC pattern that I’m aware of was part of the original specifications for the Dynabook – a vision that has still not been realized in full, but that laid out a fairly complete vision for what personal computing could look like, a system that any user can modify and adjust. The software industry owes a great deal to some of this visionary work, and many concepts we take for granted today like object oriented programming came out of this research and proposal.

The biggest part of the organizational pattern is that the model is the ‘pure ideal’ of the thing at hand. One of the canonical examples is a CAD model for an engineering drawing: the model represents the part in terms of inches and parts and engineering terms, not pixels or voxels or more specific representations used for display.

The View classes read that model and display it. Its major components are in terms of windows and displays and pixels or the actual primitives used to display the model. In that canonical CAD application, a view would be a rendered view, whether wire-frame or shaded or parts list data displayed from that model.

The way the two talk is usually that the model emits an event saying that it changed, and the view re-reads and re-displays. This lives on today in systems like React, where the pure model, the ‘state’, when it updates, triggers the view to redraw. It’s a very good pattern, and the directed flow from model to view really helps keep the design of the system from turning into a synchronization problem.
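A minimal sketch of that one-way flow using node’s EventEmitter. The class names `PartsModel` and `PartsListView` are hypothetical, chosen to echo the CAD example, and the "display" is just a string.

```javascript
const EventEmitter = require('events');

// The model knows only about the domain (parts and inches) and announces
// when it has changed. It knows nothing about how it is displayed.
class PartsModel extends EventEmitter {
  constructor() {
    super();
    this.parts = [];
  }
  addPart(name, lengthInches) {
    this.parts.push({ name, lengthInches });
    this.emit('change');
  }
}

// The view re-reads the whole model on every change and redraws itself.
class PartsListView {
  constructor(model) {
    this.model = model;
    this.output = '';
    model.on('change', () => this.render());
    this.render();
  }
  render() {
    this.output = this.model.parts
      .map((part) => part.name + ': ' + part.lengthInches + ' in')
      .join('\n');
  }
}
```

Nothing flows back from the view to the model here, which is what keeps this from becoming a synchronization problem: a second view, say a wire-frame renderer, would subscribe to the same event without the model changing at all.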

In a 1980s CAD app, you might have a command-line that tells the model to add a part, or maybe a mouse operating some pretty limited widgets on screen, usually separate from the view window. Where there is interaction directly on the view, the controller might look up in the view what part of the model got clicked, but it’s a very thin interface.

That’s classic MVC.

To sum up: separate the model logic that operates in terms of the business domain, the actual point of the system, and don’t tie it to the specifics of the view system. This leaves you with a flexible design where adding features later that interpret that information differently is less difficult – imagine adding printing or pen plotting to that CAD application if it were stored only as render buffers!

Last we come to controllers. Controllers are the trickiest part, because We Don’t Do That Anymore. There are vestigial bits of a pure controller in some web frameworks, and certainly inside the browser. Individual elements like an input or text area are most recognizable. The model is a simple string: the contents of the field. The view is the binding to the display, the render buffers and text rendering; the controller is the input binding – while the field has focus, any keyboard input can be directed through something that is written much like a classic controller, and updates the model at the position in the associated state. In systems dealing with detached, not-on-screen hardware input devices, there’s certainly a component that directs input into the system. We see this with game controllers, and even the virtual controllers on-screen on phones emulate this model, since the input is usually somewhat detached from the view.

In modern web frameworks, you’ll find a recognizable model in most if not all. Backbone did this, giving a structured base class to work from, since it is commonly mapped to a REST API in the form of its Backbone.Model class. Angular does this with the service layer, a pretty structured approach to “model”. In a great many systems, the model is the ‘everything else’, the actual system that you’re building a view on top of.

Views are usually templates, but often have binding code, read from the model, format it, make some DOM elements (using the template) and substitute it in, or do virtual DOM update tricks like React does. Backbone.View is an actual class that can render templates or do any other DOM munging to display its model, and can bind to change events in a Backbone.Model; React components, too, are very much like the classic MVC View, in that they react to model or state updates to propagate their display adaptation out to the viewer.

The major difference from MVC comes in event handling. The DOM, in the large, is deeply unfriendly to the concept of a controller. We have a lot of systems that vaguely resemble one if you squint right: navigation control input and initial state from the URL in a router; key bindings often look a lot like a controller. To make a classic MVC Controller, though, input would have to be routed to a central component that then updates models and configures views; this split rarely exists cleanly in practice, and we end up with event handlers all directly modifying model properties, which reflect their state outward into views and templates.

We could wrap and layer things sufficiently to make such a system, but in the guise of ideological purity, we would have lost any simplicity our system had to begin with, and in the case of browsers and the web, we would be completely divorced from native browser behavior, reinventing everything, and losing any ability to gracefully degrade without JavaScript.

We need – and have started to create – new patterns. Model-View-ViewModel, Flux, Redux, routers, and functional-reactive approaches are all great ways to consider structuring new applications. We’re deeply integrating interactivity, elements and controls are not just clickable and controllable with a keyboard, but with touch input, pen input, eye-tracking and gesture input. It’s time to keep a critical eye on the patterns we develop and continue to have the conversations about what patterns suit what applications.

some background on learning how to speak network protocols in node.js by implementing http

Yesterday I wrote a small http client library starting from bare node.js net module to make a TCP connection, and in a series of steps built up a working client for a very basic dialect of HTTP/1.1. It’s built to have some similar design decisions to node’s core http module, just for real-world relatedness.

I did this as a teaching exercise for a friend – he watched me work it up in a shared terminal window – but I think it’s interesting as a learning example.

For background, this relies on core concepts from node.js: streams, and connecting to a TCP port with the net module; I’m not doing anything complicated with net, and almost entirely ignoring errors.

I start almost every project with a similar start: git init http-from-first-principles then cd !$ then npm init and mostly accept the defaults. I usually set up the test script as a simple tap test.js. After the init, I run npm install --save-dev tap. tap is my favorite test framework, because it has pretty excellent diagnostics when a test fails, runs each test file in a separate process so they can’t interfere with each other so easily, and because you can just run the test file itself as a plain node script and get reasonable output. There’s no magic in it. (the TAP protocol is pretty neat, too)

Next, I created just enough to send a request and a test for it. The actual HTTP protocol is simple. Not as simple as it once was, but here’s the essence of it:

GET /a-thing-i-want HTTP/1.1
Host: example.org
Accept: text/html

That’s enough to fetch a page called /a-thing-i-want from the server at example.org. That’s the equivalent of http://example.org/a-thing-i-want in the browser. There’s a lot more that could be added to a request – browsers add the user-agent string, what language you prefer, and all kinds of information. I’ve added the Accept: header to this demo, which is what we’d send if we want to suggest that the server should send us HTML.
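As a sketch of what sending that request involves (not the code from the library itself), here is a hypothetical `formatRequest` function. The one detail the listing above hides is that each line ends with a carriage return and line feed, and an empty line marks the end of the headers.

```javascript
// Format a minimal HTTP/1.1 GET request. Header lines end in CRLF and a
// blank line ends the header section.
function formatRequest(host, path, headers = {}) {
  const lines = ['GET ' + path + ' HTTP/1.1', 'Host: ' + host];
  for (const name of Object.keys(headers)) {
    lines.push(name + ': ' + headers[name]);
  }
  return lines.join('\r\n') + '\r\n\r\n';
}
```

The resulting string is what gets written to the socket returned by `net.connect(80, 'example.org')`.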

The server will respond in kind:

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 35

<p>this is the page you wanted</p>

That may not come in all at once – the Internet is full of weird computers that are low on memory, and networks that can only send a bit at a time. Since node.js has streams, we get things as they come in, and we have to assemble the pieces ourselves. They come in in order, though, so it’s not too hard. It does complicate making a protocol handler like this, but it does give us the chance to make a very low-memory, efficient processor for HTTP messages.

So if we get that back all as a chunk as users of our HTTP library, that’s not that useful – nobody wants to see the raw HTTP headers splattered at the top of every web page. (Okay, I might. But only because I love seeing under the hood. It’d be like having a transparent lock so you can see the workings. Not so great for everyday use.)

A better interface would be to have headers come out as a javascript object, separate from the body. That’s what is done in the next commit. Or at least the interface is exposed – we don’t actually parse the headers yet. That’s going to be trickier.

What we have to do is read off the pieces a bit at a time and do the work needed to break the header up into lines.

There are several cases that might happen:

  • We got a part of a header line
  • We got a complete header line and part of another
  • We got a complete header line and nothing else
  • We got a complete header line, and the newline that ends the header section
  • We got a complete response all at once
  • We got a complete header, and part of the body
  • Having already received a part of a header, we get another part of a header
  • Having already received a part of a header, we get the remainder and more…

And so on. There’s a lot of ways things can and will be broken up depending on where the packet boundaries fall in the stuff we care about. We have to handle it all.

The best approach is starting at the beginning, and see if you have a complete thing. If not, store it for later and wait for more. If you do have a complete thing, process it, take that chunk off the beginning of what you’re processing, and loop this process and see if there’s more. Repeat until complete or you have an error. That’s what the first part of the header parser does.
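That loop can be sketched like this. It is an illustration of the approach rather than the code from the commits, `HeaderReader` is a hypothetical name, and it assumes chunks arrive as strings (for example after calling `setEncoding('utf8')` on the socket) with lines ending in CRLF.

```javascript
// Incremental header reader: feed it chunks as they arrive. It buffers
// partial lines, collects complete ones, and hands back whatever follows the
// blank line that ends the header section.
class HeaderReader {
  constructor() {
    this.buffer = '';
    this.lines = [];
    this.done = false;
  }
  // Returns null while the headers are incomplete. Once they are complete,
  // returns the body text contained in this chunk (possibly an empty string).
  push(chunk) {
    if (this.done) return chunk; // everything after the headers is body
    this.buffer += chunk;
    let end;
    while ((end = this.buffer.indexOf('\r\n')) !== -1) {
      const line = this.buffer.slice(0, end);
      this.buffer = this.buffer.slice(end + 2);
      if (line === '') {
        // Blank line: the header section is over; the rest is body.
        this.done = true;
        const rest = this.buffer;
        this.buffer = '';
        return rest;
      }
      this.lines.push(line);
    }
    return null; // a partial line is still buffered; wait for more
  }
}
```

Every case in the list above falls out of the same loop: a partial line stays in the buffer, several complete lines are consumed in one call, and a chunk that straddles the end of the headers is split into header lines and leftover body.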

That first pass at the problem was a little naive and doesn’t stop at the end of the header properly. So next we put in a temporary hack to put that missing chunk somewhere.

Next, we have to make sure that the body is passed on separately from the headers. Then we remove the temporary hack.

So we’ve got the headers stored as a list, which is great. An object would be better, so we can access them by name. Let’s postprocess the headers into an object.
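A sketch of that postprocessing step, assuming the status line has already been removed from the list. `headersToObject` is a hypothetical name; lower-casing the header names mirrors what node’s core http module does, and joining repeated headers with a comma is a simplification.

```javascript
// Turn raw header lines ("Name: value") into an object keyed by lower-cased
// header name. Repeated headers are joined with ", ".
function headersToObject(lines) {
  const headers = {};
  for (const line of lines) {
    const colon = line.indexOf(':');
    if (colon === -1) continue; // not a header line; skip it
    const name = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    if (Object.prototype.hasOwnProperty.call(headers, name)) {
      headers[name] += ', ' + value;
    } else {
      headers[name] = value;
    }
  }
  return headers;
}
```

Header names are case-insensitive in HTTP, so normalizing them once here means callers can always write `headers['content-type']`.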

xhyve and Alpine Linux

A quick howto to get Alpine Linux running under xhyve on a Mac:

Install cdrtools to get isoinfo:

brew install cdrtools

Extract the kernel and initramfs from the ISO:

isoinfo -i alpine-3.2.3-x86_64.iso -J -x /boot/initramfs-grsec > initramfs-grsec
isoinfo -i alpine-3.2.3-x86_64.iso -J -x /boot/vmlinuz-grsec > vmlinux-grsec

Then boot the system:

xhyve -f kexec,vmlinux-grsec,initramfs-grsec,"alpine_dev=cdrom:iso9660 modules=loop,squashfs,sd-mod,usb-storage,sr-mod,console quiet console=ttyS0" -m 1G  -l com1,stdio -s 0:0,hostbridge -s 31,lpc -s 3,ahci-cd,alpine-3.2.3-x86_64.iso

Add -s 2:0,virtio-net in there if you want networking, but that means you need to be root to run xhyve.

A week at npm

I finished my first week at npm today.

It’s everything I’d hoped it would be – I have excellent coworkers. The work is interesting. The business makes sense. The plans for the future are exciting, but not mind-bendingly ambitious.

Working on open source is familiar, and the workflow of using private repositories on GitHub is definitely smoother than bouncing between two separate instances using GitHub Enterprise. I think this is overlooked by corporate systems designers and security folks. Trading this ease away costs a lot in productivity and maintenance that I think is under-appreciated.

Being remote is imperfect. The tools for remote face to face meetings leave something to be desired, and doubly so with some hearing damage that makes it hard to understand words if there’s any interference or background noise. There’s still a lot of room out there for someone to get multi-party video conferencing right. Being remote from an office that has a majority of my coworkers colocated has some downsides, but my team and the company as a whole is gracious and thoughtful and caring, and that smooths over the vast majority of the rough edges.

The biggest difference is how much more processes make sense when everyone is involved and cares. So far, every decision has made sense, and it’s getting easier to trust that things are the way they are for a reason, and that if they cause a problem they can be changed. In comparison to a corporate bureaucracy that only occasionally manages to challenge its tendency to ossify, it’s a world of difference, and without a tyranny of structurelessness. In so many ways, npm is a traditionally structured company. A simple hierarchy of managers and reporting. Employees doing the work have the most visibility into that work; the executives have the most comprehensive ability to steer and direct, but rely on us for the insight into the details. No special organization to teams: grouped by project, people allocated according to company goals. All of this, though, has an element of trust that I’ve not seen since I worked at Wondermill in 2001. People genuinely like each other, support each other, and go out of their way to make sure things work for each other. In so many ways, it feels like working with a net. A proper safety net, not something rigged up to be good enough at the moment but precarious to trust long term.

A simple approach to deploying with git without clutter

Today, I created git-create-deploy-branch after kicking some of the ideas around for a couple years.

Git at first seems to be an ideal tool for deploying web sites and other things that don’t have object code. However, it’s never been that simple, and where there’s programming, there’s automating the tedious bits and creating derivative pieces from more humane sources.

With the addition of receive.denyCurrentBranch = updateInstead in git 2.3.0, possibilities opened up for really reliable, simple workflows. They’ve since been refined, with a push-to-checkout hook allowing built objects to be created on the receiving server, but I want a more verifiable, local approach.

There are two main strategies in git for dealing with this, and before git 2.3.0, those were really the only things available. In the first, git holds only the source material, and any built products are managed outside of git, whether as a directory of numbered tarballs or in a service meant for such things. Some services like the npm registry bring a lot of value, with public access and hosting and replication available; some are little more than object storage like Amazon S3. In the second approach, built products are committed back, and git becomes a dumb content tracker – conflicts in built files are resolved by regenerating them from merged source material, and the build process becomes integral to every operation on the tree of files.

I’ve long wanted a third way, using the branching, fast, and stable infrastructure of git, while keeping the strict separation of source material and built material. I want to be able to inspect what will be deployed, and inspect the differences between what was deployed each time, and separately, analyze the changes to the source material, yet still be able to relate it to the deployed, built objects. To that end, this tool can be considered a first attempt at building tools that understand the idea of a branch derived from another.

The design is simple enough: given a branch (say master) checked out in your repository, with a build process for whatever objects need to exist in the final form, but those products ignored by a .gitignore file, like so:

source.txt:

aGVsbG8sIHdvcmxkCg==

and a build script:

build.sh:

#!/bin/sh

base64 -D < source.txt > built.txt

and an ignore file, with both the built object and other things like editor cruft:

.gitignore:

built.txt
*.swp
*~

we create a file listing the files to skip excluding when creating the derived branch, like so:

.gitdeploy:

built.txt

The initial version of the tool is very simple, and doesn’t support wildcards or any other features of any complexity in the .gitdeploy file. This is not out of a strong opinion, but as a matter of implementation simplicity, given that my prototype is written using bash.
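To show how little the .gitdeploy format asks of a parser, here’s a rough sketch in JavaScript. The tool itself is a bash prototype, so this is not its implementation: parseDeployList and forceAddArgs are hypothetical names, and staging the listed files with git add --force is an assumption about one way to do it, not a description of what git-create-deploy-branch actually runs.

```javascript
// Hypothetical helper: treat every non-empty line as a literal path.
// No wildcards, no comments, no negation, matching the limits described above.
function parseDeployList(text) {
  return text
    .split('\n')
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line.length > 0; });
}

// Hypothetical helper: build arguments for `git add --force`, which stages
// paths even when .gitignore would exclude them.
function forceAddArgs(paths) {
  return ['add', '--force', '--'].concat(paths);
}

var paths = parseDeployList('built.txt\n');
console.log(forceAddArgs(paths).join(' ')); // add --force -- built.txt
```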

You can install it with npm:

npm install -g git-create-deploy-branch

To create the deploy branch, we’ll run the build, then create the deploy branch with those objects present in our working directory:

./build.sh && git create-deploy-branch

Our first run gives output like so:

[new branch] 8acba8787306 deploy/master

and a branch deploy/master is created, in this case with commit ID 8acba8787306. We can show that it includes the built files:

:; git show deploy/master
commit 8acba87873062dd8b4fc516bab581a450bf9e077
Author: Aria Stewart <aredridel@nbtsc.org>
Date: Sat Aug 8 22:30:05 2015

deploy master

diff --git built.txt built.txt
new file mode 100644
index 000000000000..4b5fa63702dd
--- /dev/null
+++ built.txt
@@ -0,0 +1 @@
+hello, world

The commit also has the parent commit set to the current commit on master, so we can track the divergence between master and deploy/master, both expected (with the built objects) and unexpected (errant commits made on the deploy branch).

Let’s update our source, and commit that:

source.txt:

aGVsbG8sIHdvcmxkOiB3ZSBoYXZlIGNhbmR5Cg==

The repository now looks something like this:

:; git graph master deploy/master
* e391cd8deb5e - (HEAD -> master) New source (8 seconds ago) <Aria Stewart>
| * 8acba8787306 - (deploy/master) deploy master (5 minutes ago) <Aria Stewart>
|/
* 8045ecf53520 - Add .gitdeploy (5 minutes ago) <Aria Stewart>
* 0a347a1892a6 - initial commit (5 minutes ago) <Aria Stewart>

And if we run the build and deploy again:

./build.sh && git create-deploy-branch

We get output like so:

8acba8787306..16663a3ae945 deploy/master

And our repository now includes a new merge commit, showing the origin of the deployed objects, and the prior deploy:

*   16663a3ae945 - (deploy/master) deploy master (68 seconds ago) <Aria Stewart>
|\
* | e391cd8deb5e - (HEAD -> master) New source (3 minutes ago) <Aria Stewart>
| * 8acba8787306 - deploy master (7 minutes ago) <Aria Stewart>
|/
* 8045ecf53520 - Add .gitdeploy (7 minutes ago) <Aria Stewart>
* 0a347a1892a6 - initial commit (8 minutes ago) <Aria Stewart>
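
The shape of those deploy commits can be sketched with git’s plumbing. This is only an illustration, not how git-create-deploy-branch is written: commitTreeArgs is a hypothetical helper, '<tree>' is a placeholder, and the parent order (source commit first, prior deploy second) is an assumption read off the graph above. git commit-tree itself takes a tree, any number of -p parents, and a message.

```javascript
// Hypothetical helper: arguments for `git commit-tree`, which creates a commit
// object from an existing tree with explicitly chosen parents.
function commitTreeArgs(tree, sourceCommit, previousDeploy, message) {
  var args = ['commit-tree', tree, '-p', sourceCommit];
  if (previousDeploy) {
    // Later deploys also point at the prior deploy, which makes them merges.
    args.push('-p', previousDeploy);
  }
  return args.concat(['-m', message]);
}

// First deploy: a single parent, the commit on master at the time.
console.log(commitTreeArgs('<tree>', '8045ecf53520', null, 'deploy master').join(' '));
// Second deploy: two parents, the new source commit and the prior deploy.
console.log(commitTreeArgs('<tree>', 'e391cd8deb5e', '8acba8787306', 'deploy master').join(' '));
```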

On a remote machine, let’s create a deploy repository, set it up to receive our deploys, and add it as a remote for us.

ssh remotemachine 'git init show-off-build && cd show-off-build && git config receive.denyCurrentBranch updateInstead && git checkout -b deploy/master'

git remote add remotemachine ssh://remotemachine/~/show-off-build

Now we can deploy this with a simple command:

git push remotemachine deploy/master

So in total, deploying a new derivative of our source code consists of making our changes and committing them, then running the build and the command to create the deploy branch, then pushing:

git commit -m changes
./build.sh && git create-deploy-branch && git push remotemachine deploy/master

Stable, traceable, reliable, replicatable builds and deploys, stored in git but not cluttering the source branch.

Let’s see our handiwork:

ssh remotemachine cat show-off-build/built.txt

And the response?

hello, world: we have candy

Debugging double-callback bugs in node.js

One of the most frustrating things that happens in a large node.js application is a double callback bug. They’re usually simple mistakes that are super tricky to track down. You may have seen one and not recognized it as such. In Express, one manifestation is Error: Can't set headers after they are sent; another one I’ve seen is an EventEmitter with an error event handler registered with ee.once('error', handler) that crashes the process saying it has an unhandled error – the first callback fires the error handler, the second triggers another error and since it was bound with once, it crashes. Sometimes they’re heisenbugs, where one path through a race condition resolves successfully, but another will manifest a crash or strange behavior.
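
The EventEmitter case is easy to reproduce in isolation. Here’s a small sketch, with the second emit standing in for whatever a doubled callback ends up doing; the try/catch is only there so the example can report what happened instead of crashing.

```javascript
var EventEmitter = require('events').EventEmitter;

var ee = new EventEmitter();
var handled = [];
var thrown = null;

// `once` removes the handler after its first call.
ee.once('error', function (err) {
  handled.push(err.message);
});

// What a double callback amounts to: the same failure reported twice.
ee.emit('error', new Error('first report')); // handled, listener now removed
try {
  ee.emit('error', new Error('second report')); // no listener left, so emit throws
} catch (e) {
  thrown = e; // without this try/catch, the process would crash here
}

console.log(handled);        // [ 'first report' ]
console.log(thrown.message); // second report
```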

The causes can be simple – here’s one:

var fs = require('fs');

function readJsonAsync(cb) {
  fs.readFile('file.json', 'utf-8', function (err, data) {
    if (err) {
      cb(err);
    }

    cb(null, JSON.parse(data));
  });
}

Can you spot it?

The error callback doesn’t end the function.

function readJsonAsync(cb) {
  fs.readFile('file.json', 'utf-8', function (err, data) {
    if (err) {
      return cb(err);
    }

    cb(null, JSON.parse(data));
  });
}

This version works more acceptably if fs.readFile gives us an error. Now let’s consider what happens when there’s a JSON parse error: this crashes, since an exception thrown by JSON.parse will unwind up the stack back to fs.readFile’s handler in the event loop, which has no try/catch and will crash your process with an uncaughtException. Let’s add an exception handler.

function readJsonAsync(cb) {
  fs.readFile('file.json', 'utf-8', function (err, data) {
    if (err) {
      return cb(err);
    }

    try {
      cb(null, JSON.parse(data));
    } catch (e) {
      cb(e);
    }
  });
}

Yay! That way if the JSON fails to parse, we’ll get the error in the callback. Nice and tidy, right?

Not so fast. What if cb throws an exception, like in this calling code:

readJsonAsync(function (err, json) {
  if (err) {
    return console.warn("Fail!", err);
  }

  console.log("Success! Got all kinds of excitement! Check this out!");
  console.log(json.exciting.thing.that.does.not.exist);
});

Whoops. That last line throws TypeError: Cannot read property 'thing' of undefined.

That goes back to the callback function and the try/catch block, and we call back again with the error. Our callback gets called twice – which isn’t so bad with things that don’t care like console.log and console.warn, but even then, the output is confusing:

Success! Got all kinds of excitement! Check this out!
Fail! TypeError: Cannot read property 'thing' of undefined

It both worked and didn’t work! If the thing we’re calling throws an exception on a double callback, that would crash our program. If it silently ignores second callbacks instead, it would eat the error, and we’d be left wondering why our program was misbehaving.

We’ve also made a tricky conundrum here. There are a lot of ways to solve it, from ignoring multiple callbacks, like so (this example uses the once module):

var once = require('once');

function readJsonAsync(cb) {
  cb = once(cb);
  fs.readFile('file.json', 'utf-8', function (err, data) {
    if (err) {
      return cb(err);
    }

    try {
      cb(null, JSON.parse(data));
    } catch (e) {
      cb(e);
    }
  });
}

to crashing more obviously because we just don’t catch exceptions thrown by the callback, like so:

function readJsonAsync(cb) {
  fs.readFile('file.json', 'utf-8', function (err, data) {
    if (err) {
      return cb(err);
    }

    var parsed;
    try {
      parsed = JSON.parse(data);
    } catch (e) {
      return cb(e);
    }

    cb(null, parsed);
  });
}

or one where we use setImmediate (or more tidily, check out the dezalgo package or the async package’s async.ensureAsync):

function readJsonAsync(cb) {
  fs.readFile('file.json', 'utf-8', function (err, data) {
    if (err) {
      return cb(err);
    }

    try {
      var parsed = JSON.parse(data);
      setImmediate(function () {
        cb(null, parsed);
      });
    } catch (e) {
      setImmediate(function () {
        cb(e);
      });
    }
  });
}

This means that the caller of readJsonAsync is on their own to handle their exceptions. No warranties, if it breaks, they get to keep both pieces, et cetera. But there’s no double callbacks!

So this gets tricky when you have a whole chain of things – someone’s made a mistake in something “so simple it can’t go wrong!” like a readFile callback that parses JSON, but the double callback comes out miles away, in a callback to something in a callback to something in a callback to something in a callback that calls readJsonAsync. This isn’t an uncommon scenario – every Express middleware is a callback, every call to next calls another. Every composed callback-calling function is another layer. The distance can get pretty severe sometimes. This is one of the less-loved benefits of promises: errors are much more isolated there, and the error passing is much more explicit. I think it’s a more important point than a lot of things about promises. But that’s neither here nor there. What we’re asking is:

How do we debug doubled callbacks?!

My favorite way is to write a function that will track a double callback and log the stack trace of both paths. This is a bit like the once package, but with error logging.

Here’s a simple version.

function justOnceLogIfTwice(cb) {
  var last;
  return function () { // return a new function wrapping the old one.
    if (!last) {
      last = new Error(); // Save this for later in case we need it.
      cb.apply(this, arguments); // Call the original callback
    } else {
      var thisTime = new Error("Called twice!");
      console.warn("Callback called twice! The first time is", last.stack, "and the next time is", thisTime.stack);
      // optionally, we might crash the program here if we want to be loud about errors. Like so:
      setImmediate(function () {
        // This is an "async throw" -- it can only be caught by error domains or the `uncaughtException` event on `process`.
        throw thisTime;
      });
    }
  };
}

We can then wrap our callbacks in it:

function readJsonAsync(cb) {
  cb = justOnceLogIfTwice(cb);
  fs.readFile('file.json', 'utf-8', function (err, data) {
    if (err) {
      return cb(err);
    }

    try {
      cb(null, JSON.parse(data));
    } catch (e) {
      cb(e);
    }
  });
}

Now we just have to trigger the error, and we should get two stack traces, one from the success path, and one from the error path.
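
To see both paths without touching the filesystem, here’s a cut-down, synchronous sketch of the same idea. recordCallStacks and parseJsonBuggy are hypothetical stand-ins, not the functions above: the wrapper only records stacks rather than logging or throwing, and the parser repeats the try/catch mistake on purpose.

```javascript
// Simplified variant of the wrapper: records a stack per call, and only
// passes the first call through to the real callback.
function recordCallStacks(cb) {
  var stacks = [];
  function wrapped() {
    stacks.push(new Error('call #' + (stacks.length + 1)).stack);
    if (stacks.length === 1) {
      return cb.apply(this, arguments);
    }
  }
  wrapped.stacks = stacks;
  return wrapped;
}

// Synchronous stand-in for the buggy readJsonAsync: the try/catch wraps the
// callback call as well as the parse.
function parseJsonBuggy(text, cb) {
  try {
    cb(null, JSON.parse(text));
  } catch (e) {
    cb(e);
  }
}

var results = [];
var tracked = recordCallStacks(function (err, json) {
  results.push(err ? 'error' : 'success');
  json.exciting.thing; // throws TypeError, like the calling code earlier
});

parseJsonBuggy('{}', tracked);
console.log(results);               // [ 'success' ]
console.log(tracked.stacks.length); // 2
```

Each entry in tracked.stacks is a stack trace, so printing them shows one call coming from the try branch and one from the catch branch.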

Other ways? Set breakpoints on the calls to cb. See what the program state is at each of them.

Try to make a reproduction case. Good luck: it’s hard.

Add once wrappers to callbacks until you find the problem. Move them deeper and deeper until you find the actual source.

Give extra scrutiny to non-obvious error paths. If you can’t spot where errors go, I’d bet money on finding part or all of the bug in there.

Add an async-tracking stack trace module like long-stack-traces or longjohn. They slow your program down and can change the behavior because of the tricks they do to get long traces, but they can be invaluable if they don’t disturb things too much.

Consider using this eslint rule to catch the simpler cases – it won’t catch all of them, but it’ll at least catch the missing return case.

Good luck!

To leave one amazing team for another

This is part announcement, part job advertisement, part musing on what it’s like to work with a really amazing team.

I’m leaving PayPal in the first week of August to join the fine people at npm, inc as the architect of the web site. It was actually one of the toughest decisions I’ve had to make, because while npm is the company I absolutely most want to work for, I really, really like my team at PayPal. I can’t think of any other company I’d leave my team for. They are kind, hard-working, honest, visionary but not obnoxiously opinionated. I’ve been given a huge amount of trust while I was there, and I’ve produced some great work. As one of my last acts for the team, I want to find someone to replace me.

For the past year, I’ve been working on KrakenJS at PayPal, doing largely open source development, and supporting application teams internally. The Kraken team is a unique team in a unique spot in the company. Our job is the open source project, advocacy for our internal developers, technological leadership, and creating common infrastructure when we can identify problems that multiple teams have. We do research and experiment with new technologies – both to vet them for stability, and to find places that will be error-prone and require caution or will impact long term maintenance.

I spent most of my year working on internationalization components. This wasn’t exactly assigned work – though someone really did need to do that work, so I jumped in and did it – but there are a lot of things that need attention and the point of the project is to serve its users’ needs. It’s not there to enforce an opinion, just to solve problems, and so it does and we do. The team has worked a lot on rough consensus and running code. If someone has an idea, they prototype it and show it off to the team. Ownership is collective, but everyone takes responsibility.

Originally, Kraken was a prototyping tool used internally. The original team was taking a rough stab at some early componentizing and tooling for purely front-end work, but as time passed, the real niche showed up: an enterprise-friendly, structured but not too restrictive framework for making front-end services, first as a prototype for Java services that were not yet ready, and later, to replace those services with a node.js-based front tier. Application teams are now integrated, full-stack teams, building both in-browser and server-side components together. This has allowed a pretty unprecedented pace of development within PayPal, and in the past two and a half years, nearly every customer-facing application has been rewritten. That’s a huge amount of success enabled by the experimentation and resourcefulness of this small team. There are recordings of conference talks about this.

Recently, the team has been merged with some of the core node.js infrastructure team, now responsible for both internal architecture modules and the open source project. While the split loyalties to open source and to the internal company work are annoying, it actually works really well that way. PayPal is credibly the single largest enterprise use of node.js. I think we’ve got more developers using it than any other company, and certainly have based a large portion of our architecture on it. If someone’s having a problem with node, chances are we’ve seen the error and may well have found patterns or workarounds for development problems, and we work on getting bugs fixed upstream.

An example of one of the trickier bugs was the diagnosis of a memory leak in io.js. You can see the back-and-forth with Fedor Indutny and my team on that issue, trying to diagnose what’s going on. Credit to Fedor: he knows the source of io.js better than anyone I know, particularly the TLS parts, and made tidy work of fixing it, but instrumenting, diagnosing and tracing that leak was a weeks-long process, starting in-house with monitoring noticing that a service running io.js behaved differently than the version running node 0.10 or 0.12. From there, making a diagnostic framework to track what’s going on, and really digging in, let us make a bug report of this caliber. Not every bug – or even many – involves that kind of to-the-metal investigation, but the team can figure out anything. They are great, kind, wonderful people.

It’s not all roses. There’s a lot of legacy baggage within the company, as any company that size and age is going to have. Enterprise constraints and organization have their own weight. Some people are resistant to change, and not every developer wants to do an amazing job in the company. Moving to new technologies and ways of doing things still requires backward compatibility and migration paths, but having tools like semantic versioning and node.js’s module structure has helped a lot. Tools like GitHub Enterprise, Asana and Slack and HipChat have their roles in enabling this kind of change.

My workday at PayPal goes something like this:

  • An hour of technical reading – maybe about babel or one of Brendan Gregg’s performance blog posts or one of Thorsten Lorenz’s blog posts or internals of node.
  • Follow up on application crash emails – perhaps chase down a team or two who’s ignoring or doesn’t know their app is crashing, and help diagnose what’s really going on and try to get it fixed.
  • Review pull requests and issues on the Kraken open source project and its modules. It’s not an overwhelming pace, but there’s something most mornings.
  • Work for a couple hours on the modules of Kraken or internal infrastructure integration that most need it.
  • Answer internal support email about node.js and guide developers internally on how to avoid problems.
  • Maybe do a code review of an internal application, and give feedback about problems they’re likely to run into.
  • Advocate for improvements to internal infrastructure.
  • Help people on IRC between things.

In addition, I’ve spoken at several conferences, some of which PayPal has sponsored, some independently. It’s been intense but a very good experience.

It’s been a great honor to work with these fine people. Given the chance to again, there are not many places I would choose over them.

Port numbers and URLs

Today someone asked on the node.js mailing list why the URL that Express.js gave them to access their application had a port number in it, and if they could get rid of it (since other sites don’t have it.)

My explanation is this:

There are some interesting details to this!

Each service on the Internet has a port assigned to it by a group called IANA. http is port 80, ssh is 22, https is 443, xmpp is 5222 (and a few others, because it’s complicated), pop3 is 110 and imap is 143. If the service is running on its normal port, things don’t usually need to know the port because it can just assume the usual one. In http URLs, this lets us leave the port number out – http://example.org/ and http://example.org:80/ in theory identify the same thing. Some systems treat them as ‘different’ when comparing, but they access the same resource.
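
Here’s a small sketch of that default-port behavior using the WHATWG URL class, which is a global in current versions of node and in browsers. Comparing the raw strings would call these two URLs different; a parser that knows the scheme’s default port does not.

```javascript
// The URL parser drops a port that matches the scheme's default.
var implicit = new URL('http://example.org/');
var explicit = new URL('http://example.org:80/');
var custom = new URL('http://example.org:8080/');

console.log(explicit.href);                   // http://example.org/
console.log(explicit.href === implicit.href); // true
console.log(JSON.stringify(explicit.port));   // ""
console.log(custom.port);                     // 8080
```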

Now if you’re not on the default port, you have to specify – so Express apps in particular suggest you access http://localhost:8080/ (or 3000 – there’s a couple common ports for “this is an app fresh off of a generator, customize from here”). This is actually just a hint – usually they listen to more than localhost, and the report back for the URL is actually not very robust, but it works enough to get people off the ground while they learn to write web services.

If you run your app on port 80, you won’t need that.

However!

Unix systems restrict ports under 1024 as reserved for the system – a simple enough restriction to keep a user from starting up something in place of a system service at startup time, in the era of shared systems. That means you have to run something as root to bind port 80, unless you use special tools. There’s one called authbind that lets you bind a privileged port (found most commonly on Debian-derived Linuxes). One can also call process.setuid and process.setgid to relinquish root privilege after binding (a common tactic in classic unix systems), though there are some fiddly details there that could leave you exposed if someone manages to inject executable code into what you’re running. And finally, one can proxy from a ‘trusted’ system daemon to your app on some arbitrary port – nginx is a popular choice for this, as are haproxy, stunnel and others.
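
As a rough sketch of the bind-then-drop tactic: dropPrivileges and listenOnPort80 are hypothetical helpers, and 'nobody' and 'nogroup' are placeholder account names that vary by system. The one detail worth showing is the order, group before user, since after giving up the root uid the process may no longer be allowed to change its group. This doesn’t cover every fiddly detail (supplementary groups, for one).

```javascript
// Hypothetical helper: give up root after a privileged port has been bound.
// `proc` is normally the global `process`; it is a parameter so the order of
// calls is easy to see and to test.
function dropPrivileges(proc, user, group) {
  if (typeof proc.getuid !== 'function' || proc.getuid() !== 0) {
    return false; // not running as root (or not on a POSIX system)
  }
  proc.setgid(group); // group first, while we are still root
  proc.setuid(user);
  return true;
}

// Usage sketch (defined but not run here): bind port 80 as root, then drop
// to an unprivileged account once the socket is listening.
function listenOnPort80(server) {
  server.listen(80, function () {
    dropPrivileges(process, 'nobody', 'nogroup');
  });
}
```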

Now as to why it’s just a hint: the problem of an app figuring out its own URL(s) is actually very hard, unsolvable often even in simple cases, given the myriad of things we do to networking – NAT and proxies in particular confuse this – and that there’s no requirement to be able to look up a hostname for an IP address, even if the hostname can be looked up to get the IP address. None of this matters for localhost though, which has a nice known name and a nice known IP and most people do development on their own computers, and so we can hand-wave all this complexity away until later, after someone has something up and running.

Temporal Coupling is bad

In reviewing the source to express.js I came across a reasonably compact example of temporal coupling.

This is badly factored, and I’ll lay out why:

Temporal coupling is the reliance on a certain sequence of calls or checks to function, rather than having them explicitly called in order in a function. “this, then this, then this have to be called before the state you look at here will be present” is how it works out.

The bits of application.js that call the view are the start of it – the view could be there! Or not! Make one maybe!

if (!view) {
  view = new (this.get('view'))(name, {
    defaultEngine: this.get('view engine'),
    root: this.get('views'),
    engines: engines
  });

That’s reasonably well guarded, because it checks that it’s not there, and sets one up if it’s not already there. But if it was cached previously, and so already set, we’re now dependent on that state, which could have been set in an entirely different way. The only thing that saves us is that the cache is pretty well private.

Then there is the bit that looks at an instance variable that happens to be set by the constructor in this version:

if (!view.path) {
  var dirs = Array.isArray(view.root) && view.root.length > 1
    ? 'directories "' + view.root.slice(0, -1).join('", "') + '" or "' + view.root[view.root.length - 1] + '"'
    : 'directory "' + view.root + '"'
  var err = new Error('Failed to lookup view "' + name + '" in views ' + dirs);
  err.view = view;
  return fn(err);
}

So now we’ve got temporal coupling between the view’s constructor setting an instance variable and our calling code. This error check is performed synchronously after the construction of the object, which is sad, because that coupling means that any asynchronous looking up of that path is now not available to us without hackery. This is exactly what’s being introduced in Express 5, and so this calling code has to be decoupled.

This is a minor case of temporal coupling, but those pieces of Express know way too much about each other, in ways that make refactoring it more invasive.

I think this comes from a style of programming where the inner components are written first, then the outer ones are written assuming the inner ones are append-only: a sort of one-way coupling.

Contrast these two places – in the View constructor:

this.path = this.lookup(name);

Where the lookup method (via some convoluted path) only returns a value when the path exists on disk:

path = join(dir, basename(file, ext), 'index' + ext);
stat = tryStat(path);

if (stat && stat.isFile()) {
  return path;
}

And in the render method:

View.prototype.render = function render(options, fn) {
  this.engine(this.path, options, fn);
};

So now the render method is only safe to call if this.path is set, and we’re temporally coupled to this sequence:

var view = new View(args);
if (view.path) {
  view.render(renderArgs);
}

Without that sequence – instantiate, check for errors, render if good or error if not – it’ll explode, having never validated that this.path is set.

It’s okay to temporally couple to instantiation in general – it’s not like you can call a method without an instance, not sensibly – but to that error check being required by the outside caller? That’s a terrible convention, and the whole thing would be much better enveloped in a method that spans the whole process – and in this case, an asynchronous one, so that the I/O done validating that the path exists doesn’t have to be synchronous.

So to fix this case, what I would do is to refactor the render method to include all the checks – move the error handling out of the caller, into render or something called by it. In this case, the lookup method is a prime candidate, since it’s what determines whether something exists, and the error concerns whether or not it exists.
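
Here’s a sketch of what that refactoring could look like. It’s a simplified, hypothetical View, not Express’s actual code or the change planned for Express 5: the constructor no longer touches the disk, lookup is asynchronous, and render owns the whole sequence, so the caller has nothing to check first.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical, simplified View. The constructor only stores what it is given.
function View(name, options) {
  this.name = name;
  this.root = options.root;
  this.engine = options.engine;
}

// Asynchronous lookup: calls back with an error if the file is missing.
View.prototype.lookup = function lookup(cb) {
  var name = this.name;
  var file = path.join(this.root, name);
  fs.stat(file, function (err, stat) {
    if (err || !stat.isFile()) {
      return cb(new Error('Failed to lookup view "' + name + '"'));
    }
    cb(null, file);
  });
};

// render spans the whole process: look up, report errors, then call the engine.
View.prototype.render = function render(options, fn) {
  var view = this;
  view.lookup(function (err, file) {
    if (err) {
      return fn(err);
    }
    view.engine(file, options, fn);
  });
};
```

The calling code shrinks to new View(name, opts).render(renderArgs, fn), and a missing view arrives in fn as an ordinary error.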

Handling Errors in node.js

There are roughly four kinds of errors you run into in node.

synchronous code, and throw is usually limited to application logic, synchronous decisions being made from information already on hand. They can also arise from programmer error – accessing properties or functions of undefined are among the most common errors I see.

If you are calling a callback in an asynchronous context provided by another module or user, it’s smart to guard these with try/catch blocks, and direct the error into your own error emission path.

The naive implementation can fail badly:

function doAThing(intermediateCallback, doneCallback) {
  setImmediate(function () {
    var result = intermediateCallback('someValue');
    doneCallback(null, result);
  });
}

The above will crash if intermediateCallback throws an exception. Instead, guard this:

function doAThing(intermediateCallback, doneCallback) {
  setImmediate(function () {
    var result;
    try {
      result = intermediateCallback('someValue');
    } catch (e) {
      // Guard only the intermediate callback, so doneCallback can't be called twice.
      return doneCallback(e);
    }
    doneCallback(null, result);
  });
}

This is important since a synchronous throw in an asynchronously called function ends up becoming the next kind of error:

asynchronous calls and throw will crash your process. If you’re using domains, then it will fall back to the domain error handler, but in both cases, this is either uncatchable – a try/catch block will have already exited the block before the call is made – or you are completely without context when you catch it, so you won’t be able to usefully clean up resources allocated during the request that eventually failed. The only hope is to catch it in a process.on('uncaughtException') handler or domain handler, clean up what you can – close or delete temp files or undo whatever is being worked on – and crash a little more cleanly.
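
A small sketch of why the try/catch can’t help. The handler here only records the error so the example can finish and print something; a real one would clean up and then exit.

```javascript
var caughtByTry = false;
var caughtByProcess = null;

// Last-resort handler, registered with `once` so it is used a single time.
process.once('uncaughtException', function (err) {
  caughtByProcess = err;
});

try {
  setImmediate(function () {
    throw new Error('thrown later'); // runs after the try block has exited
  });
} catch (e) {
  caughtByTry = true; // never reached
}

setTimeout(function () {
  console.log(caughtByTry);             // false
  console.log(caughtByProcess.message); // thrown later
}, 10);
```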

Anything meant to be called asynchronously should never throw. Instead, callbacks should be called with an error argument: callback(new Error("Error message here")); This makes the next kind of error,

asynchronous calls with an error parameter in the callback receive the error as a parameter – either as a separate callback for errors, or in node, much more commonly the “error first” style:

doThing(function (err, result) {
  // Handle err here if it's a thing, use result if not.
});

This forces the programmer to handle or propagate the error at each stage.

The reason the error argument is first is so that it’s hard to ignore. If your first parameter is err and you don’t use it, you are likely to crash if you get an error, since you’ll only look at the success path.

With the iferr module, you can get promise-like short-circuiting of errors:

var iferr = require('iferr');

function doThing(makeError, cb) {
  setImmediate(function () {
    if (makeError) {
      cb(new Error('gives an error'));
    } else {
      cb(null, "no error!");
    }
  });
}

doThing(true, iferr(console.warn, function (result) {
  console.log(result);
})); // This call warns with the error

doThing(false, iferr(console.warn, function (result) {
  console.log(result);
})); // This call logs the "no error!" message.
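
There’s not much magic in a helper like that. Here’s a minimal sketch of the idea, under the name ifErrSketch to make clear it is not the iferr module’s source: the wrapper peels off the error argument, sends it to one function if present, and otherwise passes the remaining arguments to the other.

```javascript
// A sketch of the idea behind iferr, not the module's source.
// `fail` receives the error; `succ` receives the remaining arguments.
function ifErrSketch(fail, succ) {
  return function (err) {
    if (err) {
      return fail(err);
    }
    succ.apply(null, Array.prototype.slice.call(arguments, 1));
  };
}

var seen = [];
var cb = ifErrSketch(
  function (err) { seen.push('failed: ' + err.message); },
  function (result) { seen.push('got: ' + result); }
);

cb(new Error('gives an error'));
cb(null, 'no error!');
console.log(seen); // [ 'failed: gives an error', 'got: no error!' ]
```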

Using promises also gives this short-circuit error behavior, but you get the error out of the promise with the .catch method. In some implementations, if an error happens and you haven’t set up what happens to it, it will throw after a process tick. Similarly, event emitters with unhandled error events throw an exception. This leads to the fourth kind of error:

asynchronous event emitters or promises, and error handlers

An event emitter that can emit an error event should have a handler set up.

emitter.on('error', function (err) {
  // handle error here, or call out to other error handler
});

promise.catch(function (err) {
  // Same here: handle it.
});

If you don’t do this, your process will crash or the domain handler will fire, and you should crash there. (Unless your promises don’t handle this case, in which case your error is lost and you never know it happened. Also not good.)
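
To make the short-circuiting concrete, here’s a sketch with a promise-returning doThing (a hypothetical variant of the callback version earlier). A rejection skips every then in the chain until it reaches a catch.

```javascript
function doThing(makeError) {
  return new Promise(function (resolve, reject) {
    setImmediate(function () {
      if (makeError) {
        reject(new Error('gives an error'));
      } else {
        resolve('no error!');
      }
    });
  });
}

var steps = [];

var chain = doThing(true)
  .then(function (result) {
    steps.push('first then'); // skipped: the promise was rejected
    return result;
  })
  .then(function (result) {
    steps.push('second then'); // skipped as well
    return result;
  })
  .catch(function (err) {
    steps.push('catch: ' + err.message); // the rejection lands here
  })
  .then(function () {
    console.log(steps); // [ 'catch: gives an error' ]
  });
```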