A week at npm

I finished my first week at npm today.

It’s everything I’d hoped it would be – I have excellent coworkers. The work is interesting. The business makes sense. The plans for the future are exciting, but not mind-bendingly ambitious.

Working on open source is familiar, and the workflow of using private repositories on GitHub is definitely smoother than bouncing between two separate instances using GitHub Enterprise. I think this is overlooked by corporate systems designers and security folks. Trading this ease away costs a lot in productivity and maintenance, and I think that cost is under-appreciated.

Being remote is imperfect. The tools for remote face to face meetings leave something to be desired, and doubly so with some hearing damage that makes it hard to understand words if there’s any interference or background noise. There’s still a lot of room out there for someone to get multi-party video conferencing right. Being remote from an office that has a majority of my coworkers colocated has some downsides, but my team and the company as a whole are gracious and thoughtful and caring, and that smooths over the vast majority of the rough edges.

The biggest difference is how much more sense processes make when everyone is involved and cares. So far, every decision has made sense, and it’s getting easier to trust that things are the way they are for a reason, and that if they cause a problem they can be changed. In comparison to a corporate bureaucracy that only occasionally manages to challenge its tendency to ossify, it’s a world of difference – without a tyranny of structurelessness. In so many ways, npm is a traditionally structured company. A simple hierarchy of managers and reporting. Employees doing the work have the most visibility into that work, the executives have the most comprehensive ability to steer and direct, but rely on us for the insight into the details. No special organization to teams – grouped by project, people allocated according to company goals. All of this though, has an element of trust that I’ve not seen since I worked at Wondermill in 2001. People genuinely like each other, support each other, and go out of their way to make sure things work for each other. In so many ways: it feels like working with a net. A proper safety net, not something rigged up to be good enough at the moment but precarious to trust long term.

A simple approach to deploying with git without clutter

Today, I created git-create-deploy-branch after kicking some of the ideas around for a couple years.

Git at first seems to be an ideal tool for deploying web sites and other things that don’t have object code. However, it’s never been that simple, and where there’s programming, there’s automating the tedious bits and creating derivative pieces from more humane sources.

With the addition of receive.denyCurrentBranch = updateInstead in git 2.3.0, possibilities opened up for really reliable, simple workflows. They’ve since been refined, with a push-to-checkout hook allowing built objects to be created on the receiving server, but I want a more verifiable, local approach.

There are two main strategies in git for dealing with this, and before git 2.3.0, those were really the only things available. In the first, git holds only the source material, and any built products are managed outside of git, whether as a directory of numbered tarballs or in a service meant for such things. Some services like the npm registry bring a lot of value, with public access and hosting and replication available; some are little more than object storage like Amazon S3. In the second approach, built products are committed back, and git becomes a dumb content tracker – conflicts in built files are resolved by regenerating them from merged source material, and the build process becomes integral to every operation on the tree of files.

I’ve long wanted a third way, using the branching, fast, and stable infrastructure of git, while keeping the strict separation of source material and built material. I want to be able to inspect what will be deployed, and inspect the differences between what was deployed each time, and separately, analyze the changes to the source material, yet still be able to relate it to the deployed, built objects. To that end, this tool can be considered a first attempt at building tools that understand the idea of a branch derived from another.

The design is simple enough: given a branch (say master) checked out in your repository, with a build process for whatever objects need to exist in the final form, but those products ignored by a .gitignore file, like so:

source.txt:

aGVsbG8sIHdvcmxkCg==

and a build script:

build.sh:

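#!/bin/sh

# Decode the base64 source into the built file.
# -D is the macOS spelling of the decode flag; GNU coreutils base64 uses -d.
base64 -D < source.txt > built.txt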

and an ignore file, with both the built object and other things like editor cruft:

.gitignore:

built.txt
*.swp
*~

we create a file listing the files to skip excluding when creating the derived branch, like so:

.gitdeploy:

built.txt

The initial version of the tool is very simple, and doesn’t support wildcards or any other features of any complexity in the .gitdeploy file. This is not out of a strong opinion, but as a matter of implementation simplicity, given that my prototype is written using bash.
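One way to picture how the two files relate, as a rough sketch in JavaScript rather than the tool's actual bash implementation: the patterns that stay ignored on the derived branch are the lines of .gitignore minus the exact entries listed in .gitdeploy. The function name effectiveIgnores is hypothetical, and exact-match comparison is an assumption based on the no-wildcards note above.

```javascript
// Hypothetical helper, not part of git-create-deploy-branch.
// Returns the .gitignore patterns that are NOT un-ignored by .gitdeploy.
function effectiveIgnores(gitignoreText, gitdeployText) {
    // Split a file's text into trimmed, non-empty lines.
    function lines(text) {
        return text.split('\n').map(function (line) {
            return line.trim();
        }).filter(function (line) {
            return line.length > 0;
        });
    }

    var keep = lines(gitdeployText);
    // Exact string match only: no wildcard handling, like the prototype.
    return lines(gitignoreText).filter(function (pattern) {
        return keep.indexOf(pattern) === -1;
    });
}

console.log(effectiveIgnores('built.txt\n*.swp\n*~\n', 'built.txt\n'));
// prints [ '*.swp', '*~' ]
```

With the example files above, built.txt is let through to the deploy branch while the editor cruft stays ignored.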

You can install it with npm:

npm install -g git-create-deploy-branch

To create the deploy branch, we’ll run the build, then create the deploy branch with those objects present in our working directory:

./build.sh && git create-deploy-branch

Our first run gives output like so:

[new branch] 8acba8787306 deploy/master

and a branch deploy/master is created, in this case with commit ID 8acba8787306. We can show that it includes the built files:

:; git show deploy/master
commit 8acba87873062dd8b4fc516bab581a450bf9e077
Author: Aria Stewart <aredridel@nbtsc.org>
Date:   Sat Aug 8 22:30:05 2015

    deploy master

diff --git built.txt built.txt
new file mode 100644
index 000000000000..4b5fa63702dd
--- /dev/null
+++ built.txt
@@ -0,0 +1 @@
+hello, world

The commit also has the parent commit set to the current commit on master, so we can track the divergence between master and deploy/master, both expected (with the built objects) and unexpected (errant commits made on the deploy branch).

Let’s update our source, and commit that:

source.txt:

aGVsbG8sIHdvcmxkOiB3ZSBoYXZlIGNhbmR5Cg==

The repository now looks something like this:

:; git graph master deploy/master
* e391cd8deb5e - (HEAD -> master) New source (8 seconds ago) <Aria Stewart>
| * 8acba8787306 - (deploy/master) deploy master (5 minutes ago) <Aria Stewart>
|/
* 8045ecf53520 - Add .gitdeploy (5 minutes ago) <Aria Stewart>
* 0a347a1892a6 - initial commit (5 minutes ago) <Aria Stewart>

And if we run the build and deploy again:

./build.sh && git create-deploy-branch

We get output like so:

8acba8787306..16663a3ae945 deploy/master

And our repository now includes a new merge commit, showing the origin of the deployed objects, and the prior deploy:

*   16663a3ae945 - (deploy/master) deploy master (68 seconds ago) <Aria Stewart>
|\
* | e391cd8deb5e - (HEAD -> master) New source (3 minutes ago) <Aria Stewart>
| * 8acba8787306 - deploy master (7 minutes ago) <Aria Stewart>
|/
* 8045ecf53520 - Add .gitdeploy (7 minutes ago) <Aria Stewart>
* 0a347a1892a6 - initial commit (8 minutes ago) <Aria Stewart>

On a remote machine, let’s create a deploy repository, set it up to receive our deploys, and add it as a remote for us.

ssh remotemachine 'git init show-off-build && cd show-off-build && git config receive.denyCurrentBranch updateInstead && git checkout -b deploy/master'

git remote add remotemachine ssh://remotemachine/~/show-off-build

Now we can deploy this with a simple command:

git push remotemachine deploy/master

So in total, deploying a new derivative of our source code consists of making our changes and committing them, then running the build and the command to create the deploy branch, then pushing:

git commit -m changes
./build.sh && git create-deploy-branch && git push remotemachine deploy/master

Stable, traceable, reliable, replicatable builds and deploys, stored in git but not cluttering the source branch.

Let’s see our handiwork:

ssh remotemachine cat show-off-build/built.txt

And the response?

hello, world: we have candy

Debugging double-callback bugs in node.js

One of the most frustrating things that happens in a large node.js application is a double callback bug. They’re usually simple mistakes that are super tricky to track down. You may have seen one and not recognized it as such. In Express, one manifestation is Error: Can't set headers after they are sent; another one I’ve seen is an EventEmitter with an error event handler registered with ee.once('error', handler) that crashes the process saying it has an unhandled error – the first callback fires the error handler, the second triggers another error and since it was bound with once, it crashes. Sometimes they’re heisenbugs, where one path through a race condition resolves successfully, but another will manifest a crash or strange behavior.
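The EventEmitter case can be reproduced in a few lines. This is a minimal sketch with made-up error messages: the first error is consumed by the once handler, and the second emit finds no listener, so the emitter throws it. The try/catch here only exists to observe that throw; in a real program nothing would catch it.

```javascript
var EventEmitter = require('events').EventEmitter;

var ee = new EventEmitter();
var handled = [];
var crashed = null;

// Bound with once: removed after the first 'error' event.
ee.once('error', function (err) {
    handled.push(err.message);
});

ee.emit('error', new Error('first'));       // handled
try {
    ee.emit('error', new Error('second'));  // no listener left, so emit throws
} catch (e) {
    crashed = e;
}

console.log(handled, crashed && crashed.message);
// prints [ 'first' ] second
```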

The causes can be simple – here’s one:

function readJsonAsync(cb) {
    fs.readFile('file.json', 'utf-8', function (err, data) {
        if (err) {
            cb(err);
        }

        cb(null, JSON.parse(data));
    });
}

Can you spot it?

The error callback doesn’t end the function.

function readJsonAsync(cb) {
    fs.readFile('file.json', 'utf-8', function (err, data) {
        if (err) {
            return cb(err);
        }

        cb(null, JSON.parse(data));
    });
}

This version works more acceptably if fs.readFile gives us an error. Now let’s consider what happens when there’s a JSON parse error: This crashes, since an exception thrown by JSON.parse will unwind up the stack back to fs.readFile’s handler in the event loop, which has no try/catch and will crash your process with an uncaughtException. Let’s add an exception handler.

function readJsonAsync(cb) {
    fs.readFile('file.json', 'utf-8', function (err, data) {
        if (err) {
            return cb(err);
        }

        try {
            cb(null, JSON.parse(data));
        } catch (e) {
            cb(e);
        }
    });
}

Yay! That way if the JSON fails to parse, we’ll get the error in the callback. Nice and tidy, right?

Not so fast. What if cb throws an exception, like in this calling code:

readJsonAsync(function (err, json) {
    if (err) {
        return console.warn("Fail!", err);
    }

    console.log("Success! Got all kinds of excitement! Check this out!");
    console.log(json.exciting.thing.that.does.not.exist);
});

Whoops. That last line throws TypeError: Cannot read property 'thing' of undefined.

That goes back to the callback function and the try/catch block, and we call back again with the error. Our callback gets called twice – which isn’t so bad with things that don’t care like console.log and console.warn, but even then, the output is confusing:

Success! Got all kinds of excitement! Check this out!
Fail! TypeError: Cannot read property 'thing' of undefined

It both worked and didn’t work! If whatever we’re calling throws an exception on a double callback, that would crash our program. If it silently ignores second callbacks instead, it would eat the error, and we’d be left wondering why our program was misbehaving.

We’ve also made a tricky conundrum here. There are a lot of ways to solve it, from ignoring multiple callbacks, like so (this example uses the once module):

var once = require('once');

function readJsonAsync(cb) {
    cb = once(cb);
    fs.readFile('file.json', 'utf-8', function (err, data) {
        if (err) {
            return cb(err);
        }

        try {
            cb(null, JSON.parse(data));
        } catch (e) {
            cb(e);
        }
    });
}

to crashing more obviously because we just don’t handle the caller’s exception, like so:

function readJsonAsync(cb) {
    fs.readFile('file.json', 'utf-8', function (err, data) {
        if (err) {
            return cb(err);
        }

        var parsed;
        try {
            parsed = JSON.parse(data);
        } catch (e) {
            return cb(e);
        }

        cb(null, parsed);
    });
}

or one where we use setImmediate (or more tidily, check out the dezalgo package or the async package’s async.ensureAsync):

function readJsonAsync(cb) {
    fs.readFile('file.json', 'utf-8', function (err, data) {
        if (err) {
            return cb(err);
        }

        try {
            var parsed = JSON.parse(data);
            setImmediate(function () {
                cb(null, parsed);
            });
        } catch (e) {
            setImmediate(function () {
                cb(e);
            });
        }
    });
}

This means that the caller of readJsonAsync is on their own to handle their exceptions. No warranties, if it breaks, they get to keep both pieces, et cetera. But there’s no double callbacks!
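The setImmediate pattern can be factored into a small wrapper. What follows is a simplified stand-in that always defers, not the implementation of dezalgo or async.ensureAsync (those are more careful about when they defer); the name deferCallback is made up for this sketch.

```javascript
// Hypothetical helper: wrap a callback so it always runs on a later tick,
// outside of any try/catch block that surrounds the call site.
function deferCallback(cb) {
    return function () {
        var self = this;
        var args = arguments;
        setImmediate(function () {
            cb.apply(self, args);
        });
    };
}

var log = [];
var deferred = deferCallback(function (err, value) {
    log.push(value);
});

deferred(null, 'parsed');
// The callback has been scheduled, but has not run yet.
var calledSynchronously = log.length > 0;
console.log(calledSynchronously); // prints false
```

Because the wrapped callback runs from the event loop, an exception it throws can no longer land in the catch block of the function that called it.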

So this gets tricky when you have a whole chain of things – someone’s made a mistake in something “so simple it can’t go wrong!” like a readFile callback that parses JSON, but the double callback comes out miles away, in a callback to something in a callback to something in a callback to something in a callback that calls readJsonAsync. This isn’t an uncommon scenario – every Express middleware is a callback, every call to next calls another. Every composed callback-calling function is another layer. The distance can get pretty severe sometimes. This is one of the less-loved benefits of promises: errors are much more isolated there, and the error passing is much more explicit. I think it’s a more important point than a lot of things about promises. But that’s neither here nor there. What we’re asking is:

How do we debug doubled callbacks?!

My favorite way is to write a function that will track a double callback and log the stack trace of both paths. This is a bit like the once package, but with error logging.

Here’s a simple version.

function justOnceLogIfTwice(cb) {
    var last;
    return function () { // return a new function wrapping the old one.
        if (!last) {
            last = new Error(); // Save this for later in case we need it.
            cb.apply(this, arguments); // Call the original callback
        } else {
            var thisTime = new Error("Called twice!");
            console.warn("Callback called twice! The first time is", last.stack, "and the next time is", thisTime.stack);
            // optionally, we might crash the program here if we want to be loud about errors. Like so:
            setImmediate(function () {
                // This is an "async throw" -- it can only be caught by error domains or the `uncaughtException` event on `process`.
                throw thisTime;
            });
        }
    };
}

We can then wrap our callbacks in it:

function readJsonAsync(cb) {
    cb = justOnceLogIfTwice(cb);
    fs.readFile('file.json', 'utf-8', function (err, data) {
        if (err) {
            return cb(err);
        }

        try {
            cb(null, JSON.parse(data));
        } catch (e) {
            cb(e);
        }
    });
}

Now we just have to trigger the error, and we should get two stack traces, one from the success path and one from the error path.
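If crashing the process is too loud, say in a test suite, a quieter variant can hand both stack traces to a reporting function instead. This is a sketch under that assumption; trackCalls and its report parameter are hypothetical names, not part of the once package or any other library.

```javascript
// Hypothetical variant of the wrapper: report a second call instead of throwing.
function trackCalls(cb, report) {
    var first;
    return function () {
        if (!first) {
            first = new Error('first call'); // remember where the first call came from
            return cb.apply(this, arguments);
        }
        // Second (or later) call: pass along both stacks, but do not call cb again.
        report(first.stack, new Error('second call').stack);
    };
}

var results = [];
var reports = [];
var wrapped = trackCalls(function (err, value) {
    results.push(err ? 'error' : value);
}, function (firstStack, secondStack) {
    reports.push([firstStack, secondStack]);
});

wrapped(null, 'ok');         // success path, delivered
wrapped(new Error('oops'));  // error path, reported rather than delivered
console.log(results, reports.length);
// prints [ 'ok' ] 1
```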

Other ways? Set breakpoints on the calls to cb. See what the program state is at each of them.

Try to make a reproduction case. Good luck: it’s hard.

Add once wrappers to callbacks until you find the problem. Move them deeper and deeper until you find the actual source.

Give extra scrutiny to non-obvious error paths. If you can’t spot where errors go, I’d bet money on finding part or all of the bug in there.

Add an async-tracking stack trace module like long-stack-traces or longjohn. They slow your program down and can change the behavior because of the tricks they do to get long traces, but they can be invaluable if they don’t disturb things too much.

Consider using this eslint rule to catch the simpler cases – it won’t catch all of them, but it’ll at least catch the missing return case.

Good luck!

To leave one amazing team for another

This is part announcement, part job advertisement, part musing on what it’s like to work with a really amazing team.

I’m leaving PayPal in the first week of August to join the fine people at npm, inc as the architect of the web site. It was actually one of the toughest decisions I’ve had to make, because while npm is the company I absolutely most want to work for, I really, really like my team at PayPal. I can’t think of any other company I’d leave my team for. They are kind, hard-working, honest, visionary but not obnoxiously opinionated. I’ve been given a huge amount of trust while I was there, and I’ve produced some great work. As one of my last acts for the team, I want to find someone to replace me.

For the past year, I’ve been working on KrakenJS at PayPal, doing largely open source development, and supporting application teams internally. The Kraken team is a unique team in a unique spot in the company. Our job is the open source project, advocacy for our internal developers, technological leadership, and creating common infrastructure when we can identify problems that multiple teams have. We do research and experiment with new technologies – both to vet them for stability, and to find places that will be error-prone and require caution or will impact long term maintenance.

I spent most of my year working on internationalization components. This wasn’t exactly assigned work – though someone really did need to do that work, so I jumped in and did it – but there are a lot of things that need attention and the point of the project is to serve its users’ needs. It’s not there to enforce an opinion, just to solve problems, and so it does and we do. The team has worked a lot on rough consensus and running code. If someone has an idea, they prototype it and show it off to the team. Ownership is collective, but everyone takes responsibility.

Originally, Kraken was a prototyping tool used internally. The original team was taking a rough stab at some early componentizing and tooling for purely front-end work, but as time passed, the real niche showed up: an enterprise-friendly, structured but not too restrictive framework for making front-end services, first as a prototype for Java services that were not yet ready, and later, to replace those services with a node.js-based front tier. Application teams are now integrated, full-stack teams, building both in-browser and server-side components together. This has allowed a pretty unprecedented pace of development within PayPal, and in the past two and a half years, nearly every customer-facing application has been rewritten. That’s a huge amount of success enabled by the experimentation and resourcefulness of this small team. There are recordings of conference talks about this.

Recently, the team has been merged with some of the core node.js infrastructure team, now responsible for both internal architecture modules and the open source project. While the split loyalties to open source and to the internal company work are annoying, it actually works really well that way. PayPal is credibly the single largest enterprise use of node.js. I think we’ve got more developers using it than any other company, and certainly have based a large portion of our architecture on it. If someone’s having a problem with node, chances are we’ve seen the error and may well have found patterns or workarounds for development problems, and we work on getting bugs fixed upstream.

An example of one of the trickier bugs was diagnosing a memory leak in io.js. You can see the back-and-forth with Fedor Indutny and my team on that issue, trying to diagnose what’s going on. Credit to Fedor: he knows the source of io.js better than anyone I know, particularly the TLS parts, and made tidy work of fixing it, but instrumenting, diagnosing and tracing that leak was a weeks-long process, starting in-house with monitoring noticing that a service running iojs behaved differently than the version running node 0.10 or 0.12. From there, building a diagnostic framework to track what’s going on, and really digging in, let us make a bug report of this caliber. Not every bug, or even many of them, involves that kind of to-the-metal investigation, but the team can figure out anything. They are great, kind, wonderful people.

It’s not all roses. There’s a lot of legacy baggage within the company, as any company that size and age is going to have. Enterprise constraints and organization have their own weight. Some people are resistant to change, and not every developer wants to do an amazing job in the company. Moving to new technologies and ways of doing things still requires backward compatibility and migration paths, but having tools like semantic versioning and node.js’s module structure has helped a lot. Tools like GitHub Enterprise, Asana, Slack and HipChat have their roles in enabling this kind of change.

My workday at PayPal goes something like this:

  • An hour of technical reading – maybe about babel or one of Brendan Gregg’s performance blog posts or one of Thorsten Lorenz’s blog posts or internals of node.
  • Follow up on application crash emails – perhaps chase down a team or two that is ignoring, or doesn’t know, that their app is crashing, and help diagnose what’s really going on and try to get it fixed.
  • Review pull requests and issues on the Kraken open source project and its modules. It’s not an overwhelming pace, but there’s something most mornings.
  • Work for a couple hours on the modules of Kraken or internal infrastructure integration that most need it.
  • Answer internal support email about node.js and guide developers internally on how to avoid problems.
  • Maybe do a code review of an internal application, and give feedback about problems they’re likely to run into.
  • Advocate for improvements to internal infrastructure.
  • Help people on IRC between things.

In addition, I’ve spoken at several conferences, some of which PayPal has sponsored, some independently. It’s been intense but a very good experience.

It’s been a great honor to work with these fine people. Given the chance to again, there are not many places I would choose over them.

Port numbers and URLs

Today someone asked on the node.js mailing list why the URL that Express.js gave them to access their application had a port number in it, and whether they could get rid of it (since other sites don’t have it).

My explanation is this:

There are some interesting details to this!

Each service on the Internet has a port assigned to it by a group called IANA. http is port 80, ssh is 22, https is 443, xmpp is 5222 (and a few others, because it’s complicated), pop3 is 110 and imap is 143. If the service is running on its normal port, clients don’t usually need to be told the port because they can just assume the usual one. In http URLs, this lets us leave the port number out – http://example.org/ and http://example.org:80/ in theory identify the same thing. Some systems treat them as ‘different’ when comparing, but they access the same resource.
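Node's own URL parser shows the default-port rule in action. A small sketch using the WHATWG URL class from the built-in url module: the explicit :80 is dropped for http, while a non-default port is kept.

```javascript
var URL = require('url').URL;

var a = new URL('http://example.org/');
var b = new URL('http://example.org:80/');
var c = new URL('http://localhost:8080/');

console.log(a.href === b.href);       // prints true: both normalize to http://example.org/
console.log(JSON.stringify(b.port));  // prints "": the default port is left implicit
console.log(c.port);                  // prints 8080
```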

Now if you’re not on the default port, you have to specify – so Express apps in particular suggest you access http://localhost:8080/ (or 3000 – there are a couple of common ports for “this is an app fresh off of a generator, customize from here”). This is actually just a hint – usually they listen to more than localhost, and the report back for the URL is actually not very robust, but it works enough to get people off the ground while they learn to write web services.

If you run your app on port 80, you won’t need that.

However!

Unix systems restrict ports under 1024 as reserved for the system – a simple enough restriction to keep a user from starting up something in place of a system service at startup time, in the era of shared systems. That means you have to run something as root to bind port 80, unless you use special tools. There are a few options. A tool called authbind (found most commonly on Debian-derived Linuxes) lets you bind a privileged port without being root. One can call process.setgid and process.setuid to relinquish root privilege after binding (a common tactic in classic unix systems), though there are some fiddly details there that could leave you exposed if someone manages to inject executable code into what you’re running. And finally, one can proxy from a ‘trusted’ system daemon to your app on some arbitrary port – nginx is a popular choice for this, as are haproxy, stunnel and others.
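The bind-then-drop tactic looks roughly like this in node. It is a minimal sketch, not a hardened recipe: the helper name listenThenDropPrivileges is made up, the user and group you pass are up to you, and it does nothing about supplementary groups, which is one of the fiddly details mentioned above.

```javascript
var http = require('http');

// Hypothetical helper: bind the port first, then give up root if we have it.
function listenThenDropPrivileges(port, user, group, done) {
    var server = http.createServer(function (req, res) {
        res.end('hello\n');
    });

    // Report a failure to bind (for example EACCES on port 80 without root).
    function onError(err) {
        done(err);
    }
    server.once('error', onError);

    server.listen(port, function () {
        server.removeListener('error', onError);
        // process.getuid does not exist on every platform (for example Windows).
        if (process.getuid && process.getuid() === 0) {
            process.setgid(group); // group first: once setuid runs we can no longer change it
            process.setuid(user);
        }
        done(null, server);
    });
}
```

Started as root, a call such as listenThenDropPrivileges(80, 'www-data', 'www-data', callback) would bind port 80 and then continue as an unprivileged user; the www-data account name is only an example and may not exist on your system.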

Now as to why it’s just a hint: the problem of an app figuring out its own URL(s) is actually very hard, unsolvable often even in simple cases, given the myriad of things we do to networking – NAT and proxies in particular confuse this – and that there’s no requirement to be able to look up a hostname for an IP address, even if the hostname can be looked up to get the IP address. None of this matters for localhost though, which has a nice known name and a nice known IP and most people do development on their own computers, and so we can hand-wave all this complexity away until later, after someone has something up and running.

Temporal Coupling is bad

In reviewing the source to express.js I came across a reasonably compact example of temporal coupling.

This is badly factored, and I’ll lay out why:

Temporal coupling is the reliance on a certain sequence of calls or checks to function, rather than having them explicitly called in order in a function. “this, then this, then this have to be called before the state you look at here will be present” is how it works out.

The bits of application.js that call the view are the start of it – the view could be there! Or not! Make one maybe!

if (!view) {
    view = new (this.get('view'))(name, {
        defaultEngine: this.get('view engine'),
        root: this.get('views'),
        engines: engines
    });

That’s reasonably well guarded, because it checks that it’s not there, and sets one up if it’s not already there. But if it was cached previously, and so already set, we’re now dependent on that state, which could have been set in an entirely different way. The only thing that saves us is that the cache is pretty well private.

Then there is the bit that looks at an instance variable that happens to be set by the constructor in this version:

if (!view.path) {
    var dirs = Array.isArray(view.root) && view.root.length > 1
        ? 'directories "' + view.root.slice(0, -1).join('", "') + '" or "' + view.root[view.root.length - 1] + '"'
        : 'directory "' + view.root + '"'
    var err = new Error('Failed to lookup view "' + name + '" in views ' + dirs);
    err.view = view;
    return fn(err);
}

So now we’ve got temporal coupling between the view’s constructor setting an instance variable and our calling code. This error check is performed synchronously after the construction of the object, which is sad, because that coupling means that any asynchronous looking up of that path is now not available to us without hackery. This is exactly what’s being introduced in Express 5, and so this calling code has to be decoupled.

This is a minor case of temporal coupling, but those pieces of Express know way too much about each other, in ways that make refactoring it more invasive.

There’s a style of programming where the inner components are written first, and the outer ones are then written assuming the inner ones are append-only. I think that leads to this sort of one-way coupling.

Contrast these two places – in the View constructor:

this.path = this.lookup(name);

Where the lookup method (via some convoluted path) only returns a value when the path exists on disk:

path = join(dir, basename(file, ext), 'index' + ext);
stat = tryStat(path);

if (stat && stat.isFile()) {
    return path;
}

And in the render method:

View.prototype.render = function render(options, fn) {
    this.engine(this.path, options, fn);
};

So now the render method is only safe to call if this.path is set, and we’re temporally coupled to this sequence:

var view = new View(args);
if (view.path) {
    view.render(renderArgs);
}

Without that sequence – instantiate, check for errors, render if good or error if not – it’ll explode, because render never validates that this.path is set.

It’s okay to temporally couple to instantiation in general – it’s not like you can call a method without an instance, not sensibly – but to that error check being required by the outside caller? That’s a terrible convention, and the whole thing would be much better enveloped in a method that spans the whole process – and in this case, an asynchronous one, so that the I/O done validating that the path exists doesn’t have to be synchronous.

So to fix this case, what I would do is to refactor the render method to include all the checks – move the error handling out of the caller, into render or something called by it. In this case, the lookup method is a prime candidate, since it’s what determines whether something exists, and the error concerns whether or not it exists.
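A sketch of what that refactoring could look like, using a deliberately simplified, hypothetical View rather than Express's actual class. The constructor does no I/O, lookup is asynchronous and owns the "does it exist" error, and render spans the whole sequence so callers never have to inspect view.path themselves.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical, much-simplified View; not Express's real implementation.
function View(name, options) {
    this.name = name;
    this.root = options.root;     // a single directory, for brevity
    this.engine = options.engine; // function (file, options, fn)
}

// Asynchronously resolve the file, reporting a missing view as an error.
View.prototype.lookup = function lookup(fn) {
    var name = this.name;
    var root = this.root;
    var file = path.join(root, name);
    fs.stat(file, function (err, stat) {
        if (err || !stat.isFile()) {
            return fn(new Error('Failed to lookup view "' + name + '" in views directory "' + root + '"'));
        }
        fn(null, file);
    });
};

// render owns the whole sequence: look up, check, then hand off to the engine.
View.prototype.render = function render(options, fn) {
    var engine = this.engine;
    this.lookup(function (err, file) {
        if (err) {
            return fn(err);
        }
        engine(file, options, fn);
    });
};
```

The caller now only writes new View(name, opts).render(options, fn) and handles the error in fn, with no ordering requirement beyond construction.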

Handling Errors in node.js

There are roughly four kinds of errors you run into in node.

synchronous code, and throw is usually limited to application logic, synchronous decisions being made from information already on hand. They can also arise from programmer error – accessing properties or functions of undefined are among the most common errors I see.

If you are calling a callback in an asynchronous context provided by another module or user, it’s smart to guard these with try/catch blocks, and direct the error into your own error emission path.

The naive implementation can fail badly:

function doAThing(intermediateCallback, doneCallback) {
    setImmediate(function () {
        var result = intermediateCallback('someValue');
        doneCallback(null, result);
    });
}

The above will crash if intermediateCallback throws an exception. Instead, guard this:

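function doAThing(intermediateCallback, doneCallback) {
    setImmediate(function () {
        var result;
        try {
            result = intermediateCallback('someValue');
        } catch (e) {
            return doneCallback(e);
        }
        // Called outside the try block, so an exception thrown by
        // doneCallback itself cannot cause it to be called a second time.
        doneCallback(null, result);
    });
}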

This is important since a synchronous throw in an asynchronously called function ends up becoming the next kind of error:

asynchronous calls and throw will crash your process. If you’re using domains, then it will fall back to the domain error handler, but in both cases, this is either uncatchable – a try/catch block will have already exited the block before the call is made – or you are completely without context when you catch it, so you won’t be able to usefully clean up resources allocated during the request that eventually failed. The only hope is to catch it in a process.on('uncaughtException') handler or domain handler, clean up what you can – close or delete temp files or undo whatever is being worked on – and crash a little more cleanly.

Anything meant to be called asynchronously should never throw. Instead, callbacks should be called with an error argument: callback(new Error("Error message here")); This makes the next kind of error,

asynchronous calls with an error parameter in the callback receive the error as a parameter – either as a separate callback for errors, or in node, much more commonly the “error first” style:

doThing(function (err, result) {
    // Handle err here if it's a thing, use result if not.
});

This forces the programmer to handle or propagate the error at each stage.

The reason the error argument is first is so that it’s hard to ignore. If your first parameter is err and you don’t use it, you are likely to crash if you get an error, since you’ll only look at the success path.

With the iferr module, you can get promise-like short-circuiting of errors:

var iferr = require('iferr');

function doThing(makeError, cb) {
    setImmediate(function () {
        if (makeError) {
            cb(new Error('gives an error'));
        } else {
            cb(null, "no error!");
        }
    });
}

doThing(true, iferr(console.warn, function (result) {
    console.log(result);
})); // This call warns with the error

doThing(false, iferr(console.warn, function (result) {
    console.log(result);
})); // This call logs the "no error!" message.

Using promises also gives this short-circuit error behavior, but you get the error out of the promise with the .catch method. In some implementations, if an error happens and you haven’t set up what happens to it, it will throw after a process tick. Similarly, event emitters with unhandled error events throw an exception. This leads to the fourth kind of error:

asynchronous event emitters or promises, and error handlers

An event emitter that can emit an error event should have a handler set up.

emitter.on('error', function (err) {
    // handle error here, or call out to other error handler
});

promise.catch(function (err) {
    // Same here: handle it.
});

If you don’t do this, your process will crash or the domain handler will fire, and you should crash there. (Unless your promises don’t handle this case, in which case your error is lost and you never know it happened. Also not good.)
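Here is a small sketch of the difference a handler makes, using node’s built-in EventEmitter. The try/catch only works in this demonstration because emit is called synchronously right inside it; in real code the emit usually happens later, long after any surrounding try block has exited.

```javascript
var EventEmitter = require('events').EventEmitter;

var unhandled = new EventEmitter();
var thrown = null;
try {
    // No 'error' listener is attached, so emit throws the error itself.
    unhandled.emit('error', new Error('nobody is listening'));
} catch (err) {
    thrown = err;
}
console.log(thrown.message); // nobody is listening

var handled = new EventEmitter();
var seen = null;
handled.on('error', function (err) {
    // With a listener attached, the error arrives here instead.
    seen = err;
});
handled.emit('error', new Error('somebody is listening'));
console.log(seen.message); // somebody is listening
```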

How to Read Source Code

This is based on a talk I gave at Oneshot Nodeconf Christchurch.

I almost didn’t write this post. It seems preposterous that there are any programmers who don’t read source code. Then I met a bunch of programmers who don’t, and I talked to some more who wouldn’t read anything but the examples and maybe check if there are tests. And most of all, I’ve met a lot of beginning programmers who have a hard time figuring out where to start.

What are we reading for? Comprehension. Reading to find bugs, to find interactions with other software in a system. We read source code for review. We read to see the interfaces, to understand and to find the boundaries between the parts. We read to learn!

Reading isn’t linear. We think we can read source code like a book. Crack the introduction or README, then read through from chapter one to chapter two, on toward the conclusion. It’s not like that. We can’t even prove that a great many programs have conclusions. We skip back and forth from chapter to chapter, module to module. We can read the module straight through but we won’t have the definitions of things from other modules. We can read in execution order, but we won’t know where we’re going more than one call site down.

Do you start at the entry point of a package? In a node module, the index.js or the main script?

How about in a browser? Even finding the entry point, which files get loaded and how, is a key task. Figuring out how the files relate to each other is a great place to start.

Other places to start are to find the biggest source code file and read that first, or try setting a breakpoint early and tracing down through functions in a debugger, or try setting a breakpoint deep in something meaty or hard to understand and then read each function in the call stack.

We’re used to categorizing source code by the language it’s written in, be it Javascript, C++, ES6, Befunge, Forth, or LISP. We might tackle a familiar language more easily, but not look at the parts written in a language we’re less familiar with.

There is another way to think of kinds of source code, which is to look at the broad purpose of each part. Of course, many times, something does more than one thing. Figuring out what it’s trying to be can be one of the first tasks while reading. There are a lot of ways to describe categories, but here are some:

Glue has no purpose other than to adjust interfaces between parts and bind them together. Not all the interfaces we want to use play nicely together, where the output of one function can be passed directly to the input of another. Programmers make different decisions about the styles of interface. Glue also covers adapters between systems where there are no rich data types, such as fields from a web form, all represented as strings, and the functions and objects that expect them to be represented more specifically. The way errors are handled often varies, too.

Connecting a function that returns a promise to something that takes a callback involves glue; inflating arguments into objects, or breaking objects apart into variables are all glue.

This is from Ben Drucker’s stream-to-promise:

internals.writable = function (stream) {
    return new Promise(function (resolve, reject) {
        stream.once('finish', resolve);
        stream.once('error', reject);
    });
};

In this, we’re looking for how two interfaces are shaped differently, and what’s common between them. The two interfaces involved are node streams and promises.

In common they have the fact that they do work until they have a definite finish. In streams, with the finish event, and with promises by calling the resolution function. One thing to notice while you read this is that promises can only be resolved once, but streams can emit the same event multiple times. They don’t usually, but as programmers we usually know the difference between can’t and shouldn’t.
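Glue in the other direction, connecting a function that returns a promise to code that expects a callback, can be sketched like this. Both withCallback and addLater are names I made up for the illustration, not part of stream-to-promise or any other module.

```javascript
// Wrap a promise-returning function so it takes an error-first callback.
function withCallback(promiseFn) {
    return function () {
        var args = Array.prototype.slice.call(arguments);
        var cb = args.pop(); // by convention the callback is the last argument
        promiseFn.apply(this, args).then(function (value) {
            cb(null, value);
        }, function (err) {
            cb(err);
        });
    };
}

// Hypothetical promise-returning function to adapt.
function addLater(a, b) {
    return Promise.resolve(a + b);
}

var addWithCallback = withCallback(addLater);
addWithCallback(2, 3, function (err, sum) {
    console.log(err, sum); // null 5
});
```

Reading this the way the section suggests: a rejection becomes the err argument, and one thing to watch for is what happens if the callback itself throws, since that exception lands inside the promise chain rather than with the caller.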

Here’s more glue, the sort you find when dealing with input from web forms.

var record = {
    name: (input.name || '').trim(),
    age: isNaN(Number(input.age)) ? null : Number(input.age),
    email: validateEmail(input.email.trim())
}

In cases like this, it’s good to read with how errors are handled in mind. Look for which things in this might throw an exception, and which handle errors by altering or deleting a value.

Are these appropriate choices for the place where this exists? Do some of these conversions lose information, or are they just cleaning up into a canonical form?
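To make that reading concrete, here is the same record wrapped in a function, with a stand-in validateEmail. The original doesn’t show that function, so this version is purely an assumption: it returns the address, or null if it doesn’t look like one.

```javascript
// Hypothetical validator: returns the address, or null if it looks wrong.
function validateEmail(email) {
    return /^[^@\s]+@[^@\s]+$/.test(email) ? email : null;
}

function toRecord(input) {
    return {
        name: (input.name || '').trim(),
        age: isNaN(Number(input.age)) ? null : Number(input.age),
        email: validateEmail(input.email.trim())
    };
}

// A bad age is silently replaced with null; whitespace is cleaned up.
console.log(toRecord({ name: ' Ann ', age: 'abc', email: ' ann@example.com ' }));
// { name: 'Ann', age: null, email: 'ann@example.com' }

// A missing email throws, because undefined has no trim method.
try {
    toRecord({ name: 'Ann', age: '30' });
} catch (err) {
    console.log(err instanceof TypeError); // true
}
```

So three fields get three different error styles: name defaults, age is nulled out, and email throws. An empty string for age is worth noticing too, since Number('') is 0 rather than NaN.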

Interface-defining code is one of the most important kinds. It’s what makes the outside boundary of a module, the surface area that other programmers have to interact with.

From node’s events.js

exports.usingDomains = false;

function EventEmitter() { }
exports.EventEmitter = EventEmitter;

EventEmitter.prototype.setMaxListeners = function setMaxListeners(n) { };
EventEmitter.prototype.emit = function emit(type) { };
EventEmitter.prototype.addListener = function addListener(type, listener) { };
EventEmitter.prototype.on = EventEmitter.prototype.addListener;
EventEmitter.prototype.once = function once(type, listener) { };
EventEmitter.prototype.removeListener = function removeListener(type, listener) { };
EventEmitter.prototype.removeAllListeners = function removeAllListeners(type) {};
EventEmitter.prototype.listeners = function listeners(type) { };
EventEmitter.listenerCount = function(emitter, type) { };

We’re defining the interface for EventEmitter here.

Look for whether this is complete. Look for internal details being exposed. The usingDomains flag in this case is exposed to the outside world: node domains have a system-wide effect, and debugging that is very difficult, so that detail is shown outside the module.

Look for what guarantees these functions make.

Look for how namespacing works. Will the user be adding their own functions, or does this stand on its own, and the user of this interface will keep their parts separate?

Like glue code, look for how errors are handled and exposed. Is that consistent? Does it distinguish errors due to internal bugs from errors because the user made a mistake?

If you have strong interface contracts or guards, this is where you should expect to find them.
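A guard at the boundary might look something like this sketch. The setLimit and applyLimit functions are invented for the example; the point is only where the checks live and how a caller mistake is reported.

```javascript
// Hypothetical public function: the guards live at the module boundary.
function setLimit(n) {
    // Caller mistakes are reported with errors that name the argument.
    if (typeof n !== 'number' || isNaN(n)) {
        throw new TypeError('n must be a number');
    }
    if (n < 0) {
        throw new RangeError('n must not be negative');
    }
    return applyLimit(n);
}

// Internal helper: assumes the guards above already ran.
function applyLimit(n) {
    return { limit: Math.floor(n) };
}

console.log(setLimit(3.7).limit); // 3
```

Anything thrown from inside applyLimit would then point at an internal bug, not at the caller.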

Implementation once it’s separated from the interface and the glue is one of the more studied parts of source code, and where books on refactoring and source code style aim much of their advice.

From Ember.Router:

startRouting: function() {
    this.router = this.router || this.constructor.map(K);

    var router = this.router;
    var location = get(this, 'location');
    var container = this.container;
    var self = this;
    var initialURL = get(this, 'initialURL');
    var initialTransition;

    // Allow the Location class to cancel the router setup while it refreshes
    // the page
    if (get(location, 'cancelRouterSetup')) {
        return;
    }

    this._setupRouter(router, location);

    container.register('view:default', _MetamorphView);
    container.register('view:toplevel', EmberView.extend());

    location.onUpdateURL(function(url) {
        self.handleURL(url);
    });

    if (typeof initialURL === "undefined") {
        initialURL = location.getURL();
    }
    initialTransition = this.handleURL(initialURL);
    if (initialTransition && initialTransition.error) {
        throw initialTransition.error;
    }
},

This is the sort of code that always needs more documentation about why it is how it is, and not so much about what these parts do. Implementation source code is where the every-day decisions about how something is built live, the parts that make this module do what it does.

Look at how this fits into its larger whole.

Look for what’s coming from the public interface to this module, look for what needs validation. Look for what other parts this touches – whether they share properties on an object or variables in a closure or call other functions.

Look at what would be likely to break if this gets changed, and look to the test suite to see that that is being tested.

Look for the lifetime of these variables. This particular case is an easy one: This looks really well designed and doesn’t store needless state with a long lifetime – though maybe we should look at _setupRouter next if we were reading this.

You can look to understand the process entailment of a method or function, the things that were required to set up the state, the process entailed in getting to executing this. Looking forward from potential call sites, we can ask “How much is required to use this thing correctly?”, and as we read the implementation, we can ask “If we’re here, what got us to this point? What was required to set this up so that it works right?”

Is that state explicit, passed in via parameters? Is it assumed to be there, as an instance variable or property? Is there a single path to get there, with an obvious place that state is set up, or is it diffuse?

Algorithms are a kind of special case of implementation. It’s not so exposed to the outside world, but it’s a meaty part of a program. Quite often it’s business logic or the core processes of the software, but just as often, it’s something that has to be controlled precisely to do its job with adequate speed. There’s a lot of study of algorithmic source code out there because that’s what academia produces as source code.

Here’s an example:

function Grammar(rules) {
    // Processing The Grammar
    //
    // Here we begin defining a grammar given the raw rules, terminal
    // symbols, and symbolic references to rules
    //
    // The input is a list of rules.
    //
    // The input grammar is amended with a final rule, the 'accept' rule,
    // which if it spans the parse chart, means the entire grammar was
    // accepted. This is needed in the case of a nulling start symbol.
    rules.push(Rule('_accept', [Ref('start')]));
    rules.acceptRule = rules.length - 1;

    // Build a list of all the symbols used in the grammar so they can be numbered instead of referred to
    // by name, and therefore their presence can be represented by a single bit in a set.
    function censusSymbols() {
        var out = [];
        rules.forEach(function(r) {
            if (!~out.indexOf(r.name)) out.push(r.name);

            r.symbols.forEach(function(s, i) {
                var symNo = out.indexOf(s.name);
                if (!~out.indexOf(s.name)) {
                    symNo = out.length;
                    out.push(s.name);
                }

                r.symbols[i] = symNo;
            });

            r.sym = out.indexOf(r.name);
        });

        return out;
    }

    rules.symbols = censusSymbols();

This bit is from a parser engine I’ve been working on called lotsawa. Reads like a math paper, doesn’t it?

It’s been said a lot that good comments say why something is done or done that way, rather than what it’s doing. Algorithms usually need more explanation of what is going on since if they were trivial, they’d probably be built into our standard library. Quite often to get good performance out of something, the exact what-and-how matters a lot.

One of the things that you usually need to see in algorithms is the actual data structures. This one is building a list of symbols and making sure there are no duplicates.

Look also for hints as to the running time of the algorithm. You can see in this part, I’ve got two loops. In Big-O notation, that’s O(n * m), then you can see that there’s an indexOf inside that. That’s another loop in Javascript, so that actually adds another factor to the running time. (twice – looks like I could make this more optimal by re-using one of the values here)
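One way to drop the indexOf scans is to keep a name-to-number lookup next to the list. This is a sketch of that idea, not the code lotsawa uses. It assumes rules shaped like the ones in the excerpt (a name plus a list of symbols that each have a name), takes the rules as a parameter instead of closing over them, and replaces each symbols array with a new one.

```javascript
// Same census, but each symbol costs one property lookup instead of a scan.
function censusSymbols(rules) {
    var out = [];
    var numberOf = Object.create(null); // name -> position in out

    function number(name) {
        if (!(name in numberOf)) {
            numberOf[name] = out.length;
            out.push(name);
        }
        return numberOf[name];
    }

    rules.forEach(function (r) {
        r.sym = number(r.name);
        r.symbols = r.symbols.map(function (s) {
            return number(s.name);
        });
    });

    return out;
}

var rules = [
    { name: 'start', symbols: [{ name: 'a' }, { name: 'b' }] },
    { name: 'a', symbols: [{ name: 'b' }] }
];
console.log(censusSymbols(rules)); // [ 'start', 'a', 'b' ]
console.log(rules[0].symbols);     // [ 1, 2 ]
```

The trade is a little extra memory for the lookup table in exchange for losing a factor from the running time.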

Configuration

The line between source code and configuration file is super thin. There’s a constant tension between having a configuration be expressive and having it be readable and direct.

Here’s an example using Javascript for configuration.

app.configure('production', 'staging', function() {
    app.enable('emails');
});

app.configure('test', function() {
    app.disable('emails');
});

What we can run into here is combinatorial explosion of options. How many environments do we configure? Then, how many things do we configure for a specific instance of that environment. It’s really easy to go overboard and end up with all the possible permutations, and to have bugs that only show up in one of them. Keeping an eye out for how many degrees of freedom the configuration allows is super useful.

Here is a bit of a kraken config file.

"express": {
    "env": "", // NOTE: `env` is managed by the framework. This value will be overwritten.
    "x-powered-by": false,
    "views": "path:./views",
    "mountpath": "/"
},

"middleware": {

    "compress": {
        "enabled": false,
        "priority": 10,
        "module": "compression"
    },

    "favicon": {
        "enabled": false,
        "priority": 30,
        "module": {
            "name": "serve-favicon",
            "arguments": [ "resolve:kraken-js/public/favicon.ico" ]
        }
    },

Kraken took a ‘low power language’ approach to configuration and chose JSON. A little more “configuration” and a little less “source code”. One of the goals was keeping that combinatorial explosion under control. There’s a reason a lot of tools use simple key-value pairs or ini-style files for configuration, even though they’re not terribly expressive. It’s possible to write config files for kraken that vary with a bunch of parameters, but it’s work and pretty obvious when you read it.

Configuration has some interesting and unique constraints that are worth looking for.

The lifetime of a configuration value is often determined by other groups of people. They usually vary somewhat independently of the rest of the source code – hence why they’re not built in as hard-coded values inline.

They often need machine writability, to support configuration-generation tools.

The people responsible are different than for regular source code. Systems engineers, operations and other people can be involved in its creation.

Configuration values often have to fit in weird places like environment variables, where there are no types, just string values.

They also often store security-sensitive information, and so won’t be committed to version control because of this.
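Here is a small sketch of what reading configuration out of environment variables tends to look like. The variable names (PORT, ENABLE_EMAILS, API_KEY) and the default port are made up for the example.

```javascript
// Hypothetical settings and variable names, for illustration only.
function readConfig(env) {
    var port = parseInt(env.PORT, 10);
    return {
        // Everything arrives as a string, or is missing entirely.
        port: isNaN(port) ? 3000 : port,
        // The string 'false' is truthy, so compare against 'true' explicitly.
        emails: env.ENABLE_EMAILS === 'true',
        // Secrets come from the environment, not from version control.
        apiKey: env.API_KEY || null
    };
}

console.log(readConfig({ PORT: '8080', ENABLE_EMAILS: 'false' }));
// { port: 8080, emails: false, apiKey: null }
```

In a real program the argument would be process.env. Passing it in as a parameter keeps the string-to-type glue easy to read and easy to test.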

Batches are an interesting case as well. They need transactionality. Often, some piece of the system needs to happen exactly once, and not at all if there’s an error. A compiler that leaves bad build products around is a great source of bugs. Double charging customers is bad. Flooding someone’s inbox because of a retry cycle is terrible. Look for how transactions are started and finished – clean-up processes, commit to permanent storage processes, the error handling.

Batch processes often need resumability: a way to continue where they left off, given the state of the system. Look for the places where perhaps unfinished state is picked up and continued from.

Batch processes are also often sequential. If they’re not strictly linear processes, there’s usually a very directed flow through the program. Loops tend to be big ones, around the whole process. Look for those.
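As a toy sketch of those three properties together, here is a batch loop that records its progress only after each item succeeds, so a later run can pick up where the failed one stopped. The runBatch function and its state object are invented for this example; a real system would persist the checkpoint somewhere durable.

```javascript
// Run each item once, recording progress so a later run can resume.
// `state` is a hypothetical checkpoint object.
function runBatch(items, state, processItem) {
    for (var i = state.next; i < items.length; i++) {
        processItem(items[i]); // may throw; state.next then still points here
        state.next = i + 1;    // commit progress only after success
    }
    return state;
}

var sent = [];
var state = { next: 0 };
var failOnce = true;

function send(item) {
    if (item === 'b' && failOnce) {
        failOnce = false;
        throw new Error('temporary failure');
    }
    sent.push(item);
}

try {
    runBatch(['a', 'b', 'c'], state, send);
} catch (err) {
    console.log(state.next); // 1
}

// The second run resumes at 'b' and does not send 'a' again.
runBatch(['a', 'b', 'c'], state, send);
console.log(sent); // [ 'a', 'b', 'c' ]
```

This only gives exactly-once behavior if processItem is itself all-or-nothing, which is exactly the kind of thing to check while reading.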

Reading Messy Code

So how do you deal with this?

      DuplexCombination.prototype.on = function(ev, fn) {
switch (ev) {
case 'data':
case 'end':
case 'readable':
this.reader.on(ev, fn);
return this
case 'drain':
case 'finish':
this.writer.on(ev, fn);
return this
default:
return Duplex.prototype.on.call(this, ev, fn);
}
};

You are seeing that right. That’s reverse indentation. Blame Isaac.

Put on your rose tinted glasses!

Try installing a tool like standard or jsfmt. Here’s what standard -F dc.js does to that reverse-indented Javascript:

DuplexCombination.prototype.on = function (ev, fn) {
  switch (ev) {
    case 'data':
    case 'end':
    case 'readable':
      this.reader.on(ev, fn)
      return this
    case 'drain':
    case 'finish':
      this.writer.on(ev, fn)
      return this
    default:
      return Duplex.prototype.on.call(this, ev, fn)
  }
}

It’s okay to use tools while reading! There’s no technique that’s “cheating”.

Here’s another case:

(function(t,e){if(typeof define==="function"&&define.amd){define(["underscore","
jquery","exports"],function(i,r,s){t.Backbone=e(t,s,i,r)})}else if(typeof export
s!=="undefined"){var i=require("underscore");e(t,exports,i)}else{t.Backbone=e(t,
{},t._,t.jQuery||t.Zepto||t.ender||t.$)}})(this,function(t,e,i,r){var s=t.Backbo
ne;var n=[];var a=n.push;var o=n.slice;var h=n.splice;e.VERSION="1.1.2";e.$=r;e.
noConflict=function(){t.Backbone=s;return this};e.emulateHTTP=false;e.emulateJSO
N=false;var u=e.Events={on:function(t,e,i){if(!c(this,"on",t,[e,i])||!e)return t
his;this._events||(this._events={});var r=this._events[t]||(this._events[t]=[]);
r.push({callback:e,context:i,ctx:i||this});return this},once:function(t,e,r){if(
!c(this,"once",t,[e,r])||!e)return this;var s=this;var n=i.once(function(){s.off
(t,n);e.apply(this,arguments)});n._callback=e;return this.on(t,n,r)},off:functio
n(t,e,r){var s,n,a,o,h,u,l,f;if(!this._events||!c(this,"off",t,[e,r]))return thi
s;if(!t&&!e&&!r){this._events=void 0;return this}o=t?[t]:i.keys(this._events);fo
r(h=0,u=o.length;h<u;h++){t=o[h];if(a=this._events[t]){this._events[t]=s=[];if(e
||r){for(l=0,f=a.length;l<f;l++){n=a[l];if(e&&e!==n.callback&&e!==n.callback._ca
llback||r&&r!==n.context){s.push(n)}}}if(!s.length)delete this._events[t]}}retur
n this},trigger:function(t){if(!this._events)return this;var e=o.call(arguments,
1);if(!c(this,"trigger",t,e))return this;var i=this._events[t];var r=this._event
s.all;if(i)f(i,e);if(r)f(r,arguments);return this},stopListening:function(t,e,r)
{var s=this._listeningTo;if(!s)return this;var n=!e&&!r;if(!r&&typeof e==="objec

Here’s the start of that after uglifyjs -b < backbone-min.js:

(function(t, e) {
    if (typeof define === "function" && define.amd) {
        define([ "underscore", "jquery", "exports" ], function(i, r, s) {
            t.Backbone = e(t, s, i, r);
        });
    } else if (typeof exports !== "undefined") {
        var i = require("underscore");
        e(t, exports, i);
    } else {
        t.Backbone = e(t, {}, t._, t.jQuery || t.Zepto || t.ender || t.$);
    }
})(this, function(t, e, i, r) {
    var s = t.Backbone;
    var n = [];
    var a = n.push;
    var o = n.slice;
    var h = n.splice;
    e.VERSION = "1.1.2";
    e.$ = r;
    e.noConflict = function() {

Human parts and guessing the intent of what you’re reading

There are a lot of tricks for figuring out what the author of something meant.

Look for guards and coercions

if (typeof arg != 'number') throw new TypeError("arg must be a number");

Looks like the domain of whatever function we’re in is ‘numbers’.

arg = Number(arg)

This coerces its input to be numeric. Same domain as above, but doesn’t reject errors via exceptions. There might be NaNs though. Probably smart to read and check if there are comparisons that will be false against those.

NaN behavior in javascript mostly comes from the behavior in the IEEE floating-point number spec, as a way to propagate errors out to the end of a computation so you don’t get an arbitrary bogus result, and instead get a known-bad value. In some cases, that’s exactly the technique you want.
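A quick demonstration of the comparisons to watch for after a coercion like that:

```javascript
var arg = Number('not a number');

console.log(arg > 0);     // false
console.log(arg <= 0);    // false
console.log(arg === arg); // false
console.log(isNaN(arg));  // true
console.log(arg + 1);     // NaN
```

Both branches of a range check come out false, so code written as "if it’s not greater than zero, it must be zero or less" quietly takes the wrong path.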

Look for defaults

arg = arg || {}

Default to an empty object.

arg = (arg == null ? true : arg)

Default to true only if a value wasn’t explicitly passed. Comparison to null with the == operator in Javascript is only true when what’s being compared is null or undefined – the two things that mean “nothing to see here” – this particular check hints that the author meant that any value is acceptable, as long as it was intended to be a value. false and 0 are both things that would override the default.

arg = (typeof arg == 'function' ? arg : function () {});

In this case, the guard uses a typeof check, and chooses to ignore its argument if it’s not the right type. A silent ignoring of what the caller specified.
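Putting the three defaulting styles side by side shows how differently they treat the same inputs. The wrapper function names here are only for the comparison.

```javascript
function withOr(arg) { return arg || {}; }
function withNullCheck(arg) { return (arg == null ? true : arg); }
function withTypeof(arg) { return (typeof arg == 'function' ? arg : function () {}); }

console.log(withNullCheck(undefined)); // true
console.log(withNullCheck(null));      // true
console.log(withNullCheck(false));     // false
console.log(withNullCheck(0));         // 0

// The || style replaces every falsy value, including 0 and ''.
console.log(withOr(0));                // {}

// The typeof style silently swaps in a do-nothing function.
console.log(typeof withTypeof('oops')); // function
```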

Look for layers

As an example, req and res from Express are tied to the web; how deep do they go? Are they passed down into every layer, or is there some glue that picks out specific values and calls functions with an interface directly related to its purpose?

Look for tracing

Are there inspection points?

Debug logs?

Do those form a complete narrative? Or are they ad-hoc leftovers from the last few bugs?

Look for reflexivity

Are identifiers being dynamically generated? If so, that means you won’t find them by searching the source code – you’ll have to think at a different level to understand parts of what’s going on.

Is there eval? Metaprogramming? New function creation?

func.toString() is your friend! You can print out the source of a callback argument and see what it looks like, you can insert all kinds of debugging to see what things do.
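For instance, a function that receives a callback can print what it was handed. runLater is a made-up name for the sketch.

```javascript
// Hypothetical function that receives a callback we want to inspect.
function runLater(callback) {
    // Print the source text of whatever was passed in.
    console.log(callback.toString());
    return callback(2);
}

var result = runLater(function double(n) { return n * 2; });
console.log(result); // 4
```

Built-in functions print a [native code] placeholder instead of source, so this works best on code written in Javascript itself.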

Look at lifetimes

The lifetime of variables is particularly good for figuring out how something is built (and how well it’s built). Look for who or what initializes a variable or property of an object. Look for when it changes, and how that relates to the flow or scope of the process that does it. Look for who changes it, and see how related they are to the part you’re reading.

Look to see if that information is also somewhere else in the system at the same time. If it is, look to see if it can ever be inconsistent, where two parts of the system disagree on what that value is if you were to compare them.

Somewhere, someone typed the value you see into a keyboard, generated it from a random number generator, or computed it and saved it.

Somewhere else, some time else, that value will affect some human or humans. Who are these people?

What or who chooses who they are? Is that value ever going to change? Who changes it?

Maybe it’s a ‘name’ field typed into a form, then saved in a database, then displayed to the user. Stored for a long time, and it’s a value that can be inconsistent with other state – the user can change their name, or use a different one in a new context.

Look for hidden state machines

Sometimes boolean variables get used together as a decomposed state machine.

Maybe there’s a process with variables like this:

var isReadied = false;
var isFinished = false;

The variables isReadied and isFinished might show a state machine like so:

START -> READY -> FINISHED

If you were to lay out how those variables relate to the state of the process, you might find this:

isReadied | isFinished | state
----------|------------|------------
false     | false      | START
false     | true       | invalid
true      | false      | READY
true      | true       | FINISHED

Note that they can also express the state !isReadied && isFinished – which might be an interesting source of bugs, if something can end up at the finished state without first being ready.
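One way to see the hidden machine is to imagine it written explicitly. In this sketch (Job and transitions are invented names) a single variable holds the state, so the invalid combination has no way to be represented.

```javascript
// One variable holds the state, and only listed transitions are allowed.
var transitions = {
    START: ['READY'],
    READY: ['FINISHED'],
    FINISHED: []
};

function Job() {
    this.state = 'START';
}

Job.prototype.moveTo = function (next) {
    if (transitions[this.state].indexOf(next) === -1) {
        throw new Error('cannot go from ' + this.state + ' to ' + next);
    }
    this.state = next;
};

var job = new Job();
try {
    job.moveTo('FINISHED'); // skipping READY is rejected
} catch (err) {
    console.log(err.message); // cannot go from START to FINISHED
}
job.moveTo('READY');
job.moveTo('FINISHED');
console.log(job.state); // FINISHED
```

When reading code with the two booleans instead, the question becomes whether anything enforces the same transitions.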

Look for composition and inheritance

Is this made of parts I can recognize? Do those parts have names?

Look for common operations

map, transforming a list of values into a different list of values.

reduce, taking a list of values and giving a single value. Even joining an array of strings with commas to make a string is a ‘reduce’ operation.

cross-join, where two lists are compared, possibly pairwise, or some variation on that.
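Here is roughly what each of those looks like in plain Javascript, with made-up data, so the shapes are easier to spot when they’re buried in other code.

```javascript
var prices = [1, 2, 3];

// map: one list in, a transformed list of the same length out.
var doubled = prices.map(function (p) { return p * 2; });

// reduce: a list in, a single value out.
var total = prices.reduce(function (sum, p) { return sum + p; }, 0);

// Joining strings with commas is a reduce in disguise.
var joined = ['a', 'b', 'c'].reduce(function (str, s, i) {
    return i === 0 ? s : str + ',' + s;
}, '');

// cross-join: every item of one list paired with every item of another.
var sizes = ['S', 'M'];
var colors = ['red', 'blue'];
var pairs = [];
sizes.forEach(function (size) {
    colors.forEach(function (color) {
        pairs.push(size + '-' + color);
    });
});

console.log(doubled); // [ 2, 4, 6 ]
console.log(total);   // 6
console.log(joined);  // a,b,c
console.log(pairs);   // [ 'S-red', 'S-blue', 'M-red', 'M-blue' ]
```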

It’s time to go read some programs and libraries.

Enjoy!

Off to New Zealand

As I write this, I’m on the airplane from Boston to San Francisco, for the first of a three-leg trip to Christchurch, New Zealand for a Oneshot Nodeconf. I’m giving a talk on How To Read Source Code, which I’ve been meaning to make into a blog post for a long time, and now I’ve got my slides as a source of material to edit. I’ll probably start doing that on the trip home, after my tweaks and changes settle down.

I really like the Oneshot Nodeconf format: there is only one talk going on at a time, so there’s no competition for whose talk to go to. They’re usually a bit more curated than grab bag events, though they usually have a pretty diverse set of talks. I think knowing that everyone will be listening to their talk makes speakers put a little extra effort into being engaging.

Out the window are the Mondrian patterns of the California Central Valley, all square fields and section roads. Fifteen minutes to go!

Why the quality of teaching programming is so bad

A friend asked why her R statistics programming course on a MOOC was so terrible. She said 90% of the information on the quizzes was in the lecture, but the other 10%? Left for you to discover on your own.

Welcome to the problems I am struggling with. I am now a programming teacher, in most ways that matter. My newest job is about half research and half teaching. What you’re finding is completely the norm, and in fact I’d say 90% is pretty good. Sad facts.

The terrifying status quo is that we have a sixty year old field, one that started as self-teaching during the early years. Some very smart mathematicians and electrical engineers ended up figuring out how it can and should work, but the early perception was that designing programs, conceiving of the math to represent them, was the hard part, and that actually programming the math into the computer was a technician’s job. (Notably, programmers were usually women. Program designers and system architects were usually men. This turns out to be relevant.)

As the field started to grow, programming started to be recognized as requiring the bulk of the problem solving skills, since efficiently encoding mathematics, where a symbol might mean “with all values from zero to infinity” into a computer with only thousands of words of memory took clever reworking of problems. The early work was largely uncredited, mere “entering the program into the computer”.

In the late 70s and into the 80s there was a land grab for the prestige of being a programmer. A new labor category of “software engineer” was created, a professional engineering job, not the mere technician of being a programmer. Women were excluded from programming, sometimes deliberately by male programmers, sometimes as a matter of practice by engineering schools.

With this shift, programming stopped being a field of on-the-job training, which expected no familiarity to begin with, plus a few established training programs. Engineering school replaced it, on the assumption that the discipline is a field of either mathematics or electrical engineering, and programming courses became upper division electives for engineers working largely in theory. All of this is counter to the people (particularly Margaret Hamilton) who started trying to make software engineering a discipline, but the gestalt of the industry has definitely gone away from teaching being valued.

The net effect of that shift is that the pedagogy of teaching programming was interrupted.

A few training programs remained, but usually tied to industry, and particular companies. The industry balkanized significantly in this period, so IBM would teach IBM programming, and Oracle would teach Oracle programming. The abstract skills of programming are highly portable between languages and fields, but at the raw syntax of a given programming language, the details matter.

Now, another relevant thing is that computers have been sustaining a tremendous pace of development for these sixty years. With the computation of a chip roughly doubling every 18 months, there have been significant periods where practices would be introduced, thrown away and replaced much faster than the cycle of getting a student through college and into an adjunct professor’s seat. By the time they’re in a position to assist a teacher or to teach in their own right, what they were taught as an entry level student is no longer used, or is wrong in some way.

Both of these have caused most programming teaching to avoid specifics and to only teach the most abstract portions, the parts that will have a longer shelf-life than the details, and to avoid being entrenched in only one part of the industry.

Some schools are finally climbing their way out of this – MIT now teaches Java, an industrial rather than academic language instead of the prior Scheme language, and some European software shops are starting to use Haskell, which started as an academic language, so the crossover is finally happening, but it’s a slow process.

It’s all screwed up. Specifics of systems are needed to actually learn and build things, but the academic process is largely in abstract terms, and bridging that gap is difficult. On top of that, there’s the notion that some people are inherently good at programming, probably derived from similar thoughts about math, so there’s a certain impatience with explaining the details, and an arrogant derision for people who don’t know them.

So what’s someone to do?

At this moment, programming specifics are usually peer-taught, so working with people who’ve worked with the specific system and can advise about the syntax and specifics is important. Even in industry, this is recognized, if informally, by the practice of ‘pair programming’. Seek classes that get the details out, not just the theory. It will be a mixed bag, but there are good classes out there – just know that ‘good teaching’ of programming is not something systematically understood, and not universally valued.

Creating just online social spaces

The last two months have seen two Slack chats start to support marginalized groups in the technology field, LGBTQ* Technology and Women in Technology, and we’ve had a lot of discussions about how to run the spaces effectively, not just being a place for those who it says on the tin, but to support, encourage and not be terrible to people who are marginalized in other ways than the one the particular group is trying to represent.

This is a sort of how-to guide for creating a social Slack that is inclusive and just, and a lot of this will apply to other styles and mediums for interaction.

The problem begins thus: How do you keep a Slack started by a white gay cisgender man from reflecting only that as a core group? How do you keep a women in technology chat from being run entirely by white women of (relative) affluence afforded by tech industry positions, leaving women of color, trans women, people with disabilities out in the cold?

Making just social spaces is not a one time structural setup, though things like a good Code of Conduct are an important starting place, and there are difficult balances to strike.

Make sure there is sufficient representation. Social spaces grow from their seed members, and as it’s been studied, people’s social networks tend to be racially and genderwise insular; white members beget more white members; men bring more men, especially in technology as we’ve found. If a space is insufficiently representative of the diversity of experiences that should be there, people will leave, having seen yet another space that isn’t “for” them. So, too, power structures reflect the initial or core body of a social group, and a social group will tend to reflect the demographics of those in positions of power, creating a feedback cycle that will be hard to break without a lot of effort. Seed your network as broadly as you can, and put people without homogenous backgrounds in power.

Empower a broad group. A few admins can’t guide and create the shape of the space alone, so empower users to make positive change themselves.

Plan for timezones. If your chat starts off with US users, you will find that they will dominate the space during US waking hours. You may find an off-peak group in Europe, with an almost entirely separate culture. Bridging the gap with admins in other timezones to help consistently guide the shape of the group can be helpful.

Your users will have reactions to media posted. In particular, seizure disorders can be triggered by flashing animated GIFs. Building an awareness into your social space early can help make sure these are either not posted or are restricted to certain channels. Likewise, explicit imagery, upsetting news and articles can be marked or restricted, even without banning them entirely.

Plan for how to resolve conflicts. While outright malicious violation of a Code of Conduct can be solved by ejecting members, most cases of conflict are more nebulous, or not so extreme nor malicious that a first offense should involve removal from the space. Slack in particular has let the LGBTQ* Tech group practice a group form of conflict resolution. We created a #couldhavegonebetter channel. When a conversation strays off the rails, into being vindictive, into being oppressive on the part of a member of a relatively privileged group, or into evangelizing views that make others uncomfortable, a strategy that has worked well is to end the conversation with “That #couldhavegonebetter”, force-invite the users involved into the channel, and start with a careful breakdown of how the discussion turned problematic. This gives a place to discuss that isn’t occupying the main space; those who care about conflict resolution can join the channel. It’s not super private, but it’s the equivalent of taking someone aside in the hallway at a conference rather than calling them out in front of an auditorium full of their peers. De-escalation works wonderfully.

Keep meta-discussion from dominating all spaces. It’s a human tendency to navel-gaze, doubly so in a social space, where the intent of the members shapes the future of the space. That said, it can dominate discussion quickly, and so letting meta-discussion happen in channels separate from the thing it’s discussing can keep the original purpose of channels intact.

Allow the creation of exclusive spaces. Much of the time, especially socially, marginalized people need a place that isn’t dominated by, or doesn’t include, the group that talks over them most: people of color need to escape white people, trans people need to escape cisgender people, people outside the US need space to be away from American-centric culture and assumptions, and not-men need to be able to have space that is not dominated by men. It has ended up being the least problematic to allow the creation of spaces that are exclusive of the dominant group, just to give breathing room. It feels weird, but, as with a Slack focused on a marginalized group as a whole, sometimes breaking things down even further lets those at the intersection of multiple systems of oppression lighten the load a bit.

A chat system with a systemwide identity has different moderation needs than one that does not. A problem found on IRC is that channels are themselves the unit of social space allocation. There is no related space that is more or less intimate than the main group, and so conversations can’t be taken elsewhere, and channelization balkanizes the user group. With Slack, this is not true. Channels are cheap to create, and conversations can flow between channels thanks to hyperlinks.

Allow people to opt out generally, and in to uncomfortable or demanding situations. A great number of problems can be avoided by making it possible to opt out without major repercussions. Avoid lots of conversation in the must-be-present #general channel, however it’s been renamed (#announcements in one place, #meta in another). Default channels, auto-joined by new users, should be kept accessible. Work-topical channels should be kept not-explicit, non-violent spaces, so they are broadly accessible. Leave explicit imagery in its own channels, and let talk about the ills of the world be avoidable. And keep the volume low in places people can’t leave if they’ll be in the Slack during their workday.

Good luck, and happy Slacking!

Why MVC doesn't fit the web

A common set of questions that come up on IRC around node web services revolve around how to do MVC “right” using tools like express.

The short answer: Don’t.

A little history. “MVC” is an abbreviation for “Model, View, Controller”. It’s a particular way to break up the responsibilities of parts of a graphical user interface application. One of the prototypical examples is a CAD application: models are the objects being drawn, in the abstract: models of mechanical parts, architectural elevations, whatever the subject of the particular application and use is. The “Views” are windows, rendering a particular view of that object. There might be several views of a three-dimensional part from different angles while the user is working. What’s left is the controller, which is a central place to collect actions the user is performing: key input, the mouse clicks, commands entered.

The responsibility goes something like “controller updates model, model signals that it’s been updated, view re-renders”.

This leaves the model relatively unencumbered by the design of whatever system it’s being displayed on, and lets the part of the software revolving around the concepts the model involves stay relatively pure in that domain. Measurements of parts in millimeters, not pixels; cylinders and cogs, rather than lines and z-buffers for display.

The View stays unidirectional: it gets the signal to update, it reads the state from the model and displays the updated view.

Even the controller is pretty disciplined: it takes input and turns it into definite commands and updates to the models.
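To make the division of labor concrete, here is a minimal sketch of the GUI-style pattern in plain JavaScript. The class names (PartModel, PartView, PartController) and the "diameter N" command format are invented for illustration; no real toolkit is assumed.

```javascript
// Model: domain state in domain units (millimeters); knows nothing about display.
class PartModel {
  constructor(diameterMm) {
    this.diameterMm = diameterMm;
    this.listeners = [];
  }
  subscribe(listener) {
    this.listeners.push(listener);
  }
  setDiameter(mm) {
    this.diameterMm = mm;
    // Signal that the model changed; each view decides how to redraw.
    for (const listener of this.listeners) listener(this);
  }
}

// View: reads state from the model and renders it; never writes to the model.
class PartView {
  constructor(model, pixelsPerMm) {
    this.pixelsPerMm = pixelsPerMm;
    this.lastRender = null;
    model.subscribe((changed) => this.render(changed));
    this.render(model);
  }
  render(model) {
    this.lastRender = `circle, ${model.diameterMm * this.pixelsPerMm}px wide`;
  }
}

// Controller: turns raw user input into definite commands on the model.
class PartController {
  constructor(model) {
    this.model = model;
  }
  handleCommand(text) {
    const match = /^diameter (\d+(?:\.\d+)?)$/.exec(text.trim());
    if (match) this.model.setDiameter(Number(match[1]));
  }
}

const model = new PartModel(10);
const topView = new PartView(model, 2);    // two views of the same part
const zoomedView = new PartView(model, 8);
const controller = new PartController(model);

controller.handleCommand("diameter 25");
console.log(topView.lastRender);    // circle, 50px wide
console.log(zoomedView.lastRender); // circle, 200px wide
```

Note that both views update from one command, and the model never mentions pixels.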

Now if you’re wondering how this fits into a web server, you’re probably wondering the same thing I wondered for a long time. The pattern doesn’t fit.

On the web, we end up with a pipeline something like “Browser sends request to server, server picks a handler, handler reads request and does actions, result of those actions is presented to a template or presentation layer, which transforms it into something that can be sent, which goes out as a response to the browser.”

request -> handler -> presentation -> response

It often still makes sense to separate out the meat of the application from the specifics of how it’s being displayed and interfaced to the world, especially if the application manipulates objects that are distinctly separate. For example, with an accounts ledger it makes no sense to bind the web portions to the data model particularly tightly. That same ledger might be used to generate emails, to generate print-outs, and later to generate reports in a completely different system. The concept of a “model” or a “business domain logic” layer to an application makes sense:

request -> handler -> presentation -> response
             ^
             |
             v
       business logic

But some time in the mid-2000s, someone thought to try to shoehorn the MVC concept into this pipeline, and did so by renaming these components:

request -> controller -> model -> view -> response

And this is why we end up with relatively well-defined models, since that makes sense, and ‘views’ are a less-descriptive name for templating and presentation logic. What’s left ends up being called a ‘controller’ and we start a lot of arguments about whether a given bit of logic belongs there or in the model.

So in express, let’s refer to models and domain logic, to handlers and to templates. We’ll have an easier time of it.

Handlers accept web-shaped data, such as query strings and post data, and shape it into something the business logic can deal with. When the business logic emits something we should display, that same handler can pass it off to templates or, when the data will be rendered by a client in the browser, serialize it directly as JSON and send that off as the response. Let the business logic know little about the web, unless its concern is the web, as in a content management system. Let our handlers adapt the HTTP interface to the business logic, and the responses out to our presentation layer, even if that’s as simple as filling in values in a template.
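As a rough sketch of that split, here is the ledger example with one function per layer. The handler uses the (req, res) shape and the req.query, res.status, res.send and res.json calls that express provides; everything else (ledgerBalance, renderBalance, makeBalanceHandler, and the fakeResponse stand-in that lets this run without a server) is a hypothetical name made up for the illustration.

```javascript
// Business logic: knows about ledgers, nothing about HTTP.
function ledgerBalance(entries, account) {
  return entries
    .filter((entry) => entry.account === account)
    .reduce((sum, entry) => sum + entry.amountCents, 0);
}

// Presentation: turns a domain result into something that can be sent.
function renderBalance({ account, balanceCents }) {
  return `<p>${account}: ${(balanceCents / 100).toFixed(2)}</p>`;
}

// Handler: adapts web-shaped input to the business logic, then picks a presentation.
function makeBalanceHandler(entries) {
  return function balanceHandler(req, res) {
    const account = req.query.account;
    // Reject anything that is not a plain account name before it goes further.
    if (typeof account !== "string" || !/^[\w-]+$/.test(account)) {
      res.status(400).send("bad account");
      return;
    }
    const balanceCents = ledgerBalance(entries, account);
    if (req.query.format === "json") {
      res.json({ account, balanceCents });
    } else {
      res.send(renderBalance({ account, balanceCents }));
    }
  };
}

// Stand-in response object (an assumption for this demo, not express itself).
function fakeResponse() {
  const res = { statusCode: 200, body: null };
  res.status = (code) => { res.statusCode = code; return res; };
  res.send = (body) => { res.body = body; return res; };
  res.json = (data) => { res.body = JSON.stringify(data); return res; };
  return res;
}

const entries = [
  { account: "rent", amountCents: -120000 },
  { account: "salary", amountCents: 300000 },
  { account: "rent", amountCents: -120000 },
];
const handler = makeBalanceHandler(entries);

const htmlRes = fakeResponse();
handler({ query: { account: "rent" } }, htmlRes);
console.log(htmlRes.body); // <p>rent: -2400.00</p>

const jsonRes = fakeResponse();
handler({ query: { account: "salary", format: "json" } }, jsonRes);
console.log(jsonRes.body); // {"account":"salary","balanceCents":300000}
```

The same ledgerBalance could feed an email or a printed report without touching the handler, which is the point of keeping it ignorant of the web.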

We’ll all be a lot happier if MVC keeps its meaning as a paradigm for breaking up responsibility within a GUI.

Design Ethos

I just realized that my entire software design ethos is ‘power to the people’.

I started to argue over whether an interface (one that modifies some mutable object, however unfortunate it is) should no-op, throw an exception, or warn when it’s already been done once and runs again.

To no-op is to say “we know better than you and will do what we consider the Right Thing”.

To throw an exception is to say “we know better than you and will make you do what we consider the Right Thing”.

To warn the developer using the module is to say “we have more experience here, and say what we think … but your call. Go for it!”
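Here is a small sketch of the three choices side by side, using a made-up mutating operation (indenting a document object) as the interface in question. The makeIndenter name, the policy strings and the injectable warn function are all assumptions for the example.

```javascript
// Build an indent() function whose behavior on a repeated call is set by `policy`.
function makeIndenter(policy, warn = console.warn) {
  const alreadyIndented = new WeakSet();
  return function indent(doc) {
    if (alreadyIndented.has(doc)) {
      // "We know better than you and will do the Right Thing."
      if (policy === "noop") return doc;
      // "We know better than you and will make you do the Right Thing."
      if (policy === "throw") throw new Error("indent() already ran on this document");
      // Anything else: say what we think, then do what was asked.
      warn("indent() already ran on this document; running again as asked");
    }
    alreadyIndented.add(doc);
    doc.text = "  " + doc.text; // mutates the object, unfortunate as that is
    return doc;
  };
}

const warnings = [];
const indentQuietly = makeIndenter("noop");
const indentStrictly = makeIndenter("throw");
const indentTrusting = makeIndenter("warn", (message) => warnings.push(message));

const quiet = indentQuietly(indentQuietly({ text: "x" }));
console.log(JSON.stringify(quiet.text)); // "  x"

let thrown = null;
try {
  indentStrictly(indentStrictly({ text: "x" }));
} catch (error) {
  thrown = error;
}
console.log(thrown !== null); // true

const trusting = indentTrusting(indentTrusting({ text: "x" }));
console.log(JSON.stringify(trusting.text)); // "    x"
console.log(warnings.length); // 1
```

Only the warning version leaves the final decision with the developer calling the module.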

A social software toolbox

Rate limiting can be implemented as a way to deter high-cost actions, whether the cost is technical, like API calls, or social, like posting comments, where one or two are easy to keep up with but many can be a burden on the receiver. Well chosen, rate limits can be invisible to users who are not actively being malicious; poorly chosen, or bound to technical rather than social concerns, they can be arbitrary and frustrating.

Tarpitting is adding rate limits that are just not satisfiable to a malicious user, frustrating them into giving up.

Delay can be a mild form of rate limiting that makes users who are overwhelming the system or other people experience the system as slower and less pleasant to use.
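One common way to implement limits like these is a token bucket; the sketch below is one possible shape, not a description of any particular site. The TokenBucket name, its options and the injectable clock are assumptions for the example. The caller decides what to do with the wait time: refuse the action (a rate limit), sleep that long before performing it (delay), or configure a refill rate so low that the wait is never realistically satisfied (a tarpit).

```javascript
class TokenBucket {
  constructor({ capacity, refillPerSecond, now = Date.now }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now; // injectable clock, so the demo below is deterministic
    this.tokens = capacity;
    this.lastRefill = now();
  }

  refill() {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = current;
  }

  // Returns 0 if the action may happen now, otherwise milliseconds to wait.
  tryAct() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    return Math.ceil(((1 - this.tokens) / this.refillPerSecond) * 1000);
  }
}

// Two comments in a burst, then one more every two seconds.
let clockMs = 0;
const comments = new TokenBucket({
  capacity: 2,
  refillPerSecond: 0.5,
  now: () => clockMs,
});
console.log(comments.tryAct()); // 0
console.log(comments.tryAct()); // 0
console.log(comments.tryAct()); // 2000
clockMs += 2000;
console.log(comments.tryAct()); // 0
```

Choosing capacity and refill rate from social cost (how many comments can a person reasonably read?) rather than server load is what keeps the limit invisible to ordinary users.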

Blocking most often makes users invisible to each other. In the case of public postings, it usually means that one user can’t share the other’s postings or otherwise interact with them, though they can see posts.

Muting simply ignores an undesirable user’s posts.

It’s interesting to note that more marginalized people prefer to block, and less marginalized prefer muting. There are a lot of subtle dynamics in these interactions. Given a private backchannel that doesn’t respect blocking, blocking a user will cause a harasser to escalate privately.
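The difference in direction is easy to see in code. This sketch models the "invisible to each other" form of blocking and a one-sided mute; the visiblePosts function and the Map-of-Sets layout are assumptions for illustration, and real systems vary (especially for public posts).

```javascript
// True if `owner` has `target` in their block or mute list.
function listed(lists, owner, target) {
  const list = lists.get(owner);
  return list !== undefined && list.has(target);
}

function visiblePosts(posts, viewer, { blocks, mutes }) {
  return posts.filter((post) => {
    if (listed(mutes, viewer, post.author)) return false;  // one-sided: viewer stops seeing author
    if (listed(blocks, viewer, post.author)) return false; // viewer blocked author
    if (listed(blocks, post.author, viewer)) return false; // author blocked viewer: hidden both ways
    return true;
  });
}

const blocks = new Map([["alice", new Set(["mallory"])]]);
const mutes = new Map([["bob", new Set(["mallory"])]]);
const posts = [
  { author: "alice", text: "hi" },
  { author: "bob", text: "hello" },
  { author: "mallory", text: "well, actually" },
];
const authorsSeenBy = (viewer) =>
  visiblePosts(posts, viewer, { blocks, mutes }).map((post) => post.author);

console.log(authorsSeenBy("alice"));   // [ 'alice', 'bob' ]
console.log(authorsSeenBy("bob"));     // [ 'alice', 'bob' ]
console.log(authorsSeenBy("mallory")); // [ 'bob', 'mallory' ]
```

Alice and Bob see the same timeline, but only Alice's choice changes what mallory can see.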

Penalty box is a timed block, shadowban or teergrube (tarpit) that expires, giving users time to cool down. When under a user’s control, it can help separate bad actor blocking from merely not wanting to deal with someone at the current time.

Private backchannel can allow someone who wishes to connect a way to do so without being public, but can also allow a harasser to privately act poorly while maintaining public good standing. Direct messages are Twitter’s backchannel; replies to author only are a mailing list’s backchannel.

Privacy groups are the permission model of LiveJournal: posts can be restricted to a single privacy group (a list of users) and only viewed or shared within that group.

Friending is initiating a symmetrical relationship, complete only when confirmed by the other party.

Open follow is initiating a one-way relationship, usually expressing interest by the follower in the followee.

Approved follow is initiating a one-way relationship, as in open follow, but requiring the followee to approve the action, as in friending.

Private account is disabling public visibility of the posts in an account, usually making them vet followers as in approved follow.
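These four relationship patterns differ mainly in who has to consent and in which direction the resulting edge points. A minimal sketch, assuming a hypothetical Relationships class and user names that do not contain ">":

```javascript
class Relationships {
  constructor() {
    this.privateAccounts = new Set();
    this.edges = new Set();    // "from>to" means `from` sees posts by `to`
    this.requests = new Map(); // "from>to" -> "follow" or "friend", awaiting approval
  }

  // kind is "follow" or "friend"; returns "active" or "pending".
  request(kind, from, to) {
    const key = `${from}>${to}`;
    // Friending always needs confirmation; following needs it only for private accounts.
    const needsApproval = kind === "friend" || this.privateAccounts.has(to);
    if (!needsApproval) {
      this.edges.add(key); // open follow: takes effect immediately
      return "active";
    }
    this.requests.set(key, kind);
    return "pending";
  }

  approve(to, from) {
    const key = `${from}>${to}`;
    const kind = this.requests.get(key);
    if (kind === undefined) return false;
    this.requests.delete(key);
    this.edges.add(key);
    // Friending is symmetrical: confirming creates the reverse edge too.
    if (kind === "friend") this.edges.add(`${to}>${from}`);
    return true;
  }

  follows(from, to) {
    return this.edges.has(`${from}>${to}`);
  }
}

const graph = new Relationships();
graph.privateAccounts.add("bo");

console.log(graph.request("follow", "al", "cy")); // active
console.log(graph.request("follow", "al", "bo")); // pending
console.log(graph.follows("al", "bo"));           // false
graph.approve("bo", "al");
console.log(graph.follows("al", "bo"));           // true
console.log(graph.follows("bo", "al"));           // false

graph.request("friend", "cy", "dee");
graph.approve("dee", "cy");
console.log(graph.follows("cy", "dee") && graph.follows("dee", "cy")); // true
```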

Upvote/Downvote are a popular way to weed out chaff from a conversation, where offtopic, rude or poorly written comments are downvoted by a community, and popular, funny, or insightful comments are upvoted. It can be problematic when the culture of a community itself reinforces poor choices, and it’s subject to gaming via social campaigns.

Reflection is the act of restating a comment when replying to it. Requiring a commenter to first restate and reflect what the original poster said before posting their reply is an interesting way to try to suppress flame wars of misunderstanding, and also increase the expense of malicious comments. I know of no system that has ever implemented this, but it was proposed by @RebeccaDotOrg and I think it’s a fantastic idea for debate where actual exploration or consensus on a hot issue is interesting.

Shadowbanning is redirecting a malicious user to a dummy version of the site to interact with, where their actions will never be seen by real human beings. Often combined with tarpitting or rate limiting.

Sentiment analysis is a way to automatically try to ascertain whether a comment is positive or negative, or whether it’s inflammatory, and whether to trigger some of the other countermeasures.

Subtweet is commenting in a chronologically related but not directly connected conversation. A side commentary, usually among a sub- or in-group.

Trackback is automated notification to an original post or hosting service when a reply or mention is generated on another site.

Flat commenting is the form typically used by forum software, where posts are chronological or reverse chronological below a topic post.

Threaded commenting is used in some environments like Reddit, Metafilter, LiveJournal and some email clients, where each message is shown attached to the one it replies to, giving subtrees that often form entirely different topics.

Weakly threaded commenting shows threading only for conversation entries from followers. Often implemented client-side, given an incomplete reply graph.

Real identity can cause some commenters to behave, particularly in contexts associated with their job.

Pseudonymous identity can give stability to conversations over time, showing that the same actors are present in conversations. If it’s easy to create more identities, this can yield sockpuppeting.

Anonymous identity can create a culture of open debate where identity politics are less prominent, but can let some people play their own devil’s advocate and can launch completely unaccountable attacks.

Cryptographic identities are interesting in that there is no central authority, and they often cannot be revoked (there’s no way to ban an identity systemically without cooperation). Cryptographic names are often not human-memorable, thanks to the constraints of Zooko’s Triangle. It’s possible to work around, but the systems for doing so are cumbersome in their own right.

Invites are often used to make sure that the social group grows from a known seed; because social networks are often strictly divided by race and gender, this often serves to make the group homogeneous over certain traits, despite not having selected for these traits specifically. It can also rate-limit the growth of any one group, given enough seeding of minority or otherwise oppressed groups to let a more diverse pattern form, if seeding is chosen carefully.

Invite trees are a pattern where each user can invite some other users, but is in some way ‘responsible’ for their behavior, which limits the possibility that invites are sold openly, and can in some cases keep out certain surveilling users.
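The data structure behind an invite tree can be very small: each member records who invited them. In this sketch the InviteTree class and its method names are invented for the example; what "responsible" means in practice (a warning, lost invite privileges, removal) is left to the community.

```javascript
class InviteTree {
  constructor(seedUsers) {
    // Seed members have no inviter.
    this.invitedBy = new Map(seedUsers.map((user) => [user, null]));
  }

  invite(inviter, invitee) {
    if (!this.invitedBy.has(inviter)) throw new Error(`${inviter} is not a member`);
    if (this.invitedBy.has(invitee)) throw new Error(`${invitee} is already a member`);
    this.invitedBy.set(invitee, inviter);
  }

  // Everyone who vouched for `user`, nearest first, back to a seed member.
  sponsors(user) {
    const chain = [];
    let sponsor = this.invitedBy.get(user);
    while (sponsor != null) {
      chain.push(sponsor);
      sponsor = this.invitedBy.get(sponsor);
    }
    return chain;
  }

  // Everyone who joined through `user`, directly or indirectly.
  descendants(user) {
    const found = [];
    for (const member of this.invitedBy.keys()) {
      if (this.sponsors(member).includes(user)) found.push(member);
    }
    return found;
  }
}

const tree = new InviteTree(["ana"]);
tree.invite("ana", "ben");
tree.invite("ben", "cal");
tree.invite("ana", "dee");

console.log(tree.sponsors("cal"));    // [ 'ben', 'ana' ]
console.log(tree.descendants("ben")); // [ 'cal' ]
console.log(tree.descendants("ana")); // [ 'ben', 'cal', 'dee' ]
```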

I’m sure there are a great number of patterns I’ve missed, but cataloguing these and calling out the differences may help make us more aware of the tools we have at our disposal in creating social networks.

Why is it so hard to evolve a programming language?

Parsers.

We use weak parsing algorithms – often hand-written left-leaning recursive descent parsers. Sometimes PEGs. Usually with a lexing layer that treats keywords specially, annotating them as a particular part of speech as a function not of the grammar, but of the words themselves.

This makes writing a parser easy, particularly for those hand-written parsers. Keywords are also a major reason we can’t evolve languages: adding new words breaks old programs that were already using them.

The alternative is to push identification of keywords into the grammar, and out of the lexer. This means that the part of speech for a word can be determined by where it’s used. This allows some weird language, but it keeps things working well. Imagine JavaScript letting you have var var =. It’s not ambiguous, since a keyword can’t appear in the position where a variable name is expected. Whether the first var is a keyword or a variable name can’t be known without some lookahead, though: var = would be a variable name and var foo would be a keyword.
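A toy parser shows how little lookahead this takes. The sketch below handles only two statement forms, a declaration and an assignment of a number, for an imaginary JavaScript-like language; it is not how any real JavaScript engine parses. The lexer emits plain words, and the parser decides from position whether a leading var is a keyword.

```javascript
// Lexer: no keyword table; "var" is just another word at this stage.
function tokenize(source) {
  return source.match(/[A-Za-z_]\w*|\d+|[=;]/g) || [];
}

const isName = (token) =>
  typeof token === "string" && /^[A-Za-z_]\w*$/.test(token);

// Parses `= NUMBER ;` starting at tokens[start] and returns the number.
function parseInitializer(tokens, start) {
  const [equals, value, semicolon] = tokens.slice(start);
  const wellFormed =
    equals === "=" &&
    /^\d+$/.test(value || "") &&
    semicolon === ";" &&
    tokens.length === start + 3;
  if (!wellFormed) throw new SyntaxError(`cannot parse: ${tokens.join(" ")}`);
  return Number(value);
}

function parseStatement(source) {
  const tokens = tokenize(source);
  // One token of lookahead: "var" followed by a name is the keyword.
  if (tokens[0] === "var" && isName(tokens[1])) {
    return { type: "declare", name: tokens[1], value: parseInitializer(tokens, 2) };
  }
  // Otherwise a leading word, "var" included, is a variable name.
  if (isName(tokens[0])) {
    return { type: "assign", name: tokens[0], value: parseInitializer(tokens, 1) };
  }
  throw new SyntaxError(`cannot parse: ${source}`);
}

console.log(parseStatement("var foo = 3;")); // { type: 'declare', name: 'foo', value: 3 }
console.log(parseStatement("var = 2;"));     // { type: 'assign', name: 'var', value: 2 }
console.log(parseStatement("var var = 1;")); // { type: 'declare', name: 'var', value: 1 }
```

Because the lexer never reserved the word, a program that already used var as a variable name keeps working after the keyword is added.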

This usually means using better parsers. Hand-written parsers could maintain a buffer of a couple of tokens, allowing an unshift or two to put tokens back when a phrase doesn’t match; generated parsers can do better and use GLR, and a fully dynamic parser working off of the grammar as a data structure can use Earley’s algorithm.

Position-dependent keywords are problematic for PEGs, though. Once an alternative has matched, a PEG won’t backtrack into the choice to figure out which interpretation is correct. Once a PEG has chosen a part of speech for a word, it sticks. That’s the rationale behind its ordered choice operator: one alternative must have clear precedence. It’s in essence an implicit way to mark which part of speech something is in a grammar.
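The stickiness of ordered choice can be shown with three tiny combinators. This is a sketch of PEG-style matching written for this post, not a real PEG library: each parser takes an input and a position and returns the new position, or -1 on failure.

```javascript
// Match a literal string at the current position.
const lit = (text) => (input, pos) =>
  input.startsWith(text, pos) ? pos + text.length : -1;

// Match each parser in turn; fail if any of them fails.
const seq = (...parsers) => (input, pos) => {
  for (const parser of parsers) {
    pos = parser(input, pos);
    if (pos < 0) return -1;
  }
  return pos;
};

// Ordered choice: the first alternative that matches wins, and the
// later ones are never revisited, even if the rest of the input then fails.
const choice = (...parsers) => (input, pos) => {
  for (const parser of parsers) {
    const next = parser(input, pos);
    if (next >= 0) return next;
  }
  return -1;
};

const matches = (parser, input) => parser(input, 0) === input.length;

// ("a" / "ab") "c"  versus  ("ab" / "a") "c"
const shortFirst = seq(choice(lit("a"), lit("ab")), lit("c"));
const longFirst = seq(choice(lit("ab"), lit("a")), lit("c"));

console.log(matches(shortFirst, "abc")); // false
console.log(matches(longFirst, "abc"));  // true
console.log(matches(longFirst, "ac"));   // true
```

In shortFirst, "a" matches and the choice commits; when "c" then fails against "b", nothing goes back to try "ab". A context-free grammar with the same alternatives would accept "abc" either way.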

Backward-incompatible changes

It’s always tempting to get a ‘clean break’ on a language; misfeatures build up as we evolve it. This is the biggest disservice we can do our users: a clean break breaks every program they have ever written. It’s a new language, and you’re starting fresh.

Ways forward

Pragmas. "use strict" is the one JavaScript has. They’re ugly and they don’t scale that well, so they have to be kept to a minimum. Version selection is a set of mutually exclusive pragmas. This is what Netscape and Mozilla did to opt in to new features: <script language='javascript1.8'>. The downside here is that versioning is coarse, and doesn’t let you mix and match features. Scoping "use strict" to the function in ES5 was smart, in that it allows us to use the lexical scope as a place where the language changes too.

The complexity with "use strict" is that it changes things more than lexically: Functions declared in strict mode behave differently, and if you’re clever, you can observe this from the outside, as a caller, and that’s a problem for backward compatibility.
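One such observation, as a small sketch: a caller can tell whether a function was compiled in strict mode by how it treats the this value it is handed. The functions are built with the Function constructor here only so that each one's strictness depends on its own body, whatever mode the surrounding file happens to be in.

```javascript
const sloppy = new Function("return this;");
const strict = new Function('"use strict"; return this;');

// Sloppy functions box a primitive `this` into an object; strict ones pass it through.
console.log(typeof sloppy.call(5)); // object
console.log(typeof strict.call(5)); // number

// Called with no `this` at all, sloppy functions get the global object; strict ones get undefined.
console.log(sloppy() === globalThis); // true
console.log(strict() === undefined);  // true
```

So switching a library function to strict mode is a change its callers can detect, even though the pragma looks purely lexical.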

Support multiple sub-languages. With a parser that can support combining grammars (Earley’s algorithm and combinator parsers for pure LL languages in particular are good at this, though PEGs are not), someone could elect a different language within a region of the program. Language features can be left as orthogonal layers. How one would express that intent is unexplored, though. Too few people use the tools that would allow this.

Versions may really be the best path forward. Modular software can be composed out of multiple files, each of which could select its own version. With JavaScript in the browser in particular, though, we’ll have to devise other methods; transport of unparsed script is already complex.

We should separate the parser from the semantics of a language: let there be one, two, even ten versions of the syntax available, and push down to a more easily versioned (or not versioned at all) semantic layer. This is where Python fell down without needing to. The old cruft could have been maintained and reformed in terms of the new concepts from Python 3.