An AI Haters Guide to Code with LLMs (Philosophy & Personal Politics)

This is going to be a little bit oblique because I view the world somewhat differently than many people: I take an ecological view of the world, in systems and connections, not just rules. Feedback loops matter so very much, not just statistics. Statistics are a snapshot. When something disrupts an ecosystem, everything moves around. Some things die out. New things take over - sometimes things that were there before but held back by something that’s now gone. Everything changes.

Everything is changing right now.

I believe that purist approaches to mitigating things are rarely useful. They are purely trying to return to a past that has already gone, or never existed in the first place. Our world is an ecology of ideas and actions and interconnected systems, and they can’t be spun backward to get to some more pure, earlier state. The real world is non-linear, full of feedback loops and pitfalls. Applying the brake as hard as you can won’t stop you when there’s a heavy engine and a million hands pushing you toward a cliff. You have to steer. And if things are really bad, you have to choose where to crash, because going off the cliff is the worst option.

We have to ask ourselves - and this will be ongoing - what regulates what’s going on?

Laws matter, but laws are only as powerful as our collective belief in their working. It would be nice to live in a world of laws again, assuming the laws are vaguely just.

Just as much, the AI boosters and techno-optimists are wrong: progress is written in blood, usually that of the outclasses. There is no flywheel of progress that leads to unambiguously better outcomes for everyone. It looks like it always has because there are 8.3 billion of us, largely trying to steer our minute little bits of the world toward something better. Sometimes we even have success. It’s rather easy to look back and say “Well that was easy! We were always headed to where we ended up.”

When the year 2000 was looming, a low-key panic set in among certain groups of people: those distrustful of computer technology and its wielders, and infrastructure planners who understood just how fragile things can be, and how every sufficiently complex system rapidly heads for uncharted territory and untried configurations. A princely sum was doled out to consulting firm after consulting firm to pore over often now-obscure legacy systems and patch in hacks to deal with the move from two digits to four, or from 99 to 00. Any given fix was pretty simple, but the sheer sum of embedded assumptions about dates and times made the whole effort quite an undertaking. A lot of code was fixed. Some of it was, to be certain, critical. The work was done, largely, and when the clock rolled over, a few people breathed a sigh of relief, a few more added another bit of justification to their distrust of computing technology, and the rest of the world, if they noticed at all, noticed the power was still on and that the party could continue. The millennium (or at least its penultimate year) came to a close with little fanfare, the efforts rewarded only quietly, and like all avoided catastrophes, with some complaints about the size of the bill because nothing happened and it really wasn’t such a big deal after all, was it? The retrospective view makes it so easy to ignore the actual toil that it takes to make the world work.

Of Centaurs and Centipedes

There is a term from automated chess play: the “human centaur”. It refers to a person in control of the game, using computer assistance to play. The term was imported into automation theory, where it means a system in which a human holds authority over and manages underlying automated processes. Cory Doctorow helped bring it from automation theory into discussions of AI systems, and in particular calls out the reverse arrangement: humans as subjects of an inhuman and inhumane system that needs our fragile meat-fingers to do something in the real world for it.

I think in the cases where it’s not about our physicality but our attention, there’s a very real risk of being hooked up to the metaphorical slop in an arrangement that resembles a partially-automated human centipede. A dehumanizing flood of low-value knowledge work. We know this happens because it already happens. Amazon’s “Mechanical Turk” has been doing this for ages. The “AI” boom is enabled by the collective trauma of Kenyan content moderators and image labelers. Almost all of this work is gig work. Almost all of it precarious. Almost all of it done by outclasses distinguished by skin color and location, accidents of their situation, and the remnants of the last century’s colonial force leaving a hard-to-change pattern in our world.

Charlie Stross said “Corporations are just slow AI”, and I largely stand by this: corporations are systems built to resist the will of their components (people!), full of controls and corrective systems that produce behavior at superhuman scale, behavior that is often inhuman and inhumane. A person can’t move a million barrels of oil per day around the world, but a corporation can. It is capable of intelligence tasks a single person is not. It is a systemic harness for teamwork to achieve business goals, often despite what the members of those teams actually want. We’ve been reverse-centaured before, to the point of being ground to fine paste, and we will be again. This is certain. Our job is to mitigate that the best we can. We’re all doing our best, but it’s the coherent action of all of us that steers this ship.

The Dictation of Economic Forces

The economy, whether local, national or global, is a system like any other: in its normal operation, some parts are broken and most parts are working and working dominates the system. The broken stuff gets fixed mostly. It’s not perfectly efficient but it works.

Like any system, it can be perturbed into degenerate behavior. Wars can do this, bubbles can do this, exhausting resources can do this.

It’s really hard to tell this apart from a technological shift while it’s happening.

A thing you learn early in systems theory is that feedback loops dominate behavior. The orphan crushing machine is made of economic feedback loops, and turning it off requires political power sufficient to alter the system, or to dismantle its feedback loops into something better. Usually the latter is what happens, because building political consensus is very hard.

You can’t un-create technology. Genies won’t go back in the bottle.

All this is to say that resisting AI by pulling the reins in a general “AI is bad” way will in no way stop it, won’t even slow it down. This is not a meaningful way to change anything. Jumping in front of the orphan-grinding machine only adds a little adult human to the slurry.

Anything that makes us more productive is a competitive advantage, and in a system that allows winners to take all, we can follow suit or not. The feedback loops are swift and often decisive: adopt or die. While productivity is not a single dimension but complex, it is very easy to end up out-competed on a meaningful axis and lose in the marketplace. Would that we could restrain this system - we have in the past, but today our protections are outdated, weak, or absent. Strong antitrust law and imposing costs on players that get too big would do our world a heap of good. That is not, however, the world that many of us live in. Regulation and governance systems have not caught up to the invention of the multinational corporation nor the well-funded agile disruptor. We must plan accordingly and work to build and rebuild those capabilities. We have to talk openly about all of this, too, because factionalizing yields a powerless group with all the awareness of the harms, and a powerful group positioned so that the winner can take all. If we work together we might just improve it. It won’t be easy.

That is not, however, to say that we are powerless.

The Crossroads

We stand at a crossroads. (We always stand at a crossroads.)

We can get in the driver’s seat and steer or we can ride in the back.

To go back to Marshall Brain’s story Manna - Two Views of Humanity’s Future: it lays out the choice that is before us, the choice that is always before us. At every moment we are choosing between these things.

Will this technology put us in the centipedal position of consumers, subject to the whims of fascists at worst and technocrats who can only look at us as demographic statistics at best? Or will we take the reins and make sure that we are in a position to choose better?

In software work, right now, the sea-change has come.

Despite what widely cited but rarely read studies have said, productivity in many senses has gone up. This is not a linear process where making something more capable or efficient makes everything unambiguously better. New strains in all of our workplace systems have arrived with it. This is a moment of immense disruption.

Yes, the rush of playing with new technology has made many overestimate their real productivity gains. Yes, it’s actually hard to measure software teams’ outcomes. We do, however, have some figures coming out of the noise. It’s hard to argue with a startup generating a working, capable prototype exercising their core idea every week to find product-market fit, and this is an everyday normal, not an outlier.

It’s hard to argue with people suddenly able to understand and summarize the problems in a large and complex legacy codebase in minutes. It’s hard to argue with consultants able to much more reliably generate estimates grounded in real facts for their work. In all the places it is possible to measure a definite outcome, real gains have happened.

This is not to say even remotely that this is unambiguously good. Pursuing bad goals faster is in fact not a good thing.

We need to provide the goals.

Is the world one where we’re building systems to support abstract management goals, with no regard for the people in the system? Is it one where “line go up” is the only place value gets captured?

Or is this a world where accessibility for websites is not just a thing we constantly have to remind people about, but a thing that happens nearly automatically, because it can be embedded in every prompt for the creation of software? Can we capture as much of this energy to benefit all of us as we can?

Competitive effects are still in play, but the cost of getting beyond serving the same 80% of people over and over has gone down. We can ask “what about X?” at a bunch of new places in the process.

The cost of reworking a plan to synthesize new ideas has gone down, dramatically. And if we’re the ones driving, we get to embed this, deeply.

Of course, the clay keeps growing.

Structures, high and wide

I’ve long had a weird relationship with the free software movement, because I’ve long believed it got caught up in its copyright systems hackery and stopped seeing the forest for the trees. There’s a reason the movement was coopted into free inputs for corporate development, mainstream foundations for hierarchy rather than liberatory structures.

I’ve always released most of my work under very permissive licenses, not because I agree with the “open source” movement’s sanitized goals, replacing protected freedoms with “anything allowed” openness, but because I think that the way to make liberatory software is structural, rather than tactical. Making something strong copyleft like using the GPL is a tactical decision: It’s to keep the software out of certain hands, usually, and even earlier, it was more liberatory, to keep control from being centralized. The ability to shift the locus of control of the software has always been a two-edged sword. Computing has changed shape significantly since the days of mainframes and recalcitrant printers. Not that it doesn’t rhyme, but the ways we locate control are different.

On the flip side, indie web experiments and some of the more fringe of the peer to peer software world have made strides toward collective ownership and development of important systems without getting coopted, not by copyright tactics but by structuring the software to communicate in collective-oriented ways. Nobody wants to steal your indieweb app for corporate uses because it is not built to support corporate goals, and never will be.

And the synthesis shows up in software like Tailscale, which to be clear is owned by a corporation (though as far as I can tell, one that’s behaving pretty nicely), and is friendly to open components that repurpose their work for novel uses. In addition, it’s spawned a massive growth of home-scale technologists building peer-to-peer home labs, software at personal scale that can still interoperate with the larger Internet, can participate in the world at large. Not all liberation is rebellion.

What we build matters: every thing that exists, supporting and supported by other things in a system, starts forming the infrastructure for what comes next. The shape of it is remarkably persistent, because to change it you have to either let what depends on it fall with it, or account for it. There is a reason why Roman roads are now often motorways: it was far easier to build in place than to build something of an entirely different shape. What we build now, and keep building, and make durable and connected, will last.

The Soup

When we start working with software for LLM-based systems, we find that the ecosystem is a clash and synthesis of a bunch of disparate and often uncomfortable groups. Cryptocurrency anarcho-capitalists, intent on making the world into petty fiefdoms of early-bird-gets-the-rents structures, have reset their sights on some parts of the AI world. Their automated systems inflict code-enforced security regimes and abuses on the world that reflect the shape of their beliefs: that if you are suffering in the system, it’s your own fault for not maximizing your gains and abusing the system like they do.

Fascists are absolutely here, vying for the power that manipulating media gives them and funnelling as much as they can into the misinformation machine, using the asymmetry of information warfare to distribute bullshit faster than democratic systems of verification, deliberation and accountability can correct it. They will be present in the glowing images and tasteless design of so much.

Technology is an amplifier for human ability. This is not a positive statement, nor negative, but a simple fact. This can be liberatory or this can serve a narrative of an increasingly resentful and demanding fascist populism. Fascism has always sought to capture technology and its processes and turn them to war and oppression.

Maybe most of all you will find technocrats. There is a deep and horrifying confusion of metrics for measures pervasive in the space: “evals” and “benchmarks” giving numeric quantities to qualities, stripping away context and only looking at numbers. They are to be resisted. Not to say all that work is useless, numbers usually indicate something. But the monomaniacal focus on numbers yields a complete ignorance of what cannot be easily measured. So too is talk of “alignment”: as if you can just tune a model on enough dimensions to be “correct”, as if alignment isn’t alignment with the ever-shifting and self-contradictory nature of human goals. I do not trust anyone who takes the “alignment” framing for “AI safety” seriously without critique.

The impact on work is where things are most immediately felt. In the Anglosphere, we have the pernicious, grim-hearted Calvinist views embedded in a lot of our culture: Any joy in relaxation is to be harnessed back into work until we are grimly productive. Idle hands are thought sinful, and this too is present in the moment.

There may be an economic pothole when one of the American AI vendors collapses. The financing of the big two, particularly OpenAI, is very strange. The deals are circular, there is certainly some part of all of this that is a bubble, and what gets caught in the blast when the music stops is anyone’s guess. However, there is no going back to a world before the invention of the LLM. On hardware that an average programmer might have in their house, it is now possible to run a useful model locally, if painfully slowly. It’s not going to change lives in that form, but the very normal processes of technological improvement will carry it forward until it is permanently true. You can run this in your living room, if a little fan noise doesn’t bother you.

A bright thing in the soup is the theory of constraints. Creating source code in particular is much, much cheaper than it ever was. This does not mean software development necessarily goes faster: it means we find the next bottleneck.

That bottleneck is now human understanding and goal-deciding. That’s where we, humanity, come in. The systems cannot in fact take off without us entirely. Only if we refuse to steer will it careen over the cliff.

Where do we go from here?

We can do better than nothing at all. We can steer the chaotic change. Mandates to “Use AI” are everywhere in the corporate world right now. Where programmers have often but not universally adopted tools enthusiastically, not many domains are as well suited. The further from the core of easily-checked work of programming and mathematics and into the messy world of relationships and context and embedded and embodied process, the worse “AI” tools perform by default. Mandates rarely capture true value, because they are anathema to the care required to adopt these tools well. This means there is waste we can hide good things in.

We can entreat our bosses to remember that glue work is not easily replaced, and that we have no excuse not to do the important work. We can organize, and we can choose where to spend tokens, and often choose which models to use them on.

If you’re mandated to use AI tools at work, which is better? Burning a million tokens on Claude Opus 4.7, creating a deluge of source code to inflict on your coworkers that they cannot meaningfully review and only stamp “Looks good to me” or reject? Or is it better to pick a small model and a small but important task that wasn’t going to get done anyway and do it, cheaply and quietly?

We can use LLMs to mitigate other social harms. Not all of them, of course, but we can direct where some of this attention goes.

“I don’t have time for accessibility” is not true anymore. “I don’t have time to write tests” is not true. “I don’t have time to write documentation” is not just untrue: your job now is to mold the documentation into something actually useful, carving out and humanizing rather than adding to. “I don’t have time to consider the impact on any of the disadvantaged groups I know about” is also now patently untrue: it’s five words in a prompt and then dealing with what you learn. It won’t always be right, and there is bias out there, but the systemic excuse of there not being time is no longer true.

We can refuse to use LLMs to communicate with each other. Every email, every word of this essay, every text message I send is hand-created, and I urge you to pledge the same. I do not use image generators; I want no part in generated videos. They must be soundly rejected. Where software has some artistry at times, usually in abstract ways, the object we create is first and foremost understanding. It is natural to use models of the world in making models. Art, however, is communication, and communication is humanity. When we delegate the very things that let us relate, that is hostile to the very nature of being human.

I think there is one important place where we can, and in fact should, use these tools, with care and attention and with honesty about the sharp edges and possible harms. We can now, very simply, communicate with people who do not share a language with us. Imperfectly. Mediated by a machine that does not grasp the nuances of language and our context. However, machine translation enables an entirely new kind and scale of human connection across language boundaries. A particularly clear example: during the first TikTok ban in the US, a bunch of Americans fled to 小红书 (Xiaohongshu) for their TikTok-shaped video fix, and were greeted first by a few Chinese speakers who also spoke English, but then by a hastily but honestly impressively created bidirectional machine translation system that enabled a truly fascinating and wonderful exchange between two cultures usually isolated not just by the Great Firewall but by a language barrier. That system is still in place, and you can try it yourself. We can learn a lot from this. I may go into detail about the pitfalls and wonders and things this can enable, but not in this already very long essay.

We can refuse to generate the cute image for a slide at work and remember and learn how to be funny or droll or apt or illustrative on our own. We can hire artists to make us an image. We can share our as-yet amateur attempts. We can hire others to do it too, when it matters.

It is possible to create artistic meaning using AI tools. However, the path from there to delegating our humanity is painfully short. I am quite comfortable rounding this to a simple idea: Do not. We must protect our ability both to make and to sustain art, and generated images and communication are perhaps an even larger threat than corporate contentization and eternal copyright ever were. We should not live to work, but when we work to live, we must have something to live for. This is it.

We must become acutely sensitive to anything that interposes itself in our human drive to communicate and connect with each other. This is not just about “AI”, though that’s particularly salient now; all the interposing between us that companies do should be soundly rejected. Personalized feeds, algorithmic curation. I even loathe recommender algorithms, and suggest you do too. I find no joy in a computer telling me I may also like something, given my taste. But from people who see me for who I am? That’s gold.

In the end, we have the choice to make. Every time someone suggests something to deskill, destroy knowledge, or control communication, we should use the tools at hand to bend that to our collective benefit instead. Every time we use one of these tools, we must ask ourselves what human relationships it is replacing, and when the answer is uncomfortable, we should seek to strengthen those relationships instead. We are now more than ever needing to find collective sense-making, which means a new openness and vulnerability in talking to each other is required. This will not be easy.

Let’s get to work.

An AI haters guide to code with LLMs (The How-to)

This is the part of the post I started to write in part 1 but got side-tracked with the context-setting.

I have a terminal-centered workflow and probably always will. At the end of the day we’re working with text and there are a zillion good tools for text there. I cut my teeth on SunOS and it shows.

Bottom line, up front

This is just one way to do it.

Install opencode.

Get an account with z.ai and pay for the basic coding plan.

Clone or initialize a git repository. Run opencode. Everything assumes you run the command in the root of a repository.

/connect and select the Z.AI Coding Plan option. Enter your API key when prompted.

Hit tab to switch to Plan mode.

Type /models and select GLM-4.7 or GLM-5. You generally want a high end model for planning.

Tell it what you want. A bite-size piece. “Set up a project using $language with $framework”. If you’ve managed folks fresh out of a code boot-camp, you can assume they know the basics of how to start a project in the thing they were trained on. This is like that. Except it embeds the knowledge of every basic tutorial and starting guide for everything written before 2025.

Look at the plan. Answer questions it has. Stop it with the escape key and say “Actually I want X” when it goes a weird place. Interrupt. Read the plan. Read the plan.

Look at the context window fill in the top of the opencode TUI. If you’re up at 70% you may well have overfilled it.

If so, tell it to save the plan to a file like TODO.md (make sure it does), start a new session with /new, and then, back in plan mode, say “Start with the plan in TODO.md”. Hopefully your context is emptier now.

If the plan looks sensible, hit tab to switch back to Build mode. Type /models and select GLM-4.7-Flash, unless the task seems complex to you, and will need reasoning decisions that depend on each other to be successful.

Say go or something affirmative. It will go.

Watch it a bit. It may go off the rails. Stop it and say “not like that, do X instead”.

Repeat. Start new sessions frequently, but know that a new session is a fresh-faced developer straight out of boot camp who’s eager to please but has never seen your project before. Save your intro to the project in AGENTS.md in short declarative sentences; “Add a line to AGENTS.md about where test files are located” is a reasonable instruction to give the tool.
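As a sketch, a hypothetical AGENTS.md for a small web project might be nothing more than this (every detail here is made up):

```markdown
# Notes for agents

- Entry point is app.py; this is a small Flask app.
- Test files live in tests/ and run with pytest.
- Never edit anything under migrations/ by hand.
- Prefer small, focused changes with passing tests.
```

Short, declarative, and cheap in tokens: every line here is something the fresh-faced developer would otherwise have to rediscover each session.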

Let’s dig in and understand more deeply what’s going on.

Every code tool is a control loop. It’s not exactly a REPL, so it may cause a bit of cognitive dissonance for those of us who are used to that kind of dialogue with a computer.

Instead of read, evaluate, print result, loop, where the result is inline and we’re managing the system, this is instead side-effectful, and we’re providing signals to a control loop that is fuzzily trying to approach the goal we give it.
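The contrast can be sketched in a few lines of Python. Every name here (propose_action, run_tool, is_done) is an illustrative placeholder, not any tool’s actual API:

```python
# A REPL: you drive every step, and the result comes straight back to you.
def repl(evaluate):
    while True:
        expr = input(">>> ")       # read
        print(evaluate(expr))      # evaluate, print; then loop

# An agentic code tool: you supply a goal, and a loop of model calls plus
# side-effectful tool invocations fuzzily converges on it. The helper
# functions are hypothetical stand-ins for the model and its tools.
def agent_loop(goal, propose_action, run_tool, is_done):
    history = [goal]
    while not is_done(history):
        action = propose_action(history)   # model picks the next step
        result = run_tool(action)          # edits files, runs commands...
        history.append((action, result))   # feedback flows back into context
    return history
```

The important difference is where the human sits: in a REPL you are the loop; here you are one source of feedback into it.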

Fundamentally, LLMs give “completions” to the prompt. This is the aspect that people rightly call “fancy autocomplete”. When you open a tool like opencode, it prepares its system prompt. For opencode specifically, there are different modes (which it calls “agents”), so the prompt differs depending on whether you use plan mode or a session that started in plan and switched to build.

After the prompt, the tool sends text describing what tools are available. This is an area of active research and design: the context window is precious, and making models with larger ones makes them more expensive to run. In early 2025, a long context window was 200,000 tokens. Now million-token context windows exist in the most expensive models. However: more context means slower responses, more cost, and hitting rate limits earlier. Remember that compute power is the contended resource right now, especially given the environmental concerns of data-center build-out. We want to fit as much useful work as we can into the context window of smaller, faster models. Right now, every tool you add to a system has to be described to the model. That adds up. I’ll get into that more below.

All in all, the base prompt may amount to 15,000 tokens. The AGENTS.md file, if you have one (and you probably should), adds however many more. You can eat up a context window quickly. This is one of the fundamental difficulties of working with these tools: the context is precious, but it’s also much slower and more expensive to make multiple runs to load pieces into the context. Different models and tools take different tactics to manage the scarce resource.
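To make the budgeting concrete, here is a back-of-the-envelope sketch using the common heuristic of roughly four characters per token for English prose. Real tokenizers vary by model, and the numbers below are illustrative, not measured:

```python
def rough_tokens(text: str) -> int:
    # Crude heuristic: roughly four characters per token of English prose.
    # Real tokenizers differ per model; treat this as an estimate only.
    return max(1, len(text) // 4)

def context_fill(base_prompt_tokens: int, agents_md: str, conversation: str,
                 window: int = 200_000) -> float:
    # Fraction of the context window consumed so far.
    used = (base_prompt_tokens
            + rough_tokens(agents_md)
            + rough_tokens(conversation))
    return used / window

# A ~15,000-token base prompt plus an AGENTS.md and some conversation
# already takes a noticeable bite out of a 200,000-token window.
fill = context_fill(15_000, "Tests live in tests/. " * 50, "chat " * 4000)
```

Run against a smaller local model with a 32,000-token window and the same inputs, the fraction balloons, which is why context discipline matters most with small models.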

Prompting

Different models were trained in different ways with respect to meta-tokens that carry system meaning, so how you write a prompt with absolute admonitions and precise meanings varies by model. Opencode uses one prompt for Gemini and another for Copilot with GPT-5. There’s none for Z.AI’s coding plan with GLM-4.7 or GLM-5, so we get the raw model’s output; if we wanted to instruct it more like the others, we’d have to write that ourselves. I get great output from GLM-4.7 without doing so, but I may in fact be able to hone it a bit if I did. Other models like Mistral and Llama have somewhat different tokens for system concerns. Having a tool whose prompts aren’t actively hostile to your model is a boon, and one of the reasons I use opencode. It’s also a reason that Claude Code and the Claude models are well matched: being produced by the same company means they’re aligned with each other and will produce better output by default. I expect Codex and GPT-5 are similarly well aligned, though I’ve not tried them. If someone wanted to use a Llama-derived coding model, they might well need to rewrite some system prompting to work better on that model. This is the murky statistical and stochastic work that a lot of AI companies are doing to tune things, and the subject of much AI research. How do we get these probabilistic machines we’ve built to give stronger guarantees?

Something you might note from the plan prompt for opencode is that the only thing keeping the model from pulling a sudo rm -rf / is that nobody asked for it, and the system prompt said not to. There is no actual restriction preventing it except that sudo may ask for a password.

Let me underscore this. Most code tools have no actual guard against touching the wrong files. If they ask permission that is a courtesy the model generated and the tool offered the model.

Security, and the complete lack thereof

Models can and will find clever workarounds while seeking the goal you provide. If it hits a bump — which may be imagined: generating the wrong URL, getting 404, and deciding something doesn’t exist, and therefore it needs to do it with a generated shell script is not unexpected behavior here — it may well treat the rules as obstacles to work around. When AI researchers talk about “alignment”, this is a small part of what they mean. (What they mean tends to be slippery and context-dependent, and also often marketing, but alignment research is at least nominally about how to get models to behave well.)

Training for capability and training for rule-following are somewhat antithetical to each other. There is no single correct balance. This is one of the things that makes these tools dangerous at many levels.

Under no circumstance are these tools truly safe. Tools for managing risk must be an everyday part of your thought process and toolkit.

Let me emphasize that differently:

It takes an immense effort to turn these systems into constrained, safe, and predictable ones, and that is in tension with their usefulness. Risk management rather than risk elimination is almost certainly the mindset to take, but that transforms safety from a nice set-once-and-forget property of the system into an ongoing concern.

Some tools do better than others: OpenCode’s security posture is extremely minimal. Claude Code is probably the most sophisticated with in-band permission checks. Codex actually has a sandbox to some degree.

But in the end, these tools tend to look at most at the first word of a command being run by the “bash tool” and decide whether to prompt for permission based on that. They look at the path of files being read with the “read file” tool and whether it’s outside the project directory. The model can decide to read files with the bash tool instead. It’s perfectly possible for it to decide that what it needs to do to read that file is call npx -p shelljs cat /etc/secrets. The tools are clever, especially the frontier models. Actually running them safely involves rigorous DevOps skills like containerization, virtual machines, and automated orchestration.
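A minimal sketch of that kind of first-word allowlisting, and why it is porous, might look like this. The allowlist and function names are made up for illustration; no real tool’s code is reproduced here:

```python
SAFE_COMMANDS = {"ls", "cat", "grep", "git"}  # hypothetical allowlist

def needs_permission(command: str) -> bool:
    # Typical heuristic: inspect only the first word of the command line.
    first_word = command.strip().split()[0]
    return first_word not in SAFE_COMMANDS

# The check catches the obvious case...
needs_permission("rm -rf /")                  # prompts the user
# ...but the same action laundered through an allowed binary sails past:
needs_permission("git ls-files | xargs rm")   # no prompt
```

The second command starts with an allowlisted word, so the heuristic waves it through even though it deletes files. This is why real containment means sandboxes and VMs, not string matching.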

Thinking

Modern code and reasoning models internally emit self-antagonistic streams of tokens, which act as a sort of threshold gate on decisions: if one part of the model “thinks” you want one thing, another part may “think” about what could be wrong with that, and they go back and forth until something internally stops it (there are thresholds for how much thinking to do, because running that process is expensive) before continuing with the main stream of output. The back-and-forth looks a lot like what some people report their internal narrative to be, though it’s not actually reasoning with mathematical logic the way we expect computers to. It’s doing analogical reasoning: this looks like what comes next. That goes remarkably far: a system trained on a lot of logical processes can analogize to them and get surprisingly good results in new situations. Frontier models manage to do ground-breaking math. Anything with a hard “this is correct” test tends to be something you can throw more and more of this self-antagonistic “thinking” at and get decent results. It breaks down, but with a test for “that’s not right” it can go back and try again until it succeeds. This is one of the fundamental tricks of what’s going on: for something doing sloppy, analogical reasoning, a superhuman amount of it looks like real logic. Combine that with the actual machine logic of testing an assertion, and you get a very powerful, if unpredictable, tool.
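That “try, check, go back and try again” pattern is easy to caricature in code. The generator here is a deterministic stand-in for a model’s proposals; the point is that the hard correctness check, not the proposer, does the real logical work:

```python
def solve_with_retries(propose, is_correct, max_attempts=1000):
    # Generate-and-check: sloppy, analogical proposals plus a hard,
    # machine-checkable test for "that's not right".
    for attempt in range(1, max_attempts + 1):
        candidate = propose()
        if is_correct(candidate):
            return candidate, attempt
    raise RuntimeError("no correct candidate found")

# Stand-in "model": a fixed stream of guesses; the check knows right from wrong.
candidates = iter([13, 99, 42, 7])
answer, tries = solve_with_retries(
    propose=lambda: next(candidates),
    is_correct=lambda x: x == 42,
)
```

Swap the fixed list for a model’s proposals and the equality check for a compiler run or a test suite, and you have the shape of agentic coding.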

Feedback & Cybernetics

Here we get to the core of the system: at its heart, building software with these systems is operating a control loop. Your instructions, the prompts, checks inserted by tools, the additional information surfaced by tools, and your corrections and interruptions form a regulatory feedback loop. The models are trained to seek the prompted outcome as a goal. Everything else modulates that with positive and negative feedback, and then the loop continues.

The loop is essentially “while not done, try to make progress toward the goal”.

Sometimes the loop has to be stopped for input - the model can ask a question or assert that it’s gotten to something near enough the goal to be “done” in some way. Sometimes the loop is stopped because of a system failure. In a “multi-agent” set-up, the loop may be stopped by a superior monitoring “agent” and corrected and started again, or stopped entirely and abandoned.

Fundamentally we are part of the loop unless we decide to set up a system to delegate that. This is what cybernetics actually means: control loops. Sometimes as simple as a thermostat (“it’s too cold, turn on the heat. Loop.” “Now it’s warm enough, turn it off. Loop.”), sometimes vastly complex, as in networks of sensors and actuators. You can look at the economy as a cybernetic system too, with money and prices being the information and influences flowing around and altering the “loop” of “how can we make more money today?” and “how can we serve human needs today with the resources we have?”
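The thermostat version of that loop fits in a few lines. A minimal sketch, with made-up setpoints:

```python
# The thermostat loop from the text, as literal code: a minimal
# cybernetic control loop with hysteresis around a setpoint.
def thermostat_step(temp: float, heater_on: bool,
                    low: float = 18.0, high: float = 21.0) -> bool:
    """One pass of the loop: decide whether the heater should be on."""
    if temp < low:
        return True    # "it's too cold, turn on the heat. Loop."
    if temp > high:
        return False   # "now it's warm enough, turn it off. Loop."
    return heater_on   # in between: keep doing what we were doing

# Simulate a few passes of the loop.
state = False
history = []
for reading in [17.0, 19.0, 22.0, 20.0]:
    state = thermostat_step(reading, state)
    history.append(state)
# history == [True, True, False, False]
```

Swap the temperature sensor for a test suite and the heater for “keep editing code”, and you have the shape of what a code tool does.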

We have started a second golden age of cybernetics. Hopefully we can keep it from being a paperclip-style money-maximizing machine at the expense of everything else.

MCP servers

Almost all code tools support “MCP servers”: they are roughly an API meant for LLM consumption rather than human or classical program consumption. Every tool offered by an MCP server usually gets added to the context, in a way resembling “Use the frob tool when you need to frob a file. It expects the file name to frobnicate, and optionally the date”. This is roughly equivalent to an API offering “call frob(file, [date]) to frobnicate a file”. In the world of LLMs though, inputs and outputs can both be a bit fuzzy and still work, depending on the context. Models will call MCP servers by emitting special tokens for calling tools, with the arguments inside.
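As a sketch, the hypothetical “frob” tool above might be described to the model with a definition like this. The `name`/`description`/`inputSchema` field names follow the MCP tool shape; the tool itself is made up:

```python
# What the hypothetical "frob" tool might look like as an MCP tool
# definition: a name, a natural-language description the model reads,
# and a JSON Schema describing the arguments.
frob_tool = {
    "name": "frob",
    "description": "Use the frob tool when you need to frobnicate a file.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "file": {"type": "string", "description": "File to frobnicate"},
            "date": {"type": "string", "description": "Optional date"},
        },
        "required": ["file"],
    },
}

# The model emits a tool call whose JSON arguments match the schema:
call_arguments = {"file": "notes.txt"}
assert set(call_arguments) <= set(frob_tool["inputSchema"]["properties"])
```

The description is prose because the consumer is a language model: fuzzier than a function signature, but serving the same role as `frob(file, [date])` in a classical API.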

Tools

Most code tools also ship internal tools that work much like MCP servers - “read file”, “bash”, and “edit file” tools are the heart of a code tool. A basic code tool is those three features bolted to a while loop and an input text box. They can be very simple in essence. The complexity is largely an emergent property of a very simple loop with complex and rich variation.
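A minimal sketch of that idea: three tools bolted to a while loop. `call_model` is a stand-in for a real LLM API, and the action format is invented; nothing here is any particular tool’s implementation:

```python
# Toy skeleton of a code tool: three tools plus a while loop.
import subprocess

def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def edit_file(path: str, content: str) -> str:
    with open(path, "w") as f:
        f.write(content)
    return "ok"

def bash(command: str) -> str:
    return subprocess.run(command, shell=True,
                          capture_output=True, text=True).stdout

TOOLS = {"read_file": read_file, "edit_file": edit_file, "bash": bash}

def agent_loop(goal: str, call_model) -> str:
    history = [goal]
    while True:  # "while not done, try to make progress toward the goal"
        action = call_model(history)       # model picks the next step
        if action["type"] == "done":
            return action["answer"]
        result = TOOLS[action["tool"]](*action["args"])
        history.append(result)             # feed the result back into the loop
```

Everything else a real tool does - permission prompts, context management, sub-agents - is elaboration on this skeleton.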

Skills

Code tools are starting to support a concept called “Skills” which are markdown documents describing how and how not to do something. The markdown documents are marked with enough context to know when to read them, so the descriptions of skills end up in the context, but the specific instructions aren’t read until the skill is needed, so it’s a way of breaking up a set of prompts into pieces so they don’t overwhelm the context window.
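As an illustrative sketch, a skill file is just markdown with a small header the tool can scan cheaply; the exact header fields vary by tool, and everything below is made up:

```markdown
---
name: release-notes
description: Use this skill when asked to draft release notes for this project.
---

# Writing release notes

- Read CHANGELOG.md first; group entries by Added / Changed / Fixed.
- Do NOT invent changes that are not in the commit history.
- Keep each bullet to one sentence.
```

Only the name and description live in the context at all times; the body is pulled in when the task matches.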

Code Mode

In addition to directly emitting tool call tokens, it’s increasingly popular to have the LLM emit a script and use that: models are good at writing code, but calling a tool, reading the response into context, then writing it back out into another tool is slow and expensive. People are building systems that instead tell the LLM to just write a script, passing outputs of one tool into the inputs of another. There are tools like port of context and mcporter that expose MCP servers as callable CLIs or APIs, putting a more classical programming interface on the new world of MCP servers and tool definitions. It’s a lot more context-efficient in many cases, so it’s a worthy technique.

It’s called “code mode” but it’s not a mode in the sense of separate and mutually exclusive, but mode as in “way of doing things”. It is a mode of operation, not a setting you set.
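A sketch of what such a generated script might look like, assuming a hypothetical `issues` tool already exposed as a plain Python object by a bridge in the spirit of mcporter (the names and methods are mine, not any real server’s):

```python
# Code mode in miniature: instead of the model calling a search tool,
# reading every result into context, then calling a close tool once per
# result, it writes one script that chains the tools directly.
def close_stale(issues):
    """Chain two 'tools' locally: list, filter, act; return a tiny summary."""
    stale = list(issues.search(label="stale"))
    for issue in stale:
        issues.close(issue["id"])
    # Only this one short line needs to flow back into the model's context:
    return f"closed {len(stale)} stale issues"
```

The intermediate results never touch the context window; only the summary does.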

Context compaction

Tool calls end up with the inputs and outputs in the context window. Claude Code and Codex and some of the more mainstream AI company tools try to manage the context for you, and so will “compact” if the window gets full, using the LLM to summarize the session so far, and replace the whole transcript with the summary, so it fits in the window. It’s a useful technique, and controlling it and doing it well is a place future research will bear fruit.

For opencode, there’s a /compact command that does this summarization, and you can invoke it when the context window is getting full.

Compaction erases the specifics of what was going on, so you want to avoid interrupting a fine reasoning process where the details matter to the future actions, but it is sometimes unavoidable.

There is a dynamic context pruning plugin for opencode that’s quite good: it provides a tool and some manual /commands to prune useless information out of the context, so both you as the operator and the model can decide to call it and prune things; in particular, it removes file-write outputs from the history when there’s a later read of the same file. There’s no point in that clutter being in the context unless the tool has a reason to be aware of its own process in that kind of detail, which is basically never true. The plugin author has future ambitions to make it smarter and more capable.

A warning, however: some providers (Anthropic and OpenAI both, among others) cache prompts, and charge a lot less for tokens that have been processed before. The caching works on prefixes, so instead of paying for the whole chat each time a piece is added, you pay full price for only the added tokens. This means that compaction, by changing earlier parts of the chat, breaks the cache at the point of its first edit. There is a trade-off here, especially if you’re using pay-as-you-go pricing.
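A rough sketch of that trade-off with made-up prices; real rates and cache discounts vary by provider:

```python
# Back-of-envelope: why compaction interacts badly with prefix caching.
# Illustrative prices only: $3 per million uncached input tokens, with
# cached prefix tokens at a tenth of that.
PRICE = 3.00 / 1_000_000
CACHED = PRICE / 10

def turn_cost(context_tokens: int, new_tokens: int, cache_intact: bool) -> float:
    """Cost of one request: the old prefix plus the newly added tokens."""
    if cache_intact:
        return context_tokens * CACHED + new_tokens * PRICE
    # An edit anywhere in the prefix (e.g. compaction) breaks the cache,
    # so the whole context is billed at the uncached rate again.
    return (context_tokens + new_tokens) * PRICE

with_cache = turn_cost(150_000, 2_000, cache_intact=True)      # ~$0.051
after_compaction = turn_cost(150_000, 2_000, cache_intact=False)  # ~$0.456
```

Roughly a 9x difference on that one turn, under these assumed prices. Compaction still wins when the alternative is overflowing the window; it just isn’t free.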

Meta-tooling & Sub-agents

In the scale of code tool use, from “pasting snippets of code into chat” to “hands off orchestration of complete multi-day workflows”, what I’ve described so far is somewhere closer to the first. So far I’ve suggested single-stream chat loops making edits to files.

Tools like opencode do a little bit more - they are configured with some hidden alternate prompts and the top level chat will actually call these “sub-agents” like they’re tools, so with a fresh context it will do some more specific task, faster and cheaper and better because it’s not trying to make it consistent with the greater chat context, just its own instructions. This is the tip of a very large iceberg.

To do large-scale work, where the context would be used up by the history or the scope of the plan, and with the tendency for models to go off the rails, people add more layers of control systems on top: have one model monitor another, feeding it commands and instructions or admonishing it when it goes off the rails. Just as with internal “thinking”, people have applied antagonistic counter-processes in a feedback loop to get bigger tasks done, at the cost of burning a lot of compute time. The most reckless of these might be Gastown, but the general pattern is sound, even if a giant heap of vibe-coded slop doing it with negative safeties, built by someone caught up in the addictive power of controlling this arcane machine, maybe isn’t a great example.

OpenCode’s simple sub-agents are a good start. Some people have tools to open up a bunch of tmux windows with separate instances of opencode; some people wire up plugins or hooks to Claude Code to whack Claude and tell it to keep going when it doesn’t create the tests you asked it to. (Claude and lots of other models will mirror the human training data: 100% coverage is hard, and what if we just stopped? A small tweak to the goal makes a bunch less effort…) - since these systems necessarily have to treat the human prompt input as fuzzy, not exact, this makes some sense. It’s not good behavior, but it is understandable.

The more hands-off the person is on any of the control loops, the more can go awry from personal expectations, burning tokens, time, and credibility. It won’t necessarily, but this is a system that can keep running, adapting to situations pretty readily in pursuit of its goal. We have not figured out all the patterns for controls.

Languages that bring success

Not all languages are as friendly for LLM use as others. In general, you find that the more definite a language is about what’s right and wrong, the easier it is for the system to learn. In addition, what was popular in the training data - both patterns and languages - will be more readily replicated than things that are novel since the model was trained, or obscure.

Languages that allow monkey-patching may end up with the LLM taking some dangerous shortcuts, altering the runtime to suit a task but breaking it for others. Python is very popular in the training data so it’s generally pretty okay, but even so, the tools have no problem coming up with bad ideas for how to modify Python inline to make a task easier as they seek their goal.

Languages with strong type systems, even if not sound ones, like TypeScript, Rust, Java, and C#, will tend to do very well. The patterns are strong and definite and the tools will struggle with them less. Haskell works well, too.

Dynamic languages and obscure ones will present many more troubles: I fully expect the LLMs to trip over Perl, Smalltalk, and Lisp quite a lot.

Go is pretty phenomenal: there’s so much of it out there, and it’s all quite uniform and not complex code. Its complexity matches what LLMs are good at without deep thought, so they can truly churn out mind-bending amounts of it, for better and worse.

Build Guard-rails

Add more tests. Get the LLM to write tests, and to suggest tests to write. The effort required for testing is entirely different now: we still have to read the tests and think about them just as much, but that’s most of what we have to do. The actual implementation can be mostly mechanical, with verifying that the tests actually assert what they claim to test making up most of the review.

Add linters if you want the code to look a certain way beyond parsing or compiling.

Add guidelines to follow. Add explicit demands of the LLM in AGENTS.md or similar documents and demand that it follow them. Use exceedingly strong language to do so. You could do worse than RFC 2119 / 8174 / BCP 14 keywords.
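For example, an illustrative AGENTS.md fragment using those keywords; the rules themselves are placeholders, not recommendations for your project:

```markdown
# Project rules

- You MUST run the test suite before declaring any task complete.
- You MUST NOT delete, skip, or silence failing tests to make the suite pass.
- You SHOULD prefer small, reviewable changes over sweeping rewrites.
- You MAY ask for clarification instead of guessing at requirements.
```

The capitalized keywords carry weight in the training data, which is part of why the RFC 2119 register works on these systems.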

These are not absolute gates, but they absolutely help regulate the control loop to stay in bounds. When something screws up, add more instruction to skills or AGENTS.md. Have the LLM write code to enforce the future.

Eyes to the future

I’m going to write later about some philosophies I have toward these tools, but for now I hope this was a more practical essay that lets you operate a code tool with an LLM with some understanding of how they actually work, rather than treating them as a fickle oracle. They are tools and can be understood, and the usual ways of doing that work - except for the sometimes inhuman scales that we can now trivially create. It’s entirely possible to overwhelm our senses and sensibilities and that will lead us to some confusing and dangerous places.

Take care.

The AI hater's guide to code with LLMs (The Overview)

Introduction

This is the post I don’t think many people expected me to write. I have (rightly!) surrounded myself with people who are generally uncomfortable with, or somewhat hostile to, “AI”, mostly for good reasons, though I’ll get into the many caveats on that as I go.

As activists mitigating the harms of “AI”, we need to be well informed, and we need to understand what the specific harms are. Treating it with a hands-clean purist mindset will be extremely difficult and as activism, more alienating than effective. These are genuinely useful tools, and pretending they aren’t will not in fact win many hearts and minds.

This post is going to be very long, because in addition to technical context, I’m touching on social issues, discourse norms, the culture of rising technocracy and fascism funded by venture capital, and the erosion of our information systems and cultural norms all at once. I can’t get into all of it here, but I am not staying away from it on purpose.

Overall, I still believe that LLMs are a net negative for humanity, that the destruction of our infosphere is going to have generational consequences, and that if the whole thing disappeared from the face of the earth tomorrow, I wouldn’t be sad. The damage would still be out there, but the cheap bullshit pervading everything would at least recede to human content-mill scale. Not that that was good before LLMs came along and made it this bad, but it was better.

That said, that’s not going to happen, and the amount of effort required to make it happen would be much better spent on organizing labor and climate action. The AI industry may collapse like a house of cards; I think it somewhat likely, considering the amount of financial trickery these companies are using. But as someone I know put it: we’re not just going to forget that computers can write code now. We aren’t.

I want you to think about all of this with an intensely skeptical mind. Not hostile, mind you, but skeptical. Every claim someone makes may well be checkable. You can check! I recommend you do so. My math in this essay will be rough back-of-envelope calculation, but I think that’s appropriate given the tendency of technology costs to shift by orders of magnitude, and of specific situations to vary by at least a factor of two.

And since we’re both operating in the domain of things not long ago considered science fiction, and because the leadership of AI companies tend to be filled with people with a love of science fiction, many of whom won’t hesitate to, as is said, create the Torment Nexus from the popular science fiction novel Don’t Create The Torment Nexus, I suggest one story to read and keep in mind: Marshall Brain’s “Manna – Two Views of Humanity’s Future”.

TL;DR

  • There are open models and closed; good code work needs, at least in part, models that require very high-end hardware to run.
  • Chinese models are quite good, and structured differently as companies.
  • Don’t bother running models on your own hardware at home to write code unless you’re a weird offline-first free software zealot. I kind of am, and still I see the futility with the hardware I have on hand.
  • Nobody agrees on the right way to do things.
  • Everyone is selling something. Usually a grand vision hand-waving the hard and bad parts.
  • I’ll write more about how to actually use the tools in another segment.
  • A lot of the people writing about this stuff are either executives who want to do layoffs, or now-rich people who made it big in some company’s IPO. Take what they say with the grain of salt you’d use for someone insulated by money and who can have free time relatively easily. They are absolutely hand-waving over impacts they themselves will not experience.

A note on terms

I am writing this with as much verbal precision as I can muster. I loathe terms like “Vibe Code”, and in general I am not jumping on any marketing waves and hype trains. I’m being specifically conservative in the words I use. I say LLM, not “AI”, when talking about the text generation models at the heart of most of the “AI” explosion. I’ll prefer technical terms to marketing buzzwords the whole way through, even at the cost of being awkward and definitely a little stodgy. Useful precision beats vacuous true statements every time, and the difference now very much matters.

The Models

There are a zillion models out there. Generally the latest and greatest models from the most aggressive companies are called “frontier” models, and they are quite capable. The specific sizes and architectures are somewhat treated as trade secrets, at least among the American companies, so things like the power required to operate them and the kind of equipment needed are the sort of thing analysts in the tech press breathe raggedly over.

The American frontier models include:

  • Anthropic’s “Claude Opus”
  • OpenAI’s GPT-5.2
  • Google Gemini 3 Pro
  • something racist from xAI called Grok.

The frontier models are a moving target as they’re always the most sophisticated things each company can put forth as a product, and quite often they’re very expensive to run. Most of the companies have tools that cleverly choose cheap models for easy things and the expensive models for difficult things. Remember this when evaluating anything resembling a benchmark: it’s an easy place to play sleight of hand.

When you use a frontier model company’s products, most of the time you interact with a mix of models. The main mode is usually a somewhat cheaper-to-run version of the frontier model, sometimes with the true best model as an option that only sometimes gets invoked, and the whole thing is hidden behind a façade that makes it all look the same. Version numbers often resemble cell phone marketing, with a race to have bigger numbers, and “X” and “v” placed to make things seem exciting. There is no linear progression nor comparability among the numbers in the names of models or products.

I largely have no interest in interacting with the American frontier model companies, as their approach is somewhat to dominate the industry and burn the world doing it. Anthropic is certainly the best of the bunch but I really don’t want to play their games.

I do not know this for sure, but I expect these models run into the terabytes of weights, more than a trillion parameters, plus they are products with a lot of attached software — tools they can invoke, memory and databases and user profiles fed into the system.

Behind them are the large models from other AI companies, largely Chinese, producing research models that they and others operate as services, and often they are released openly (called “open weights models”). Additionally some of the frontier model companies will release research models for various purposes. All core AI companies pretty much style themselves as research organizations first, and product companies second. Note that nearly every AI company calls its best model a frontier model, whether it fits with the above or not.

Chinese companies and therefore models often have a drive for efficiency that the American ones do not. They are not the same kind of market-dominating, monopolist-oriented sorts that VC-funded American companies are. They aren’t as capable, but they do more with less. They’re very pragmatic in their approach compared to the science-fiction-fueled leadership of American AI companies. These models run in the hundreds of gigabytes and have hundreds of billions of parameters, though most can be tweaked to run some parts on a GPU and the rest on a CPU in main memory, if slowly. They can run on regular PC hardware, if extremely high-end hardware, and distillations and quantizations of these models, while they lose some fidelity, fit on even more approachable hardware. Still larger than most people own, but these are not strictly data-center-only beasts.

Large, capable open models (Mostly Chinese) include:

  • z.AI’s GLM-4.7 and GLM-5
  • Kimi K2.5
  • MiniMax M2.1
  • DeepSeek-V3.2
  • Alibaba’s Qwen3-Max
  • Mistral Large 3
  • Trinity Large

Mistral Large 3 comes out of Europe. Trinity comes out of the US, but has a less “win the AI race” mindset. There’s a lot of superpower “We need our own sovereign solution” going on. China, the US and Europe are all making sure they have a slice of the AI pie.

I’m sure there’s more — the field is ever changing, and information about the models from Chinese companies percolates slowly compared to the American frontier models.

Behind these models are specialized smaller models, often sort-of good for code writing tasks if one isn’t challenging them, but I actually think this is where the line of usefulness is drawn.

Medium-small coding models include:

  • Qwen2.5-Coder
  • GPT-OSS 120b
  • Mistral’s Codestral
  • GPT-4.7-Flash
  • Claude Haiku
  • Gemini 2.5 Coder
  • Smaller versions of Qwen3
  • Smaller versions of many other models

There are also some much smaller models that will run on large gaming GPUs. I don’t think they’re very useful: they’re attractive toys that people can get to do some truly impressive things, but they’re not all that. They are, however, about at the capability that knee-jerk AI-haters expect: error-prone lossy toys that, if anyone called them “the future”, I’d laugh in their face or spit at their feet. Notice how far down the list this is.

The Economics

LLMs are expensive pieces of software to run. Full stop: anything with broad utility requires a GPU beefier than most high-end gaming PCs have, and quite a lot of RAM. I am setting a high bar here for utility, because AI boosters tend to have a frustrating way of equivocating, showing low numbers for costs when it suits them and high numbers for performance, even when the two aren’t from the same models. There are domain-specific tasks and models that can work on a small GPU or even a Raspberry Pi, but for general-purpose “reasoning” tasks, and coding specifically - right now in 2026, with current model efficiencies and current hardware - if you want to use LLMs for writing software, you will be throwing a lot of computing power at it. A $5000 budget would barely suffice to run something like GPT-OSS 120b (OpenAI’s open model that is okay at code-writing tasks). Additionally, if you kept the model busy 100% of the time, you might be talking $50-$200 per month in electricity, depending on local prices.

If you spent $15,000 and triple the electricity, you could run something like GLM-4.7 at a really good pace.
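The back-of-envelope behind those electricity numbers, with my own assumed wattage and rates rather than measurements:

```python
# Rough electricity math for a workstation running a model flat out.
# Assumptions (mine): ~700 W average draw, busy 24 hours a day.
watts = 700
hours_per_month = 24 * 30
kwh = watts * hours_per_month / 1000          # 504 kWh per month

# Cheap vs. expensive grid prices, dollars per kWh:
costs = {rate: round(kwh * rate, 2) for rate in (0.10, 0.40)}
# roughly $50/month at $0.10/kWh, roughly $200/month at $0.40/kWh
```

Which lands in the $50-$200 range above; your wattage and local rates will move it.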

Water cooling for data centers is probably the most talked about environmental cost, but I think it’s actually a distraction most of the time. Dear god why do people build data centers in Arizona, that’s a travesty, but also that’s a specific decision made by specific people with names and addresses who should be protested specifically.

Data-center growth at the cost of people driving up electricity demand is a big problem, and we need to get back on the solar train as fast as possible.

This is not inexpensive software to run. However, it’s not an unfathomable amount of power.

Training models is wildly expensive, but it amortizes. There are in fact difficult economic conversations we need to be having here, but it’s all obscured by the fog of “what about the water?” and “AI will save us all and change everything!” that pervades the discourse. The framing of the arguments at large is fundamentally misleading, by basically everyone, pro- or anti-AI, and much more about affiliative rhetoric than argumentative. We need to have the arguments, and actually look for and persuade people of the truths. They’re uncomfortable, so I fully understand why we don’t do this very often, but if we want to actually solve crises, we need to talk with actual truths in mind.

With prices of $200/month for “Max” plans, if one uses the tools well, a company would in fact be making a smart decision to get their developers using them. They are definitely priced below cost, probably by at least 3-5x. Maybe 10x. (Remember that a price shock will come at some point, so be wary of depending on the economics of these systems in existential ways for a business.)

Even at cost the math works out for a great many use cases.

Light plans are $20/month, and I think that for intermittent use, with good time sharing, that’s quite sustainable. In my experimentation I’m paying even less than that, and while I don’t think those prices will be sustained, I don’t think they’re impossible either.

Most of the big providers and almost all of the hosted open-model providers have a pay-by-the-token API option. This is an unpackaged, a-la-carte offering in the style of cloud providers: they nickel-and-dime you, and the pricing model, while transparent, is hard to predict. The usual rates are quoted per million input tokens and per million output tokens. Input tokens are cheaper, but interactions with tools re-send them over and over, so you get charged for them multiple times. Output tokens are more expensive but closer to one-time costs. Expensive models can be $25 per million output tokens and $5 per million input tokens (Claude Opus 4.6). I expect this reflects a decent margin on the true costs, but I don’t have a ton to back this expectation up. Most open models run in the realm of $0.50-$3 per million input tokens and $1-$5 per million output tokens. Given that a lot of the open models are run by companies with no other business than running models, I expect these represent near-true financial costs. There’s no other business nor investment to hide any complexity in.
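A rough worked example of how re-sent input tokens dominate the bill, ignoring prefix caching; all the numbers are illustrative:

```python
# Back-of-envelope: input tokens are "cheap" per token but get re-sent
# with every tool-call round trip, so they dominate the session cost.
input_price = 5 / 1_000_000    # $ per input token (an expensive model)
output_price = 25 / 1_000_000  # $ per output token

context = 40_000        # tokens of prompt + files + history per request
turns = 25              # tool-call round trips in one working session
output_per_turn = 1_000

input_cost = context * turns * input_price            # $5.00: context sent 25x
output_cost = output_per_turn * turns * output_price  # $0.625
# The "cheap" input side is 8x the "expensive" output side here.
```

Prefix caching (discussed under context compaction) is what keeps this from being even worse.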

The Tools

Most of the tools can talk to most of the models in some way. Usually each has a preferred model provider, and doing anything else will be a lesson in configuration files and API keys. Some more so than others.

Most of the tools are roughly as secure as running some curl | bash command. They kinda try to mitigate the damage that could happen, but not completely, and it’s a losing battle with fundamentally insecure techniques. Keep this in mind. There are ways to mitigate it (do everything in containers) but you will need to be quite competent with at least Docker to make that happen. I have not done that; instead I’m being a micromanaging busybody and not using anything resembling “YOLO mode”. I also back everything up, and I don’t give the tools permission to write to remote repos, just local directories.

I know terminal-based tools more than IDEs, though I’ll touch on IDE-integrated things a bit. I haven’t used any web-based tools. I grew up in terminals and that’s kinda my jam.

  • Claude Code is widely acknowledged as best in class, has a relatively good permission model, and lots of tools hook into it. It’s the only tool Anthropic allows with their basic consumer subscriptions. If you want to use other tools with Claude, you have to pay by the token. Can use other models, but it’s a bit of a fight, and a lot of APIs don’t support Anthropic’s “Messages” API yet.
  • OpenAI Codex is OpenAI’s tooling. It’s got decent sandboxing, so that what the model suggests to run can’t escape and trash your system nearly so easily. It’s not perfect but it’s quite a bit better than the rest. It’s a bit of a fight to use other models.
  • OpenCode touts itself as open source, and most of it actually is. It’s a bit less “please use my company’s models” than most tools, and it’s the tool I’ve had the best luck with. It has two modes — Build and Plan — and using both is definitely a key to using the tool well. Plan mode creates documents and written plans. Build does everything else and actually changes files on disk.
  • Kilo Code is both a plugin for VS Code, and a tool in the terminal. It has not just two modes but five, and more can be customized. “Code”, “Architect”, “Ask”, “Debug”, and “Orchestrator”. Orchestrator mode is interesting in that it’s using one stream of processing with one set of prompts to evaluate the output of other modes. This should allow more complex tasks without failing because there’s a level of oversight. I’ve not used this yet, but I will be experimenting more. Its permission model is pretty laughable but at least it starts out asking you if it can run commands instead of just doing it.
  • Charmbracelet Crush is aesthetically cute but also infuriating, and it’s very insistent on advertising itself in commit messages. I’ve not yet seen if I can make it stop, but it did make me switch back to OpenCode.
  • Cursor — App and terminal tool. Requires an account and using their models at least in part, though you can bring your own key to use models through other services.
  • Cline — Requires an account. IDE plugins and terminal tools.
  • TRAE — IDE with orchestration features. Intended to let it run at tasks autonomously. I’ve not used it.
  • Factory Droid. Requires an account. Can bring your own key.
  • Zed. IDE editor, with support for a lot of providers and models.

TL;DR

I like OpenCode; Kilo Code and Charmbracelet Crush are runners-up. The (textual) user interface is decent in all three: not loud, not fancy, but pretty capable. At some point I’ll try orchestration, and then maybe Kilo Code will win the day. You’re not stuck with just one tool, either.

Antagonistic Structures and The Blurry JPEG of the Internet

At its core, you can think of LLMs as extremely tight lossy data compression. The idea that it is a “blurry JPEG of the internet” is not wrong in kind, though in scope it understates it. Data compression is essentially predicting what’s next, and that’s exactly what LLMs do. Very different specifics, but in the end, small bits of stuff go in, large outputs come out. It’s also “fancy autocomplete”, but that too undersells it because when you apply antagonistic independent chains of thought on top, you get some much more useful emergent behavior.

A pattern that you have to internalize is that while lots of these tools and models are sloppy and error-prone, anything you can do to antagonize that into being better will be helpful. This is the thing where I show you how LLM code tools can be a boon to an engineer who wants to do things well. Suddenly, we have a clear technical reason to document everything, to use a clear type system, to clarify things with schemas and plans, to communicate technical direction before we’re in the weeds of editing code. All the things that developers are structurally pushed to do less of, even though they’re always a net win, are rewarded.

You will want your LLM-aided code to be heavily tested. You will want data formats fully described. You will want every library you use to have accurate documentation. You will use it. Your tools will use it.

You will want linters. You will want formatters. Type systems help.

This pattern goes deep, too. Things like Kilo Code’s “Orchestrator” mode and some of Claude Code’s features work as antagonistic checks on other models. When one model says “I created the code and all the tests pass” by deleting all the failing tests, the other model which is instructed to be critical will say “no, put that back, try again”.

One of the big advances in models was ‘reasoning’, which is internally a similar thing: when you make a request, the model is no longer simply completing what you prompted; instead, several internal chains of thought approach it critically, and then, when some threshold is met, it continues completing from there. All the useful coding models are reasoning models. The model internally antagonizes itself until it produces something somewhat sensible. Repeat as needed to get good results.

Even then, with enough runtime, Claude will decide that the best path forward is to silence failing tests, turn off formatters, or put in comments saying //implement later for things that aren’t enforced. Writing code with these tools is very much a management task. It’s not managing people, but sometimes you will be tempted to think so.

The Conservative Pressure

So here’s the thing about LLMs. They’re really expensive to train.

There are two phases: “pre-training” (building the raw model, which is most of the work) and “post-training” (tailoring a general model into one for certain kinds of tasks).

Models learn things like ‘words’ and ‘grammar’ in pre-training, along with embedded, fuzzy knowledge of most things in their training set.

Post-training can sort-of add more knowledge, giving it a refresher course in what happened since it came out. There’s always a lag, too. It takes time to train models.

The thing is, though, that the models really do mostly know only what they were trained on. Any newer information almost certainly comes from searches that the model and its tools run together and stuff into the context window of the current session, but the model doesn’t know it, really.

The hottest new web framework of 2027 will not in fact be new, because the models don’t know about it and won’t write code for it.

Technology you can invent from first principles will work fine. Technology that existed and was popular in 2025 will be pretty solid. For something novel or niche, code generation will go off the rails much more easily without a lot of tooling.

This is, in the case of front-end frameworks, maybe a positive development in that the treadmill of new frameworks is a long-hated feature of a difficult to simplify problem space for building things that real people touch.

In general however, it will be a force for conservatism in technology. Expect everything to be written in boring ways, as broken as they ever were in 2025, for a while here.

They’re Making Bullets in Gastown

There’s a sliding scale of LLM tools, with chats on one end and full orchestration systems of independent streams of work being managed by yet more LLM streams of work at the other. The most infamous of this is Gastown, which is a vibe-coded slop-heap of “what if we add more layers of management, all LLM, and let it burn through tokens at a prodigious rate?”

Automating software development as a whole will look a lot like this - if employers want to actually replace developers, this is what they’ll do. With more corporate styles and less vibe coded “let’s turn it all to eleven” going on.

Steve’s point in the Gastown intro is that most people aren’t ready for Gastown and it may eat their lunch, steal their baby and empty their bank account. This is true. Few of us are used to dealing with corporate amounts of money and effort and while we think a lot about the human management of it, we don’t usually try to make it into a money-burning code-printer. I think there’s a lot of danger for our whole world here. Unfettering business has never yielded unambiguously good results.

Other tools like this are coming. While I was writing this, multi-claude was released, and there’s more too: Shipyard, Supacode, everyone excited about replacing people is building tools to burn more tokens faster with less human review. They’re writing breathless articles and hand-waving about the downsides (or assuming they can throw more LLMs at it to fix problems.)

I personally want little part in this.

Somewhere much further down the scale of automation is things like my friend David’s claude-reliability plugin, which is a pile of hacks to automate Claude Code to keep going when it stops for stupid reasons. Claude is trained on real development work and “I’ll do it later” is entirely within its training set. It really does stop and put TODOs on the hard parts. A whack upside the head and telling it to keep going sure helps make it make software that sucks less.

Automating the automation is always going to be a little bit of what’s going on. Just hopefully with some controls and not connecting it to a money-funnel and saying full speed ahead on a gonzo clown car of violence.

There’s a lot of this sort of thing.

The Labor Issue

The labor left has had its sights on AI for a while as the obvious parallel to the steam looms that reshaped mill work from home craft to extractive industry. We laud the Luddites, who, contrary to popular notions about them, were not anti-technology per se; they saw the extractive nature of the businesses using these machines, turning a craft one might make a small profit at into a job where people get used up, mind and body, and exhausted. They destroyed equipment and tried to make a point. In the end they had only moderate success, though they and the rest of the labor movement won us such concepts as “the weekend” and “the 8-hour day”.

Even the guy who made Gastown sees how extractive businesses can - or even must! - be. Maybe especially that guy. We’re starting to see just how fast we can get the evil in unfettered business, capital as wannabe monopolists, to show itself.

Ethan Marcotte knows what’s up: We need to unionize. That’s one of the only ways out of this mess. We, collectively, have the power. But only collectively. We don’t have to become the protectionist unions of old, but we need to start saying “hey no, we’re not doing that” en masse for the parts that bring harms. We need to say “over my dead body” when someone wants to run roughshod over things like justice, equality, and not being a bongo-playing extractive douchecanoe. We’ve needed to unionize for a long time now, and not to keep wages up but because we’re at the tip of a lot of harms, and we need to stop them. The world does not have to devolve into gig work and widening inequality.

Coding with automated systems like this is intoxicating. It’s addictive, because it’s the loot-box effect. We don’t get addicted to rewards. We get addicted to potential rewards. Notice that gamblers aren’t actually motivated by having won. They’re motivated by maybe winning next time. It can lead us to the glassy eyed stare with a bucket of quarters at a slot machine, and it can lead us to 2am “one more prompt, maybe it’ll work this time” in a hurry. I sure did writing webtty.

There’s Something About Art…

Software, while absolutely an art in many ways, is built on a huge commons of open work done by millions of volunteers. This is not unambiguously always good, but the structure of it makes the ethics of code generation more complex and nuanced than it is for image generation, writing generation, and video generation. We did in fact put a ton of work out there with a license that says “Free to use for any purpose”. That’s not to say every scrape of GitHub was ethical: I’m sure AGPL code was snaffled up with the rest, and ambiguously or non-permissively licensed code too. It is however built on a massive commons where any use is allowed. The social status quo was broken, but the legal line at least is mostly in the clear. (Mostly. This is only a tepid defense of some of the AI company scrapes.)

AI image generation and video generation can get absolutely fucked. It was already hard to make it as an artist because the value of art is extremely hard to capture. And we broke it. Fuck the entire fucking AI industry for this and I hope whoever decided to make it a product first can’t sleep soundly for the rest of their life. I hope every blog post with a vacuously related image with no actual meaning finds itself in bit rot with alacrity.

Decoding the Discourse

It’s helpful to know that the words used to describe “AI” systems are wildly inconsistent in how people use them. Here’s a bit of a glossary.

Agent:

  1. A separate instance of a model with a task assigned to it.
  2. A coding tool.
  3. A tool of some kind for a user to use but that can operate in the background in some way.
  4. A tool models can invoke
  5. A service being marketed that uses AI internally.
  6. A tool that other agents can use.

Agentic: in some way related to AI.

Orchestration: Yo dawg I heard you liked AI in your AI, so I put AI in your AI so your AI can AI while you AI your AI.

Vibe Coding:

  1. coding with LLMs.
  2. using LLMs to write code without looking or evaluating the results.

A coda

In the time it took to write this over a week or more, Claude Opus 4.5 gave way to Claude Opus 4.6. GLM-4.7 was surpassed by GLM-5 just today as I write this bit, but z.ai is now overloaded trying to bring it online and has no spare computing power. All my tools have had major updates this week. The pace of change is truly staggering. This is not a particularly good thing.

I may edit this article over time. No reason we can’t edit blog posts, you know. Information keeps changing with new data and context.

Now go out there and try your best to make the world better.

Wait, fiber optics are how cheap now?!

I ran into a video review of a spiffy little kit for retrofit-wiring a home with high-speed Ethernet, invisibly. Critically, it was a fiber optic setup, and the cable is a thread-thin nylon-coated cable that’s barely visible at ten feet, and easily glued into the corner of a room to be truly vanishing.

The kit was $250, but as is my way I immediately headed to everyone’s favorite cheap Chinese vendor of remarkably good but occasionally shady parts. Here’s what I found, after some research.

Fiber optic cable has been getting better and better. G.657.A2 cable can tolerate some relatively tight bends (still not amazing, but good enough for my purposes), and if you want to hunt around or buy the pricey kit, you can get G.657.B3 cables, which can tolerate 5 mm radius bends.

There’s a fair bit of equipment out there that supports 1 gigabit and 2.5 gigabit transceivers, and increasingly a lot that supports 10 gigabit.

I opted to go cheap and get a pair of gigabit transceivers (billed as 1.25 gigabits, but there’s encoding overhead on the line; the reality is 1).

There are several kinds of cable and transceiver. Old school used multi-mode fiber, with a pair of fibers together, one to send and one to receive. When I started my career, if you wanted to go distance, you needed a single-mode transceiver, which was hundreds of thousands of dollars and involved expensive lasers. A single single-mode fiber can handle several wavelengths of light at the same time, so you can buy a matched pair of transceivers, one sending on 1310 and receiving on 1550 nm wavelengths, and vice versa. If you’re patient this will set you back a mere $8.

Then you’ll need some cable to plug into it. I didn’t find any unsheathed G.657.B3 cable (there is a bit of white-sheathed out there, but if you want it invisible you have to make it yourself or order it elsewhere.) I opted for the less-flexible but still sufficient G.657.A2-spec cable and … that’s it?!

If you don’t have a router that can handle SFP transceivers, like my BananaPi R3 kits, you’ll need media converters so you get regular old RJ-45 1000BASE-T on the other side.

I haven’t tried them but media converters are not too expensive if you shop around.

Words of warning:

  • Don’t just get a 40km or 80km transceiver when you have 10 meters of fiber to light. The optics are tuned for brightness and you don’t want to have to buy an attenuator to make it work. Get a 5km or 100m model. The tolerances are wide, but not that wide.
  • There are several different cable end standards: SC/UPC, SC/APC, LC, FC/PC, FC/APC type N, and FC/APC type R. Mine are SC/UPC (blue connector), which are widely supported. SC/APC (green rather than blue connector) is better, but less supported. Also it doesn’t matter for home networking. LC is a smaller connector, and often used in situations with one channel per fiber. FC is older, and much less supported.
  • Make sure you have a pair of transceivers, not two of the same end. Don’t just select quantity 2.
  • You may need to turn off autonegotiation on the Ethernet port. I did. There’s only one option anyway with that transceiver, so there’s no point.
  • Single mode and multi mode fiber are different and not interchangeable. Connectors clarify that though.

The AI hater's guide to code with LLMs (The Overview)

Introduction

This is the post I don’t think many people expected me to write. I have (rightly!) surrounded myself with people who range from generally uncomfortable with to somewhat hostile toward “AI”, mostly for good reasons, though I’ll get into the many caveats on that as I go.

As activists mitigating the harms of “AI”, we need to be well informed, and we need to understand what the specific harms are. Treating it with a hands-clean purist mindset will be extremely difficult and as activism, more alienating than effective. These are genuinely useful tools, and pretending they aren’t will not in fact win many hearts and minds.

This post is going to be very long because, in addition to technical background, I’m touching on social issues, discourse norms, the context of a culture of rising technocracy and fascism funded by venture capital, and the erosion of our information systems and cultural norms all at once. I can’t get into it all here, but I am not staying away from any of it on purpose.

Overall, I still believe that LLMs are a net negative for humanity, that the destruction of our infosphere is going to have generational consequences, and that if the whole thing disappeared from the face of the earth tomorrow, I wouldn’t be sad. The damage would still be out there, but the cheap bullshit pervading everything would at least drop back to human content-mill scale. Not to say that that was good before LLMs came along and made it this bad, but it was better.

That said, that’s not going to happen, and the amount of effort required to make it happen would be much better spent on organizing labor and climate action. The AI industry may collapse like a house of cards. I think it somewhat likely, considering the amount of financial trickery these companies are using. But as someone I know put it: we’re not just going to forget that computers can write code now. We aren’t.

I want you to think about all of this with an intensely skeptical mind. Not hostile, mind you, but skeptical. Every claim someone makes may well be checkable. You can check! I recommend you do so. My math in this essay will be rough back-of-the-envelope calculation, but I think that is appropriate given the tendency of the costs of technology to change by orders of magnitude, and situationally for things to vary by at least a factor of two.

And since we’re both operating in the domain of things not long ago considered science fiction, and because the leadership of AI companies tend to be filled with people with a love of science fiction, many of whom won’t hesitate to, as is said, create the Torment Nexus from the popular science fiction novel Don’t Create The Torment Nexus, I suggest one story to read and keep in mind: Marshall Brain’s “Manna – Two Views of Humanity’s Future”.

TL;DR

  • There are open models and closed; good code work needs models that require very high-end hardware to run, at least in part.
  • Chinese models are quite good, and structured differently as companies.
  • Don’t bother running things at home to write code unless you’re a weird offline-first free-software zealot. I kind of am, and even I see the futility with the hardware I have on hand.
  • Nobody agrees on the right way to do things.
  • Everyone is selling something. Usually a grand vision handwaving the hard and bad parts.
  • I’ll write more about how to actually use the tools in another segment.
  • A lot of the people writing about this stuff are either executives who want to do layoffs, or now-rich people who made it big in some company’s IPO. Take what they say with the grain of salt you’d use for someone insulated by money and who can have free time relatively easily. They are absolutely handwaving over impacts they themselves will not experience.

A note on terms

I am writing this with as much verbal precision as I can muster. I loathe terms like “Vibe Code”, and in general I am not jumping on any marketing waves and hype trains. I’m being specifically conservative in the words I use. I say LLM, not “AI”, when talking about the text generation models at the heart of most of the “AI” explosion. I’ll prefer technical terms to marketing buzzwords the whole way through, even at the cost of being awkward and definitely a little stodgy. Useful precision beats vacuous true statements every time, and the difference now very much matters.

The Models

There are a zillion models out there. Generally the latest and greatest models by the most aggressive companies are called “frontier” models, and they are quite capable. The specific sizes and architectures are somewhat treated as trade secrets, at least among the American companies, so things like power required to operate them and the kind of equipment required is the sort of things analysts in the tech press breathe raggedly over.

The American frontier models include:

  • Anthropic’s “Claude Opus”
  • OpenAI’s GPT-5.2
  • Google Gemini 3 Pro
  • something racist from xAI called Grok.

The frontier models are a moving target as they’re always the most sophisticated things each company can put forth as a product, and quite often they’re very expensive to run. Most of the companies have tools that cleverly choose cheap models for easy things and the expensive models for difficult things. Remember this when evaluating anything resembling a benchmark: it’s an easy place to play sleight of hand.

When you use a frontier model company’s products, most of the time you interact with a mix of models. This is usually a somewhat cheaper-to-run version of the frontier model as the main mode, sometimes offering the true best model as an option or invoking it occasionally, with the whole thing hidden behind a façade that makes it all look the same. Version numbers often resemble cell phone marketing, with a race to have bigger numbers, and “X” and “v” in places to make it seem exciting. There is no linear progression nor comparison of any of the numbers in the names of models or products.

I largely have no interest in interacting with the American frontier model companies, as their approach is somewhat to dominate the industry and burn the world doing it. Anthropic is certainly the best of the bunch but I really don’t want to play their games.

I do not know this for sure, but I expect these models run into the terabytes of weights, more than a trillion parameters, plus they are products with a lot of attached software — tools they can invoke, memory and databases and user profiles fed into the system.

Behind them are the large models from other AI companies, largely Chinese, producing research models that they and others operate as services, and often they are released openly (called “open weights models”). Additionally some of the frontier model companies will release research models for various purposes. All core AI companies pretty much style themselves as research organizations first, and product companies second. Note that nearly every AI company calls its best model a frontier model, whether it fits with the above or not.

Chinese companies and therefore models often have a drive for efficiency that the American ones do not. They are not the same kind of market-dominating monopolist-oriented sorts that VC-funded American companies are. They aren’t as capable, but they do more with less. They’re very pragmatic in their approach compared to the science fiction fueled leadership of American AI companies. These models run in the hundreds of gigabytes and have hundreds of billions of parameters, though most can be tweaked to run some parts in a GPU and the rest on a CPU in main memory, if slowly. They can run on regular PC hardware, if extremely high end hardware, and distillations and quantizations of these models, while they lose some fidelity, fit on even more approachable hardware. Still larger than most people own, but these are not strictly datacenter-only beasts.

Large, capable open models (Mostly Chinese) include:

  • z.AI’s GLM-4.7 and GLM-5
  • Kimi K2.5
  • MiniMax M2.1
  • Deepseek-V3.2
  • Alibaba’s Qwen3-Max
  • Mistral Large 3
  • Trinity Large

Mistral Large 3 comes out of Europe. Trinity comes out of the US, but has a less “win the AI race” mindset. There’s a lot of superpower “We need our own sovereign solution” going on. China, the US and Europe are all making sure they have a slice of the AI pie.

I’m sure there’s more — the field is ever changing, and information about the models from Chinese companies percolates slowly compared to the American frontier models.

Behind these models are specialized smaller models, often sort-of good for code writing tasks if one isn’t challenging them, but I actually think this is where the line of usefulness is drawn.

Medium-small coding models include:

  • Qwen2.5-Coder
  • GPT-OSS 120b
  • Mistral’s Codestral
  • GPT-4.7-Flash
  • Claude Haiku
  • Gemini 2.5 Coder
  • Smaller versions of Qwen3
  • Smaller versions of many other models

There’s also some much smaller models that will run on large gaming GPUs. I don’t think they’re of much use: they’re very attractive toys that people can get to do some truly impressive things, but I don’t think they’re all that. They are, however, about the capability of what kneejerk AI-haters expect: error-prone lossy toys that if anyone called “the future”, I’d laugh in their face or spit at their feet. Notice how far down the list this is.

The Economics

LLMs are expensive pieces of software to run. Full stop: anything with broad utility requires a GPU greater than most high-end gaming PCs, and quite a lot of RAM. I am setting a high bar here for utility, because AI boosters tend to have a frustrating way of equivocating, showing low numbers for costs when it suits them, and high ones for performance, despite the two not coming from the same models. There are domain-specific tasks and models that can work with a small GPU or even Raspberry Pi levels of computation, but for general purpose “reasoning” tasks and coding specifically, right now in 2026, with current model efficiencies, and with current hardware, if you want to use LLMs for writing software, you will be throwing a lot of computing power at it. A $5000 budget would barely suffice to run something like gpt-oss 120b (OpenAI’s open model that is okay at code-writing tasks). Additionally, if you kept the model busy 100% of the time, you might be talking $50-$200 in electricity per month, depending on local prices.
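
To make the electricity figure concrete (the wattage and the price band here are my assumptions for illustration, not measurements): a box drawing roughly a kilowatt under load, run flat out all month, lands right in that range.

```python
# Back-of-envelope electricity cost for a home LLM box at 100% duty cycle.
# The 1 kW draw and the price band are assumptions for illustration.
draw_kw = 1.0                       # high-end GPU box under constant load
hours_per_month = 24 * 30           # flat out, all month

for price_per_kwh in (0.10, 0.30):  # cheap vs. expensive local rates, $/kWh
    monthly_cost = draw_kw * hours_per_month * price_per_kwh
    print(f"${monthly_cost:.0f}/month at ${price_per_kwh}/kWh")
# roughly $72 to $216, bracketing the $50-$200 figure above
```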

If you spent $15,000 and triple the electricity you could run something like GLM-4.7 at a really good pace.

Water cooling for data centers is probably the most talked about environmental cost, but I think it’s actually a distraction most of the time. Dear god why do people build data centers in Arizona, that’s a travesty, but also that’s a specific decision made by specific people with names and addresses who should be protested specifically.

Datacenter growth at the cost of people driving up electricity demand is a big problem, and we need to get back on the solar train as fast as possible.

This is not inexpensive software to run. However, it’s not an unfathomable amount of power.

Training models is wildly expensive, but it amortizes. There are in fact difficult economic conversations we need to be having here, but it’s all obscured by the fog of “what about the water?” and “AI will save us all and change everything!” that pervades the discourse. The framing of the arguments at large is fundamentally misleading, by basically everyone, pro- or anti-AI, and much more affiliative rhetoric than argumentative. We need to have the arguments, and actually look for and persuade people of the truths. They’re uncomfortable, so I fully understand why we so rarely do, but if we want to actually solve crises, we need to talk with actual truths in mind.

With prices of $200/month for “Max” plans, if one uses the tools well, a company would in fact be making a smart decision to get their developers using them. They are definitely priced below cost, probably by at least 3-5x. Maybe 10x. (Remember that a price shock will come at some point; don’t depend on the economics of these systems in existential ways for a business.)

Even at cost the math works out for a great many use cases.

Light plans are $20/month, and I think that for intermittent use, with good time sharing, that’s quite sustainable. In my experimentation I’m paying even less than that, and while I don’t think those prices will be sustained, I don’t think they’re impossible either.

Most of the big providers and almost all of the hosted open model providers have a pay-by-the-token API option. This is an unpackaged a-la-carte offering, in the style of cloud providers. They nickel-and-dime you. The pricing model, while transparent, is hard to calculate in advance. The usual rates are in prices per million input tokens and per million output tokens. Input tokens are cheaper, but interactions with tools will re-send them over and over so you get charged for them multiple times. Output tokens are more expensive but closer to one-time things. Expensive models can be $25 per million output tokens and $5 per million input tokens (Claude Opus 4.6). I expect this reflects a decent margin on the true costs, but I have not a ton to back this expectation up. Most open models run in the realm of $0.50-$3 per million input tokens and $1-$5 per million output tokens. Given that a lot of the open models are run by companies with no other business than running models, I expect these represent near true financial costs. There’s no other business nor investment to hide any complexity in.
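
The per-token model is easier to feel with a worked example. The rates below are the Opus-class numbers just quoted; the session sizes are invented. Notice how the re-sent input dominates a long agent session even though input tokens are the “cheap” ones.

```python
# Rough cost of one agentic coding session at the quoted Opus-class rates.
# The token counts are invented for illustration.
input_rate = 5.00 / 1_000_000    # $ per input token
output_rate = 25.00 / 1_000_000  # $ per output token

# An agent loop re-sends the growing context on every turn, so input
# tokens pile up far faster than output tokens do.
input_tokens = 2_000_000         # context re-sent across many tool calls
output_tokens = 100_000          # code and explanations actually generated

cost = input_tokens * input_rate + output_tokens * output_rate
print(f"${cost:.2f}")  # $12.50 -- $10 of it from the re-sent input
```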

The Tools

Most of the tools can talk to most of the models in some way. Usually each has a preferred model provider, and doing anything else will be a lesson in configuration files and API keys. Some more so than others.

Most of the tools are roughly as secure as running some curl | bash command. They kinda try to mitigate the damage that could happen, but not completely, and it’s a losing battle with fundamentally insecure techniques. Keep this in mind. There are ways to mitigate it (do everything in containers) but you will need to be quite competent with at least Docker to make that happen. I have not done that; I’m going for being a micromanaging busybody and not using anything resembling “YOLO mode”. I also back everything up and am not giving permission to write to remote repos, just local directories.

I know terminal-based tools more than IDEs, though I’ll touch on IDE-integrated things a bit. I haven’t used any web-based tools. I grew up in terminals and that’s kinda my jam.

  • Claude Code is widely acknowledged as best in class, has a relatively good permission model, and lots of tools hook into it. It’s the only tool Anthropic allows with their basic consumer subscriptions. If you want to use other tools with Claude, you have to pay by the token. Can use other models, but it’s a bit of a fight, and a lot of APIs don’t support Anthropic’s “Messages” API yet.
  • OpenAI Codex is OpenAI’s tooling. It’s got decent sandboxing, so that what the model suggests to run can’t escape and trash your system nearly so easily. It’s not perfect but it’s quite a bit better than the rest. It’s a bit of a fight to use other models.
  • OpenCode touts itself as open source, when in reality most stuff is. It’s a bit less “please use my company’s models” than most tools, and it’s the tool I’ve had the best luck with. It has two modes — Build and Plan — and using them both is definitely a key to using the tool well. Plan mode creates documents and written plans. Build does everything else and actually changes files on disk.
  • Kilo Code is both a plugin for VS Code, and a tool in the terminal. It has not just two modes but five, and more can be customized: “Code”, “Architect”, “Ask”, “Debug”, and “Orchestrator”. Orchestrator mode is interesting in that it uses one stream of processing with one set of prompts to evaluate the output of other modes. This should allow more complex tasks without failing, because there’s a level of oversight. I’ve not used this yet, but I will be experimenting more. Its permission model is pretty laughable, but at least it starts out asking you if it can run commands instead of just doing it.
  • Charmbracelet Crush is aesthetically cute but also infuriating, and it’s very insistent on advertising itself in commit messages. I’ve not yet seen if I can make it stop, but it did make me switch back to OpenCode.
  • Cursor — App and terminal tool. Requires an account and using their models at least in part, though you can bring your own key to use models through other services.
  • Cline — Requires an account. IDE plugins and terminal tools.
  • TRAE — IDE with orchestration features. Intended to let it run at tasks autonomously. I’ve not used it.
  • Factory Droid. Requires an account. Can bring your own key.
  • Zed. IDE editor, with support for a lot of providers and models.

TL;DR

I like OpenCode; Kilo Code and Charmbracelet Crush are runners-up. The (textual) user interface is decent in all three, and it’s not loud, it’s not fancy, but it’s pretty capable. At some point I’ll try orchestration and then maybe Kilo Code will win the day. You’re not stuck with just one tool either.

Antagonistic Structures and The Blurry JPEG of the Internet

At its core, you can think of LLMs as extremely tight lossy data compression. The idea that it is a “blurry jpeg of the internet” is not wrong in kind, though in scope it understates it. Data compression is essentially predicting what’s next, and that’s exactly what LLMs do. Very different specifics, but in the end, small bits of stuff go in, large outputs come out. It’s also “fancy autocomplete”, but that too undersells it because when you apply antagonistic independent chains of thought on top, you get some much more useful emergent behavior.
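
The compression-is-prediction point can be made with a toy (this sketch is mine, for illustration; it has nothing to do with actual LLM internals): per Shannon, a symbol you predict with probability p costs about -log2(p) bits to encode, so a better next-character predictor literally is a better compressor.

```python
import math
from collections import Counter

def bits_to_encode(text: str, model: dict[str, float]) -> float:
    """Total bits to encode `text`, charging -log2(p) per character."""
    return sum(-math.log2(model[ch]) for ch in text)

text = "the theme of the thesis"

# A model that knows nothing: every seen character is equally likely.
uniform = {ch: 1 / len(set(text)) for ch in set(text)}

# A model that has "learned" the character statistics of the text.
counts = Counter(text)
fitted = {ch: n / len(text) for ch, n in counts.items()}

print(f"uniform model: {bits_to_encode(text, uniform):.1f} bits")
print(f"fitted model:  {bits_to_encode(text, fitted):.1f} bits")
# The fitted model predicts better, so the same text costs fewer bits:
# prediction quality and compression ratio are the same quantity.
```

Scale the same idea up from character frequencies to a trillion-parameter next-token predictor and you get the “blurry JPEG” framing.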

A pattern that you have to internalize is that while lots of these tools and models are sloppy and error-prone, anything you can do to antagonize that into being better will be helpful. This is the thing where I show you how LLM code tools can be a boon to an engineer who wants to do things well. Suddenly, we have a clear technical reason to document everything, to use a clear type system, to clarify things with schemas and plans, to communicate technical direction before we’re in the weeds of editing code. All the things that developers are structurally pushed to do less of, even though they’re always a net win, are rewarded.

You will want your LLM-aided code to be heavily tested. You will want data formats fully described. You will want eery library you use to have accurate documentation. You will use it. Your tools will use it.

You will want linters. You will want formatters. Type systems help.

This pattern goes deep, too. Things like Kilo Code’s “Orchestrator” mode and some of Claude Code’s features work as antagonistic checks on other models. When one model says “I created the code and all the tests pass” by deleting all the failing tests, the other model which is instructed to be critical will say “no, put that back, try again”.

One of the big advances in models was ‘reasoning’ which is internally a similar thing: If you make a request, the model is no longer simply completing what you prompted, but instead having several internal chains of thought approaching it critically, and then when some threshold is met, continuing on completing from there. All the useful coding models are reasoning models. The model internally antagonizes itself until it produces something somewhat sensible. Repeat as needed to get good results.

Even then, with enough runtime, Claude will decide that the best path forward is to silence failing tests, turn off formatters, or put in comments saying //implement later for things that aren’t enforced. Writing code with these tools is very much a management task. It’s not managing people, but sometimes you will be tempted to think so.

The Conservative Pressure

So here’s the thing about LLMs. They’re really expensive to train.

There’s two phases: “pre-training” (which is really more building the raw model, it’s most of training), and “post-training” (tailoring a general model into one for certain kind of tasks).

Models learn things like ‘words’ and ‘grammar’ in pre-training, along with embedded, fuzzy knowledge of most things in their training set.

Post-training can sort-of add more knowledge, giving it a refresher course in what happened since it came out. There’s always a lag, too. It takes time to train models.

The thing is though that the models really do mostly know only about what they were trained on. Any newer information almost certainly comes from searches the model and tools together do, and stuffs into the context window of the current session, but it doesn’t know anything, really.

The hottest new web framework of 2027 will not in fact be new, because the models don’t know about it and won’t write code for it.

Technology you can invent from first principles will work fine. Technology that existed and was popular in 2025 will be pretty solid. With something novel or niche, code generation will go off the rails much more easily without a lot of tooling.

This is, in the case of frontend frameworks, maybe a positive development, in that the treadmill of new frameworks is a long-hated feature of a difficult-to-simplify problem space for building things that real people touch.

In general, however, it will be a force for conservatism in technology. Expect everything to be written in boring ways, as broken as they ever were in 2025, for a while here.

They’re Making Bullets in Gastown

There’s a sliding scale of LLM tools, with chats on one end and full orchestration systems of independent streams of work being managed by yet more LLM streams of work at the other. The most infamous of these is Gastown, which is a vibe-coded slop-heap of “what if we add more layers of management, all LLM, and let it burn through tokens at a prodigious rate?”

Automating software development as a whole will look a lot like this - if employers want to actually replace developers, this is what they’ll do. With more corporate styles and less vibe coded “let’s turn it all to eleven” going on.

Steve’s point in the Gastown intro is that most people aren’t ready for Gastown and it may eat their lunch, steal their baby and empty their bank account. This is true. Few of us are used to dealing with corporate amounts of money and effort, and while we think a lot about the human management of it, we don’t usually try to make it into a money-burning code-printer. I think there’s a lot of danger for our whole world here. Unfettering business has never yielded unambiguously good results.

Other tools like this are coming. While I was writing this, multi-claude was released, and there’s more too: Shipyard, Supacode, everyone excited about replacing people is building tools to burn more tokens faster with less human review. They’re writing breathless articles and hand-waving about the downsides (or assuming they can throw more LLMs at it to fix problems.)

I personally want little part in this.

Somewhere much further down the scale of automation are things like my friend David’s claude-reliability plugin, which is a pile of hacks to automate Claude Code to keep going when it stops for stupid reasons. Claude is trained on real development work, and “I’ll do it later” is entirely within its training set. It really does stop and put todos on the hard parts. A whack upside the head and telling it to keep going sure helps make it make software that sucks less.
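A watchdog like this doesn’t need to be clever. Here’s a rough sketch of the idea (the excuse patterns and the nudge text are my own invention, not what claude-reliability actually ships): watch the tail of the transcript for “I’ll do it later” excuses, and answer with a prod instead of letting the session end.

```python
import re

# Hypothetical stop-excuse patterns; a real plugin would tune these
# against actual transcripts of the agent giving up early.
STOP_EXCUSES = [
    r"I'll (do|implement|handle) (it|this|that) later",
    r"TODO:?\s*implement",
    r"left as an exercise",
]

def needs_nudge(transcript_tail: str) -> bool:
    """Did the agent just stop with an excuse instead of finishing?"""
    return any(re.search(p, transcript_tail, re.IGNORECASE)
               for p in STOP_EXCUSES)

def next_prompt(transcript_tail: str):
    """Return a prod to send back to the agent, or None to let it stop."""
    if needs_nudge(transcript_tail):
        return "No stopping. Implement the TODOs now and run the tests."
    return None
```

The whole trick is just pattern-matching the excuses and feeding back a prompt; the whack upside the head, automated.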

Automating the automation is always going to be a little bit of what’s going on. Just hopefully with some controls and not connecting it to a money-funnel and saying full speed ahead on a gonzo clown car of violence.

There’s a lot of this sort of thing.

The Labor Issue

The labor left has had its sights on AI for a while as the obvious parallel to the steam-looms that reshaped millwork from home craft to extractive industry. We laud the Luddites, who, contrary to popular notions about them, were not anti-technology per se: they saw the extractive nature of businesses using these machines, turning a craft one might make a small profit at into a job where people get used up, mind and body, and exhausted. They destroyed equipment and tried to make a point. In the end they had only moderate success, though they and the rest of the labor movement won us such concepts as “the weekend” and “the 8-hour day”.

Even the guy who made Gastown sees how extractive businesses can - or even must! - be. Maybe especially that guy. We’re starting to see just how fast we can get the evil in unfettered business, capital as wannabe monopolists, to show itself.

Ethan Marcotte knows what’s up: we need to unionize. That’s one of the only ways out of this mess. We, collectively, have the power. But only collectively. We don’t have to become the protectionist unions of old, but we need to start saying “hey no, we’re not doing that” en masse for the parts that bring harms. We need to say “over my dead body” when someone wants to run roughshod over things like justice, equality, and not being a bongo-playing extractive douchecanoe. We’ve needed to unionize for a long time now, and not to keep wages up but because we’re at the tip of a lot of harms, and we need to stop them. The world does not have to devolve into gig work and widening inequality.

Coding with automated systems like this is intoxicating. It’s addictive, because it’s the lootbox effect. We don’t get addicted to rewards; we get addicted to potential rewards. Notice that gamblers aren’t actually motivated by having won. They’re motivated by maybe winning next time. It can lead us to the glassy-eyed stare with a bucket of quarters at a slot machine, and it can lead us to 2am “one more prompt, maybe it’ll work this time” in a hurry. I sure did writing webtty.

There’s Something About Art…

Software, while absolutely an art in many ways, is built on a huge commons of open work done by millions of volunteers. This is not unambiguously always good, but the structure of this makes the ethics of code generation more complex and nuanced than it is for image generation, writing generation, and video generation.

AI image generation and video generation can get absolutely fucked. It was already hard to make it as an artist because the value of art is extremely hard to capture. And we broke it. Fuck the entire fucking AI industry for this and I hope whoever decided to make it a product first can’t sleep soundly for the rest of their life. I hope every blog post with a vacuously related image with no actual meaning finds itself in bit rot with alacrity.

Decoding the Discourse

It’s helpful to know that the words used to describe “AI” systems are wildly inconsistent in how people use them. Here’s a bit of a glossary.

Agent:

  1. A separate instance of a model with a task assigned to it.
  2. A coding tool.
  3. A tool of some kind for a user to use but that can operate in the background in some way.
  4. A tool models can invoke.
  5. A service being marketed that uses AI internally.
  6. A tool that other agents can use.

Agentic: in some way related to AI.

Orchestration: Yo dawg I heard you liked AI in your AI, so I put AI in your AI so your AI can AI while you AI your AI.

Vibe Coding:

  1. Coding with LLMs.
  2. Using LLMs to write code without looking at or evaluating the results.

A coda

In the time it took to write this over a week or more, Claude Opus 4.5 gave way to Claude Opus 4.6. GLM-4.7 was surpassed by GLM-5 just today as I write this bit, but z.ai is now overloaded trying to bring it online and has no spare computing power. All my tools have had major updates this week. The pace of change is truly staggering. This is not a particularly good thing.

I may edit this article over time. No reason we can’t edit blog posts, you know. Information keeps changing with new data and context.

Now go out there and try your best to make the world better.

A fall-winter micro-calendar of breakfasts

Toast and Herbs Season

The bounty of summer is coming to an end, but the farmer’s market is still running. There are still herbs in the garden and a remaining bounty of hearty greens. It’s also still warm enough that you don’t have to heat the house to let bread rise, so it’s time to try some of the less holiday-oriented baking, like sourdough, or time to swing by your local bakery for a baguette. Toasted bread with butter or cream cheese, some herbs or a light salad, and an egg is perfect. Save the heels of the bread and anything that goes stale or dry. Cube it up and dry it out and use it later.

Apple Crisp Season

From the first bite of fall, not just a nibble, but the first time the cold reminds you it’s coming for real, it’s time to make apple crisp: healthy and full of fiber if you don’t load up the sugar too bad. Easy to make in large batches, you can have apple crisp for breakfast for weeks if you want. It’s good for the relatively busy season of school events and pre-holiday holidays.

Pumpkin Pie Season

As the apple selection dwindles, apple crisp gives way to pumpkin pie and custard: this is the decadent-feeling but surprisingly healthy breakfast. A pie is pretty easy to conjure up with a store-bought crust, and if you make it without one at all, you have a delightful custard. It fits right in with the North American holidays. Remember that pumpkin is a squash, and custard just means eggs and milk and maybe a bit of sugar. That’s a healthy breakfast! You can even make them savory if you want some variety.

Bread Pudding Season

Those bread ends you’ve saved for the past few months, cut up in cubes and baked with a custard, make a delightful breakfast. Those nutmeg and cinnamon flavors remind us of the soon-to-be-ending holiday season’s peak. Don’t forget Thanksgiving stuffing is also a bread pudding, and try some new variations. Cranberries, celery, onions, mushrooms. There are so many variations.

Porridge Season

In New England, after the winter holidays, the light is coming back but it’s not there yet. It’s still dark and cold but after the heavy food for a month or more, it’s time for something different. It’s time for the hot oatmeal porridges, maybe some cream of wheat or rice congee. Once the oatmeal is losing its appeal, go for a savory congee with bits of roast chicken, pork rib meat, or some medium-boiled egg. A bit of ginger and soy sauce for flavor, and topped with some green onion makes it an appealing dish rather than yet more beige mush.

Biscuits and Gravy Season

Spring is around the corner, we’re getting more active in the longer days but we still need energy to stay warm. The pantry is a bit bare: it’s been months since harvest time, and it’s easy to be down to just staples. Good thing good biscuits only need a few ingredients: self-rising flour or flour and baking powder, a bit of salt, some fat - butter or cream. White gravy too is just flour and milk with whatever flavor in it. Black pepper. Sausage. Serve it with eggs, and some meat on the side if you really need the fat and protein.

Procedurals are our culture trying to reconcile itself

My husband watches a lot of TV procedurals, and so I end up watching a fair number with him and I spend a lot of time thinking about the cultural meaning of them and the popularity they enjoy.

Some of them are naive fantasies, for sure, and the appeal of neatly-tied-up the-system-works bad-guy-is-put-away is well known and talked about, but I think there’s something more subtle going on in a lot of them.

In a bunch of ways they live at the center of the Overton window politically: they’re inherently very centrist shows, by virtue of seeking a wide audience, and they lean to a certain kind of conservatism in their structure – the predictable, good guys win nature of them, but fascinatingly they tend to tackle a fair number of somewhat progressive subjects and work them into these stories. SVU tackled themes around sexuality; FBI is extremely self-conscious about federal vs state power; The Wire is of course lauded for its nuanced and detailed takes around the war on drugs.

Writ large, they are a process of sense-making, integrating the very real problems and inequalities in the world into storylines and trying to make them make sense against a backdrop of ‘the system’. Legal and cop shows will sometimes (not often enough) delve into how unfair the public defender system is, how rigged against various groups the system is. Cop procedurals will often dig into the injustices of the system – not enough to show the system as being in the wrong very often, but trying to reconcile the various conflicting truths present in our culture. Even more so, accepting that TV shows will not themselves be perfectly consistent as different writers handle different episodes, we see a variety of takes on related subjects all pressed together.

Like cop procedurals, medical dramas spend even more time integrating social issues into the fabric of their shows – ER’s poor treatment of a transgender patient was flipped in a Grey’s Anatomy episode with a similar structure. The progress and reconciliation of the place of transgender people in society is at least reflected in these shows. Similarly, we see sense being made of gay rights, sex worker stigma, and teen pregnancy stigma over the course of medical shows. And most of them at some point deal with the complexity and brokenness of American insurance and health care systems.

Not to say all frames are progressive at all: “24” spent a great deal of time normalizing war crimes and heinous abuses in the name of expedience; FBI normalizes ubiquitous surveillance; SVU often repeated and reinforced a stereotype of women as uncomplicatedly vulnerable. These shows are all some variety of problematic in many ways, but the formula works for many reasons, and the popularity is undeniable.

In the 2020s, terrorists are back to being portrayed as brown and Middle Eastern, but not always: there’s a growing trend of showing white, traditionalist, racist and religious cults as the breeding ground of terrorism. Not enough to reject the dominant narrative, but at least willing to complicate it a bit. It’s still going to show us the National Enemy Du Jour: this week it’s Venezuelans, Pakistanis, Somalis, Yemenis and Afghans getting the short end, where it was Iraqis and Lebanese and Libyans and Cubans before, with China the amorphous and distant Bad Place, displacing Russia and the Soviet Union for that spot. The National Allies are usually conspicuously absent: Saudis will rarely if ever be mentioned, Israelis are too hot a topic for most shows to touch, and these shows have a strong tendency to avoid looking at anything we’re too mired in complexity about. Conflict makes good TV, but complex feelings less so. Characters like Omar Zidan in “FBI” are themselves representatives of marginalized groups, and this is always played for the tension; any given show can only bear a few characters dealing with this complexity, while other issues have to be wrapped up in one- or two-episode units before moving on.

I do wonder how much these shows entrench views of the status quo or normalize things that really should not be – certainly they fetishize universal surveillance, and they do reify a lot of tropes and stereotypes – but at the same time, they seem to be doing the work of processing the changes in our culture in ways that actually are tolerable to a conservative audience.

On watching Star Trek

I make no secret of the fact that my favorite TV franchise is Star Trek, though I’ve written about it very little. I spent the first bit of my life in suburban South Denver, which is relevant for having one of the Paramount affiliate independent stations that aired Star Trek. They managed a perfect schedule, too, by putting it in the time slot after ABC aired M*A*S*H, so my sister and I would watch them back to back. My father is the spitting image of Alan Alda, so it was easy to like Hawkeye and get into the show, but watching them back to back was really the beginning of my political awakening, as far back as my tween years.

Both shows have such a strong and persistent sentiment woven through them that people should do the right thing, and organizations make that hard. They go about it in extremely different ways – MASH showing the failures of the system and just how hopeless it can be, which was then a commentary on the US-Vietnam war, even though the show’s setting is the US-Korean war, and Star Trek showing a utopian template of a system that works in a humanist, benevolent way. Both dance close to engaging with communism and socialism in real ways, but both shy away from it, partly out of inconsistent writing, and partly due to needing to not piss off the network executives. Despite this, Star Trek serves as the template for “fully automated luxury space communism”, even though it never really analyzes it deeply, and MASH will always have a place in my heart for giving it the context I started with.

I’ve long said that most science fiction can be thought of as fan fiction for sciences. Stargate is archaeology fan fiction. The Expanse is military science and political science fan fiction. Star Trek is both anthropology and political science fan fiction, as well as in an extremely tangential way, military science fan fiction. Arrival is linguistics fan fiction. Annihilation (and Star Trek: Discovery) are mycology fan fiction among other things.

Star Trek has also always had this strangely bifurcated fanbase: People, mostly men of the “reads books about World War Two” sort, who like military science fiction and some of whom even manage to miss the social critique in the series, and people, often women, who are here for competent people doing what’s right against a backdrop of anthropology, sociology and politics. Star Trek’s early writers had many women among them, Dorothy Fontana among the most influential, and I think it has shaped the series for the better in ways that no other franchise has managed. Not to say the franchise is unflawed, but it’s mostly worthy with its flaws instead of destroyed by them.

A while back I got my husband introduced to my love of Star Trek, finally, after a couple of false starts. Some of it required just skipping the bad episodes. Episode guides were very useful.

I can describe why I like it while I give my own suggestions. While I’ll leave delving into the episode level to the viewing guides, I’m going to touch on what’s good and bad in each series as I go, and I’ll give a few spoilers, but since the show doesn’t structurally rely on surprise much, I think that’s fine. There’s a little mystery in Enterprise, but it’s also not the part of the franchise I am going to get into much depth about. Even looking at the episode list tells you whether the Big Bad gets defeated or not, so there’s really little to spoil. Enjoying this franchise is about seeing how the pieces fit together and, in my opinion, how it connects with what’s going on in our own world here and now. Science fiction is always about the people writing it.


There are three major eras of Star Trek production: the original series and its followups, The Next Generation and the 90s syndicated spin-offs, and “New Trek” which is designed for streaming.

There are four time periods that the shows are set in. Chronologically, that’s the prequel series “Enterprise”, set over a hundred years before the original series in the 2150s; the original series era, set in the 2260s; The Next Generation era, set a hundred years later in the 2360s; and the far future, starting in 3188.


Mostly skip the original series (“TOS”). Get a feel for where it all started, but know it’s weird 1960s and 1970s low-budget space silliness: TV in that era was very much still being derived from theatrical performance, and the studio system’s legacy played out in television far longer than it did in movies.

The characters and setting set a lot of the tone for the future series, but it is full of gonzo plots involving Greek gods being aliens, psionic powers at the edge of the galaxy, and planets that are somehow exactly like one specific time and place on Earth for expository reasons. It features TV’s first interracial kiss, but was bowdlerized by the network. White actors in something like blackface play Klingons, something that carries through to other series but feels particularly egregious with the low budget of the original series. And there are so many mini-skirts and bare navels.

Knowing the characters is useful because other shows make callbacks to them, but if you aren’t feeling any particular episode, skip it. Except The Trouble With Tribbles. Kirk, Spock, and McCoy form an interesting trio, a fictional trope that carries over into other Star Trek and other fiction in general. Delegative but commanding leader, analytical scientist, and hotheaded doctor are a dynamic setup for stories.

The politics are so incredibly incoherent episode by episode: sometimes the Federation uses money and the crew gets paid; sometimes they don’t. Sometimes it’s a more clearly military organization, sometimes it’s not. There’s an ongoing theme of colonization without colonialism, but a lot of stories draw on colonial conflicts for their plot points.


The Animated Series (“TAS”) is mostly even sillier than the original series, being animated. It introduces some species that would have been too hard to do in film on the budget the original series had. Entirely skippable, but if you like the original series and like animation, watch it. Also, cels from the animated series turned into widely available and valuable collectors’ items, so there’s a lot of media generated from them.

The movies are mostly skippable. Some of them are fun. Some are not. There were six and a half movies starring the original cast and crew; none of them are terribly important to the franchise. The even numbered ones tend to be better than the odd numbered ones. I have a soft spot for the first film, which developed the theme music and is very pretty in that early 1980s high budget scifi way. If you liked 2001, watch Star Trek: The Motion Picture. There’s also a visual callback to it in Discovery that I think is beautiful.

Watch The Wrath of Khan (II) and The Voyage Home (IV, “the one with the whales”). Read a spoiler for Generations; it’s not a very good film. The Undiscovered Country (VI) is not amazing either, but it starts the series toward grappling with the political effects of a neighboring empire collapsing.


Since I’m talking about how unimportant the movies are to the franchise: there’s a spin-off alternate-universe Star Trek series of movies, which is widely ignored in canon, and is “What if J. J. Abrams did Star Trek? Would it turn into Star Wars?” The answer is yes, yes it will. Star Wars has always benefitted from its open universe, always having something new and a new weird place characters can go, and lots of action sequences. It’s a Space Western. Star Trek is not: it’s always benefitted from the politics and interplay of races and empires and factions, which need a more closed world where you don’t drop plot threads all over the place and never pick them up.


Star Trek really hits its stride in The Next Generation (“TNG”), and there are some excellent story lines and episodes. Maxistentialism’s guide hits the high notes for sure. Picard, Data and Riker form the same trio archetypes as the original series in some ways, but they shake up the formula. Troi is deeply underwritten, as all the women are initially. Gene Roddenberry’s sexism, and worse, Rick Berman’s, leave their marks on the show.

There is a series-spanning theme of “is humanity ready to exist among the stars?”, with a lot of different aspects to it. Are we morally ready? Are we prepared for the disaster that could be out there? Have we grown enough to get over petty squabbles? What about when our neighbors haven’t? What if personal resource constraints are eased? What if societally, there are still limits? What does that do? Are we fit to judge others? How does our sense of right and wrong fit into the universe?

Add to that the franchise-spanning concern with what it means to be human – an ongoing concern that defines Spock in TOS, Data in TNG, and later Burnham and Georgiou and Spock in DISCO – and you have some really good storylines develop.


Deep Space Nine (“DS9”) is widely regarded as the best of the franchise for good reason. It has longer plot arcs, excellent writing, and fewer dud episodes (and even some of the duds are fun, just not meaningful). All around good watching, and Maxistentialism’s list is solid.

Deep Space Nine was also, arguably, stolen from the plot of Babylon 5. The similarities are more than a little coincidental: the script for B5 was pitched to Paramount, who turned it down and then went and did DS9 right after. Hmm. Babylon 5 is just as good, if more cheaply produced, as DS9; watch them both. It’s nice to see two complete riffs on a core idea. Excellent science fiction right there.

It delves into the edges of the Federation, and looks deeply at what recovery from fascist colonialism looks like, and it engages with religion in a way that Star Trek normally does not. It’s set in a space station over a newly-freed planet, and everything is broken and kinda terrible. Starfleet is called in to run the station since they have expertise, but it’s nominally not part of the Federation, exactly. Membership negotiations are pending.

Then a wormhole – a gateway to new resources in another part of the galaxy – is discovered, suddenly putting Bajor and Deep Space Nine on the map. The whole series unfolds from that, and there are a bunch of mirrors shined on relations on Earth here: relations among species, the nature of colonialism especially when there is a world to be developed and that wants development to some degree adds real complexity.

Rick Berman’s involvement shows in how two of the ongoing characters manage to have an extremely gay relationship on screen without ever acknowledging it as such, thanks to his homophobia. There is a lesbian kiss that’s well portrayed, and the behind-the-scenes talk is that it was actually very hard to make happen, and director Avery Brooks (who also plays Captain Sisko) spent real political capital to make it happen.


Star Trek: Enterprise (“ENT”) is mostly lousy. Rick Berman left deeper marks on the series and it suffered for it deeply. It’s got some clever retcons and contextualizations of things that happened in other series, and some nice ideas about what the path from “horrible wars on Earth” to “spacefaring civilization” looks like, but they’re not particularly deep. The entire series is skippable without harming understanding of the franchise’s world. The above episode guide hits the good parts, but it’s still among the Worst Trek. Transporters are experimental technology, which is a nice hack to keep them from being the device you have to avoid solving all problems with. Good retcon on that front.


Star Trek: Voyager (“VOY”) is a continuation of the aesthetic of The Next Generation, this time with a woman in command. Kate Mulgrew does a great job at being the steely starship captain, and the premise is that the ship is flung into the far and unexplored (by humans) parts of the galaxy and has to get home, a trip that might take the rest of their lives if they don’t find a shortcut.

It was intended to give new life to a now crowded universe, free of the politics that play into the rest of the franchise’s universe, and able to explore and tell those one-off stories again. It worked well enough, but I don’t think it’s the strongest show of the era. I have moral issues with some of the framing of Captain Janeway’s actions as unambiguously right when they really weren’t, and there’s a lot of justifying doing terrible things to others because it seems necessary.

It’s quite watchable, and a good watching guide would help. There are lots of good tie-ins to events in other series, and if you are a completionist you’ll want all of that.


Then there’s “new Trek.” After the hiatus with ENT, and after Gene Roddenberry’s death, the franchise might well have come to an end, but of course we can’t have anything without monetized nostalgia, so the franchise engines started up again. Paramount started cracking down on copyright over their characters, got kinda nasty with fan writers and fan filmmakers, and has in general somewhat “Star Warsified” Star Trek. For the core of the franchise itself this has not always been terrible, and I think some of the new shows are the best of the bunch.

New Trek is designed for streaming, not syndication, so skipping episodes is much harder.


Star Trek: Discovery (“DIS” or “DISCO”) is the first new Trek. It got hit with the pandemic, so it has some really uneven tones, since actors couldn’t be near each other in groups for a significant part of its run without very expensive and annoying precautions. It held together better on a rewatch for me than I’d remembered, but its first and second seasons are not amazing. It is a pretty continuous story, though, so it’s hard to skip around. The show is weirdly paced for other reasons, too, because they introduced an instantaneous travel method, which is silly. It presents the same story problems that the transporters do, in that you have to contrive a reason for it not to work over and over again to actually tell a story, because if you don’t, your magic device can solve everything.

The first two seasons are the most compromised, the second two are the most interesting. The characters are fantastic, even if the writing is bonkers sometimes. They were trying something new, and if you watched the original series, you know bonkers is a thing that they tried there too.

  • Season 1: they redesigned the Klingons, incoherently, and you could do well to pretend they’re a previously unshown race. Some truly bonkers medical science that makes no sense, but just know that Discovery has even more handwavium than Star Trek usually does and roll with it.
  • Season 2: The big bad is Section 31, the secret intelligence division of Starfleet, gone rogue. I hate Section 31 and it’s the worst concept the writers ever developed. In DS9 it was treated as unambiguously bad. In DISCO, it’s not, and that’s a deep flaw in the show.

Picard (“PIC”): A retrospective on Captain Picard from TNG that leans into all of my least favorite things about how the character was written. It’s well acted, but the plots are not strong, some of the things they introduced are just not great for the world, and they didn’t even make good commentary on our world. This series is very much about the incoherence of our time in our world right now, and it has no better answers or stories than we do right now. It’s stupid because humanity is stupid. There’s some very nice fan service in the show, though. You could skip it. The key pieces: the synthetics rebellion is a weak commentary on colonialism and slavery, but not in an interesting way. They leaned into the Romulan homeworld having been blown up, and it’s a weak commentary on fallen empires and maybe on climate-change-induced migration and homelessness. But it’s not good commentary. Romulans have become Space Elves.


Back to DISCO. The second half of the show’s run is overall excellent: lots of fun new technology, lots of fun new visuals, and lots of fun new constraints, thanks to events that have transpired in-world between the end of the previous era and this new look at the far future.

  • Season 3: Flung into the future (3188) and looking at a Federation that has crumbled. Not a look at our own world right now. Nuh-uh. Certainly not. Some bad plot decisions. A solid season, and really showing what New Trek is. Some of what happened in PIC is now actually being dealt with, 600 years later. Now that is how it’s supposed to be done. The overarching theme of questioning and rebuilding institutions that have been corrupted or destroyed, and recovering the good parts, rings very strong here.
  • Season 4: Meet new aliens. Good callback to the visuals of the original series. More bad plot decisions. Please stop blowing up my favorite planets as a plot device. It was stupid in the movies, and it’s stupid here. You can have smaller stakes than that if a character cares about the outcome. Come ON. All in all a solid season, and the theme continues.
  • Season 5: Calls back to some of the 1960s science fiction concepts about gods and time and civilizations of unimaginably long ago or even unimaginably outside our universe, all while continuing those themes. It’s good, and a good place for the show to end.

Strange New Worlds (“SNW”): this is the crown jewel of Star Trek if you ask me. Anson Mount was an amazing choice for captain. This is the five year mission before the original series, with the previous captain. The storylines are a little more episodic, but connected nicely in an arc. Same quality of storytelling as DS9. Just as good character dynamics as Discovery. Production values are off the charts. The design of the Enterprise actually feels like the original series only with a high budget. A truly amazing feat. Watch it all. It’s one of the best shows on the net.

One thing you have to know about both DISCO and SNW is that it’s gay. It’s really really gay. It’s not particularly explicit about it, but this is a crew full of ‘moes. There’s an explicitly trans character, and while I don’t love the plot about that, it’s nice to see anyway. There’s a gay couple, who end up being space dads to a younger queer prodigy. The way it’s acted, you’d have a hard time convincing me any character is particularly straight. Most of the time they don’t make a big deal of it and it pretty well works.


The one place where my opinions do not align with anyone else’s is that I don’t like the animated Lower Decks (“LD”). I think it is cynical, a little boring, and rooted in cheap fan service. There are some high points and they tell a few good stories, and the crossover with SNW is fun, even if it’s not good. It’s The Office and Futurama only it’s Star Trek. The whole show is skippable. Fun if you like the fan service, but skippable.

Every other Trekkie I know loves it, but I just can’t.


And then there’s the one everyone ignores and shouldn’t: Star Trek: Prodigy (“Prodigy”). This is a kids’ show, like Star Wars: Rebels. It starts off dumbed down. The first four episodes are Kid TV. Good kid TV, but the sort of thing parents have to sit through, not enjoy.

But then.

The rest of the show is very Star Trek, and it is some of the finest writing. It’s still a kids’ show, so it keeps a light tone and solves a few things too easily, but it hits a bunch of great character notes, fills in the universe a little, introduces a new species coherently, and talks about the dangers of cultures coming into contact with each other, unintended consequences, regret, shame, destiny. It’s excellent and well worth watching. It’s a little preachy in places, and it’s not particularly subtle, but it’s good.

The show is set in the nearer part of the Delta Quadrant, where Voyager came near. You watched Voyager right? Captain Janeway is back, and this show is a great reason to have watched Voyager. It lands some great tie-ins to the plot of Voyager without compromising into mere fan service.


I don’t have a point to end this with other than “Star Trek is good and worth watching”, and it’s been a super meaningful part of my life since I was a tween. It’s a show with a lot of richness, and I like it a lot.

Sentences short and long

There is some discourse going around my corners of the Internet lately, originating from this LessWrong post about the lengths of sentences in written works of fiction decreasing over time, from nearly fifty words to a perhaps unfair but poignant twelve in recent publications such as the Harry Potter books. The bulk of the change happened before the 20th century, so this is not an internet phenomenon, and it does span the time in which our society transitioned from an oral to a written culture and saw the advent of silent reading.

I spent most of yesterday in bookstores with this discourse in mind, and it was immediately obvious that something was going on. I don’t know about the overall trend, but I can certainly trace something: either the style of writing has shifted over my lifetime, or there is a selection bias in which books get kept, or at least in what sort of readers keep books.

The sentence lengths were striking: fully two-thirds of the books were written almost entirely in simple sentences. Hardly a complex or compound sentence on most pages; here and there maybe a compound one, extremely rarely a complex one, never a complex-compound. How profound the effect is varies by genre, and the science fiction shelf is certainly one of the worst this way, but everything is very much written simply. Sentences begin with conjunctions on nearly every page instead of forming complex sentences, with ‘but’ and ‘and’ the only conjunctions used. ‘While’, I know, is derided by the ‘writing 101’ advice that litters the internet, but there’s not an until or unless or even an if in most of it. The subjunctive mood is almost entirely absent from most texts in science fiction and fantasy, and from anything we’d call a “plot-driven” book, however much as a writer I detest the oversimplification that label brings.

Pop into any bookstore and open most books, and you will find short sentences, many starting with ‘And’ or ‘But’. Find a Booker Prize winner or a translated work, and you will not. Even more striking to me, though, was a used bookstore I ran into: even the pulpiest science fiction novels were much more richly structured. Star Trek and Star Wars novels, the sort churned out by the dozens in the ’80s and early ’90s, and even the rather obviously early examples of cheap independently published books telling the story of an AD&D game, marketable only to a tiny readership because of Leisure-Suit-Larry-quality sex jokes, had more in-depth sentence structures, too.

I do wonder if this is the metastasized result of writing advice that advocates paring each sentence to the bone and beyond, leaving the writer, as E. B. White put it, “in the position of having shortchanged himself”. (It is not lost on me that one of the examples in the LessWrong post is Stuart Little, written by that E. B. White, the same White of Strunk and White’s Elements of Style, where the advice to shortchange oneself originated.)

Certainly academic text, known for its run-on wordiness and circumlocution in lieu of getting to the point, could use such a tightly wielded scalpel, and the entire genre of “literary fiction”, if one dares call it that, written in prose that, if not purple, yields a rather deep lavender hue, might take this advice too. But it has become pervasive, not a suggestion for improving a specific failing of a specific sort of writer, but an absolute rule from which one must not deviate. Modern writing tools such as Harper issue warnings when sentences reach forty words. While the run-on sentence is a grammatical hazard best avoided, the contrasts of complex-compound sentences and the layered texture they bring are not easily replaced by simple statements in rapid succession, as is the style of modern fiction.

Not unrelated is the tendency for modern fiction to be written in a style that emphasizes immediacy of experience rather than any sort of context: first-person narration, invisible narrators, and tight focus on a single character’s point of view are pervasive in advice for writers of all levels of experience, presented as a rule that only the most sophisticated might break with impunity. I do wonder if editors and the commercial fiction presses have set up a feedback loop that creates an ever-shrinking expectation of simplification.

That said, I do not think this is a moral quandary. Ninety percent of everything has always been crap, and there is no shortage of lyrical writers with fantastic sentences. We’d do well to encourage others to slow down, really read, take time with the written word, and enjoy the prose, and at the same time remind ourselves that it’s okay for stories to be fun. We can, time and time again, look at what has been discarded by the meat-grinder of commerce, dust it off, give it fresh paint, and include it in the next submissions. The consolidation of the publishing industry is not contingent on the length of the sentences in the books it prints. We are still capable of lyrical complexity, even if it is out of fashion. Fashion will careen back and forth as it always does.

This author isn't dead

I went on the North Shore Book Trail bookstore crawl yesterday with a dear friend and my husband. One of the shops is a delightful used book store, the kind of place I spent my childhood yearning to visit. The sort of place that still has a rack of red-covered Happy Hollisters books, Hardy Boys novels all in a row, and the Star Trek novels I binged as a tween.

Some of the best finds, though, were some of my great aunt’s books, four of her Shirley McClintock books: “Death Served up Cold”, “Death and the Delinquent”, “Dead in the Scrub”, and “Deservedly Dead”. She was a notable writer for her work in the 1980s feminist science fiction scene, with award-winning novels, but she used pen names for her work in horror and mystery, in the era when grocery store mass market paperbacks were in their prime and separate branding was critical to their commercial success.

What’s fascinated me, though, is that while we weren’t close — I visited her ranch a few times with my parents, and that’s it — so much of her style is deeply familiar. I vaguely knew she was a writer, but didn’t have any real idea what that meant at the time. I was probably eight or nine or so, and the ranch had been sold by 1992, when “Deservedly Dead” was published and the author note said she had moved to Santa Fe. The book is nakedly autobiographical in places: the main character, like my great aunt, raised Belted Galloway cattle on her ranch, mostly as a hobby rancher. The book’s plot hinges on a new rancher, a neighbor to the main character from back East, destroying the landscape and not understanding the fragile Western ecosystem. I wonder very much if this is what prompted the sale of my great aunt’s ranch in Colorado and the move to Santa Fe. The deep love of a place, and the sense of violation when it is developed or stripped bare, is a feature of psychology that I deeply share, as does most everyone in my family. It’s not uncommon in the culture where I’m from to have these sentiments: the land in the West is fragile, an arid ecosystem of careful balances, and unlike anything in Europe. There’s an inherently un-European worldview that comes from beginning to love these places, and a lot of sentiments inherited from the peoples the lands were stolen from, the Diné (Navajo), the Zuni, the Arapahoe, the Ute, and so many others, are shared more generally. This is the land of Rosemary Trommer and Terry Tempest Williams too. Generations of writers have fallen in love with these landscapes.

I can’t figure out which parts of the story are autobiographical. The main character in these stories is a familiar persona: she’s written as a somewhat conservative (in the Western US sense), cantankerous, matter-of-fact older woman, very self-possessed. So many of her views are things we’d now associate with progressive politics, but in 1992 they were not so clearly aligned. Politics in the West are not arranged the way they are in the East and on the coasts, and one day I’ll write more in depth about that. My great aunt’s politics were staunchly feminist, if rather Second Wave: I’d like to think she’d have come down on the right side of things with regard to transgender people, though I’m not positive. She was the head of Planned Parenthood in Colorado for a long time, and I suspect, in solid Second Wave feminist ways, wrote off racism in ways we’d now find a bit regressive, though also perhaps pleasantly tractable rather than the entrenched battle lines we have right now in 2025. Some of these views come out in the character of McClintock. I don’t think the choice of name is accidental either: a first name that is a close cousin to my great aunt’s, and yet another Scottish surname, like her pen names. I think the autobiographical nature is only thinly veiled.

That particular brand of ‘conservative’ is something I usually just called ‘cussedness’, a mule-like stubbornness to changing views until they actually make sense, both emotionally and factually, and there’s a sort of arid West cultural trait of being somewhat sparse with words and yet keeping one’s finger on the pulse of one’s emotions. Hotheadedness is not appreciated there, and speaking honestly is appreciated — if it’s both relevant and needs saying.

The character has a tendency to rant polemic (which sets her at odds with the more reserved characters) and to obsess over facts not fitting (which I read as being tacitly appreciated, if impolite). These are traits I share.

It’s fascinating being able to analyze an author like this, more closely than usual. I do have special knowledge of who she was; none of it is particularly intimate, but it is a bit more specific than most readers get. I fault literary criticism that doesn’t take the author’s nature into account as fatally incomplete. No work exists in a vacuum, and hastily produced commercial genre novels especially leave a lot of marks on the page that come straight from the author’s interior, not from a calculated meaning external to them. We like to think of literature as divorced from authorship, every choice made deliberately and in service of the story, but none of us is so omniscient, even of ourselves, as to be able to do that. It’s a useful lens for some analysis, and it’s good to be able to separate any particular notion from its author in order to examine it, but that’s one tool in the kit, not the entire toolbox.

Anarchism vs Infrastructuralists

I keep my feet in two worlds politically, and it is baffling how much shared value they have and how little shared context there is.

Anarchism is an outsider politics. It’s primarily people who have disavowed the system, or never learned how to enter it. I don’t mean capital-A Anarchist groups, though those are mostly unfairly branded too, and explicitly disenfranchised from the bureaucracy of the United States; I mean the little-a anarchists: anarchism includes a lot of people the system does not admit easily. It has a lot of good ideas and sometimes a lot of energy. But anarchists are also, culturally, allergic to the idea of modifying our existing systems. I don’t actually believe this is a value so much as a belief, sometimes born of experience, sometimes of not understanding, sometimes of the system rejecting them (or of expecting to be rejected).

On the other hand is a cadre of people who are, more or less, democratic socialists. Not card-carrying members of any organization, necessarily, but really working in policy, in tech, in government, in consulting, in engineering, trying to build a better world. The people I love most in this space shy away from the technocratic controls of liberalism, and understand deeply that the state, the government, the system, only ever operates in response to an imperfect model of the world. They understand the problem of legibility and illegibility, and try to moderate the systems they build to fail more gracefully—or, when possible, still succeed fully—when they meet the real world. They work to reduce harms. They’re socialists, on average, because socialism is the end of the political spectrum where people get what they need. It’s not big-S Socialism. They’re democratic, because being responsible to the people you represent or work to support means actually listening. They are process-democratic, not just vote-democratic. They build consensuses, sometimes rough, as best they can. A vote that doesn’t get everyone on board is at least in part a failure.

I wish these groups would talk to each other more, because they have a lot of good ideas, but they don’t speak the same language very often.

Anarchists have long tried to build systems of mutual aid, but so often it devolves into endless streams of GoFundMes to prop up the abject failures of our broken systems. There are better examples, too: Bikes not Bombs, Food not Bombs, all the pop-up kitchens serving food to all comers. Those are the kinds of systems—and infrastructures—that anarchists build. They have such modest resources, usually, and they stretch them far. They could do so much more with more resources. But then they would have to learn to scale. And they’d have to learn to deal with the responsibility of failure. If you’re just picking up already broken pieces, there’s only so much harm you can do.

On the other end, our global infrastructures are in many ways the largest works of mutual aid ever devised. The systems involved are massive. The best of them have enabled a standard of living undreamt of in previous centuries. The sheer love put into them is astounding, and the care when they are well built shows deeply.

But these infrastructures have, through neglect, overuse, mis-design, shortcoming, or worse, working as intended, caused environmental catastrophe that affects and will affect our world for a long long time, with the greatest burden on the people least legible to societies, least able to speak for themselves in systems of power, the least represented. The people most in need of mutual aid right now.

I wish more of the people building those policy works had lived homeless — not that I wish homelessness upon them, but I wish they understood, viscerally. I wish they understood the burdens of racism, colonialism, ableism, and capitalism, in the day to day way it plays out world over. The best of them do understand. Those are my people.

These groups mix poorly. Anarchists habitually reject anyone with access to the system as privileged, or worse, a liberal.

The people who maintain our infrastructure, social, legal, and engineering, get exhausted trying to correct takes by people with no inside understanding of the systems, people who want to throw away those systems and start over, despite the systems holding the lives of billions in the balance.

Watch out for the people who do manage to hold all of this in their heads, the dissonances and conflicts and really try to see, really try to help shape the world to be better. They have some success at it.

Reading Week Ending 2/4

Yes, reading includes watching videos sometimes.

[…]suggests something of what it has truly meant, over the centuries, for people to read. This is all about paying attention, listening to what others (and not only human others) have to tell, and being advised by it. In Old English, the word ‘read’ originally meant to observe, to take counsel, and to deliberate (Howe 1992). One who has done so is consequently ‘ready’ for the tasks ahead.

On Being Tasked with the Problem of Inhabiting the Page, via Ruth Malan, referencing Nicholas Howe, “The Cultural Construction of Reading in Anglo-Saxon England” in The Ethnography of Reading (1992).

Reading January 2024

I’m off my regular routine due to travel, so I’m sparse, but here’s some interesting bits.

Reading Week Ending 12/30

Things I’ve been reading this week:

  • “Facts, frames, and (mis)interpretations: Understanding rumors as collective sensemaking” by Kate Starbird at Center for an Informed Public @ UW. An excellent article on how we make sense of evidence and rumors and how disinformation works collectively
  • “Value Capture” by C Thi Nguyen. An excellent philosophy paper sussing out a concept I think people would do well to think about explicitly.
  • “Presentation, Diagnosis, and Management of Mast Cell Activation Syndrome” My husband has MCAS, and I suspect a really significant number of people do too. The literature paints it either as very rare … or about 1/5 of the population, depending on how you set thresholds and understand the disease. It’s easy to over-fit to, so it’s worth reading skeptically, but man do I ever know a bunch of vaguely but seriously chronically ill people who fit this mold. It seems to affect autistic people and people with ADHD more than most? It might be part of the bendy-spoony-transy syndrome.
  • “The C4 model for visualising software architecture”
  • “Untangling Threads” by Erin Kissane. Thoughtful work as always, and well worth a read in thinking about how federated social media should work when Meta the company starts getting involved.
  • “The Dark Lord’s Daughter” by Patricia C Wrede. A cute fantasy book so far. Aimed at kids, but she was a favorite author of mine as a kid and is now a favorite writer of writing advice as an adult. I like to keep tabs on what she’s up to.
  • “Hella” by David Gerrold, for a book club. Fun so far. Space colonization on a planet with dinosaurs, with an autistic protagonist.

On queerness and representation

A writer friend of mine wrote a pretty good essay during his July blog-post-a-day ambitions about queer protagonists. It’s a quick read, and he does a pretty good job of answering “why so many now?”: representation matters. After so much exclusion with men, usually white, at the helm of every industry — this includes the commercial arts — there has been a moment in the sun for queer writers, and so many of us have been honing our craft on fan fiction, much of it exceptional, and now bursting out, refulgent into an industry which while it still centers those who are white, straight, and men, has given us enough space to at least be visible and successful for the time being. The wheels of justice, righteousness, and recompense have aligned for the time being, too little too late, but still: we’re here.

They write:

Genre fiction has always been where societal boundaries are stress tested first. Genre fiction is where progressive voices get to practice. When the stories are exploring what could have been or what might be, sometimes the narrative dives straight into what should be.

Presently, there should be more queer protagonists. There should be more queer writers, writing queer protagonists, celebrated by audiences, queer or otherwise.

It’s not lost on me that we get lumped into ‘progressive voices’ — and we are — but we’ve been here for a very, very long time. We’re dissenting voices, hidden voices, erased voices, progressive voices, voices of people stuck in a conflict that has moved on without us, voices of the long-marginalized. All of these are long-standing social processes, not a new phenomenon, not a new frontier suddenly being carved out. We’ve always been here. Fan fiction itself, the refuge of writers creating what the mainstream will not give them, owes much of its current structure to the idea of ‘slash fiction’ (gay pairings) that came about specifically in Star Trek fan fiction, mostly from women and queer writers.

There should be more queer protagonists: when I was growing up, it was said that 2–4% of us are queer; I heard some people say 10%, and at the time that sounded overstated, but now I’m convinced it’s deeply underestimated. Truth be told, as I come to understand the processes of queerness, sexual attraction, identity formation, oppression, and marginalization, it’s now my habit to see the proportions not as some fixed number but as the result of processes of how we, collectively, conceive of ourselves. If more than two-thirds of us can figure this stuff out by the messy process of living it, a writer can figure it out by listening. These are dynamic systems, and with increased visibility, whole new groups of people come to understand themselves in new ways. And this is good. It’s an alternative to the ugly truths about how we have conceived of ourselves before: whiteness was created to justify slavery. Straightness was created to reinforce ideas of family that support systems like capitalism and corporate dominance. These aren’t neutral defaults, but evolved systems that benefit particular people.

But more than that: if genre fiction is the place for imagining a new future or an alternate past, queerness is itself a subject for genre fiction. It is the place where we imagine new ways of being. It does a disservice to the idea of genre fiction to rope off some pieces as a do-not-go zone. We are, in fact, the sort of people who figure this stuff out, repeatedly, for character after character. We must open our future and look at it honestly.

There are lived experiences that I cannot claim, experiences that many queer readers would expect from a story that is meant to speak to and represent them. It would be wrong of me to try and write a queer story. There are other writers that can write that, and we should make sure there is room for them to do so.

They’re not wrong about the last part — we have been denied too long, and the room to do so is much needed — but I want to challenge this: like anything else in a market, it’s often not as simple as competition for a place. Instead, good stories in conversation with each other, along with new entries and new aspects, create markets, expanding both the access and the success of everyone within them.

It’s certainly true that straight people, almost entirely white men, have dominated the industry, and stand an easier chance of being published than their peers who are not. But at the same time, it’s also a matter of lifting each other up. It’s not writers who are in the way; it’s publishers and the power structure that filters so terribly. That’s the place to fight: with success and publication, we get the opportunity to recommend and include others. We lift each other up. There’s a tendency to gate-keep, especially when we feel like we are spending our reputation to uplift others. That follows from the nature of the industry, but we can upend it. Instead of looking to the power brokers, the decision makers, for what’s good, we can listen to each other, and to the marginalized among us, for the stories that aren’t being told, aren’t being published, and we can both write them and bring the authors already doing so into the light.

I can include queer characters in my stories, though. My main character can be queer, as long as I don’t make that the focus of the story. Some folks are gay. Some folks have dark hair. Some folks have gluten allergies. These are descriptors, and not necessarily character defining traits.

It can be a little confusing when a story is appropriation, and when it is representation. When in doubt, there are readers that can provide feedback and help the writer keep from doing harm with their stories. Misrepresentation and stereotyping can be extremely painful and continue a cycle that oppresses or mischaracterizes people that are already not well represented. So, hire a sensitivity reader, and listen to them if they tell you that you’re doing harm.

It’s not wrong advice in the slightest: if you’re not of the group, you’ll rely on the relationships with people who are for sensitivity. Hire a sensitivity reader, pay them well, and listen to what they have to say. But so too, a sensitivity reader can’t represent a whole community with its diversity of opinions. We have to go deeper. We have to cultivate a plurality of relationships. Listen, but also listen to the theory behind what they’re saying.

But here’s my challenge. Write the story with the queer main character where that deeply defines their life. I don’t mean necessarily a coming-out story, or a story wallowing in the oppression, but it’s okay — and I’d argue necessary — to do the work to really understand what makes us who we are to write good genre fiction.

Some of us are gluten-sensitive, and it’s just a trait that adds a bit of complexity. For some of us, it’s a thing that cost decades of our lives to chronic illness, that defined our relationship with our families, the medical establishment, the very idea of work. So too with queerness: it’s not always flavor text, a bit thrown on top to give a bit of diversity to otherwise straight characters. In many ways, the approach of not letting queerness be a character-defining trait is itself a kind of tokenization: you can have a queer character as long as they’re not too queer. You can’t be progressive if nothing upsets the status quo.

Stories upset the status quo, out of necessity. Genre stories often upset the whole status quo, the very ideas our world is built on. That’s what makes them great.

Make no mistake here: I’m not saying that if you’re straight you shouldn’t write queer characters, or that if you’re white you shouldn’t write racialized characters. But it does mean we need to learn, to listen, to understand, and to be clever. We need both to extrapolate from the information we do have and to listen to those unlike us for the information we don’t. You don’t just have to listen to your sensitivity readers (though you’d do well to do so!); you have to listen to the world around you, for the things that challenge your very ideas of how things are. As a writer you’ll grow from this. We can grow beyond the fear of doing harm and into a well-forged alliance of authors supporting each other, uplifting the more marginalized among us, sharing and understanding their stories not just when they’re written for us but when they arrive in their full complexity in a world that may not be ready for them. We need to cite our sources for some of our ideas. Two takes on the same thing uplift each other, and if we find ours takes space from the other, we should uplift the other, not shrink to the shadows, hiding the much-needed idea from the world.

Writing about queerness feels like an expanse of shifting terms, pitfalls under mundane seeming appearances, but that too is an experience of queerness. In my own lifetime, the word you’d use to refer to someone like me has changed not once, twice, but three times. That hesitation and discomfort, that desire to get it right and play it safe is one of the forces acting on queer people too.

Write the queer main character, but be prepared for the learning that will happen, both in the criticism and, even more deeply, in the introspection. Queerness is on one hand a mere fact of life for some people, but on the other, a foundational relationship to the world — not always friendly, though sometimes it is, and in either case it can affect us to our core. Being non-white, too, is an experience of marginalization, but it is also a natural joy to exist in a skin and family and community that is very much who one is, inescapable. As white writers, we will find both the marginalization and the joy uncomfortable. As straight writers, likely the same for queerness. The discomfort will reveal stories you thought you could never tell, and if you nail a story, really seeing an aspect of the experience, that enriches us all, proving that we really can understand each other.