Building an AI harness around your app. Releases got slower. Hotfixes nearly stopped.

TL;DR: Inselnova is my first game, a browser strategy game built solo with AI, three months live with real players. The AI is fast, and it’s just as fast at shipping a mistake. So the real work wasn’t getting it to write code. It was building a harness around it: a set of skills and gates every change has to pass before it reaches live. The harness grew one incident at a time. Something would break, I’d explain it to the AI, and the harness gained a gate so that class of problem couldn’t happen again. Getting to live now takes up to ten minutes instead of seconds. I almost never hotfix anymore. Here’s what’s in it.

My first game

Inselnova is my first game. I’ve written software for 30 years, built companies, run teams, but I’d never made a game. I built this one with AI to see how far the workflow goes.

It’s a browser strategy game. You settle an island, raid your neighbours, defend, queue a build, log out, come back later. Three months on it’s live, with more than 500 signups and dozens of players logging in every day. I review every change and type almost none of the code myself.

Inselnova works on mobile.

The harness, and why it exists

The AI is fast. It compresses months of work into days. The part often skipped is that it’s just as fast at shipping mistakes. It will write a confident, well-structured change that quietly breaks something three systems away, and it will not flinch. Left unchecked, those problems compound over time.

So the interesting work wasn’t getting the AI to write code. It was building the harness around it. The harness is everything a change has to pass through before it reaches live: a set of skills, a stack of scripts, and a release pipeline with a gate at every step.

The harness grew one incident at a time. Early on, getting to live took seconds, and it showed. Something would break in production. I’d sit down, work out what happened, and explain it to the AI. Then, instead of just fixing that one bug, I’d have it add a gate so that whole class of problem could never reach live again.

One example. Early code used a database date function that worked fine in development, where the tests run on SQLite to stay fast when developing in the cloud, and silently did the wrong thing on the live Postgres database. It bit me more than once. Now a check scans every change for that exact pattern and fails the build if it finds one. The bug isn’t allowed to exist anymore.
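
A check like that can be very small. The sketch below is not the actual Inselnova check, and the post does not name the offending function, so the two SQLite-only patterns (julianday and date('now')) are stand-ins, as are the names BANNED_PATTERNS, scan_added_lines and gate_exit_code. It assumes the change arrives as unified diff text and only looks at added lines.

```python
import re

# Hypothetical deny-list of SQLite-only date calls. The real pattern is not
# named in the post, so these two are placeholders for illustration.
BANNED_PATTERNS = {
    "sqlite-julianday": re.compile(r"\bjulianday\s*\(", re.IGNORECASE),
    "sqlite-date-now": re.compile(r"\bdate\s*\(\s*'now'", re.IGNORECASE),
}


def scan_added_lines(diff_text: str) -> list[tuple[int, str, str]]:
    """Return (diff line number, rule name, text) for banned calls in added lines."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only inspect lines the change adds; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for rule, pattern in BANNED_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, rule, line[1:].strip()))
    return findings


def gate_exit_code(diff_text: str) -> int:
    """Return 0 to let the build continue, 1 to fail it."""
    findings = scan_added_lines(diff_text)
    for number, rule, text in findings:
        print(f"diff line {number}: {rule}: {text}")
    return 1 if findings else 0
```

In a pipeline, the diff text would come from something like the output of git diff against the main branch, and the return value would become the process exit code. A plain regex scan is crude and can flag comments or strings, which is usually an acceptable cost for a gate whose job is to be strict.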

Multiply that by every mistake I’ve made in three months, and you get the harness.

The skills that catch a mistake before it ships

Three of the skills exist to stop a bad idea becoming code.

grill runs before I write anything. I load a ticket, explain what I want, and trigger it. It reads the relevant code first, then asks me up to five pointed questions, only the ones where the answer would actually change the implementation. It isn’t a survey. Every question comes with its recommendation already flagged. It ends by printing a short working agreement: the goal, the approach, what’s out of scope. Then it stops. Nothing gets built until the plan is explicit. I borrowed the idea from a “grill me” prompt pattern others were using and tightened it.

impact-check runs before any change to a game rule, a formula, a cost, a balance number. The AI is happy to halve grain production because I asked. impact-check makes it stop and surface everything that change touches first: the code that reads the value, the handbook pages that promised players a different number, the design docs the balance was built on, the tests pinning the old value. Then it asks me to confirm the trade-off out loud. The change that earned the skill looked like a one-line combat tweak, letting raids damage the main building. It would have compiled fine. It also would have quietly collapsed the distinction between a quick raid and the slow, costly way you take territory. The code was happy. The game would have been worse.
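
The mechanical half of that step, finding everything a value touches, can be sketched in a few lines. This is an illustration under assumptions, not the skill itself: the directory layout (handbook/, design/, tests/), the symbol name and the functions categorize and surface_impact are all hypothetical, and the real skill also asks for a confirmation that no script can provide.

```python
def categorize(path: str) -> str:
    """Sort a file into one of four buckets by its path (assumed layout)."""
    if path.startswith("handbook/"):
        return "handbook"
    if path.startswith("design/"):
        return "design"
    if path.startswith("tests/") or ".test." in path:
        return "tests"
    return "code"


def surface_impact(symbol: str, files: dict[str, str]) -> dict[str, list[str]]:
    """Group every file that mentions `symbol` by what kind of file it is."""
    report: dict[str, list[str]] = {"code": [], "handbook": [], "design": [], "tests": []}
    for path in sorted(files):
        if symbol in files[path]:
            report[categorize(path)].append(path)
    return report
```

Called with a symbol such as a grain-rate constant and a mapping of file paths to contents, it returns one list per bucket, which is the raw material for the trade-off question. A substring match will miss values that are described in prose without the identifier, so the handbook bucket in particular is a starting point rather than a complete list.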

reconcile runs after. It audits the live game against the handbook, which is the source of truth for how Inselnova is meant to work. It catches both directions: code that drifted from what the docs promise, and docs that describe mechanics that never existed, like a flavour line mentioning cannons in a game that only has catapults. It won’t silently fix either side. It shows me the conflict and I decide which one is wrong.
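
The two-directional comparison is the core of it. A minimal sketch, assuming both the handbook and the code can be reduced to flat key-value maps (an assumption, since the post does not say how either side is read); Conflict and reconcile are illustrative names. It reports conflicts and deliberately changes nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conflict:
    key: str
    handbook: object  # None when the handbook does not mention the key
    code: object      # None when the code does not implement the key
    kind: str


def reconcile(handbook: dict[str, object], code: dict[str, object]) -> list[Conflict]:
    """List every disagreement between docs and code, in both directions."""
    conflicts = []
    for key in sorted(handbook.keys() | code.keys()):
        in_docs, in_code = key in handbook, key in code
        if in_docs and not in_code:
            kind = "documented but not implemented"
        elif in_code and not in_docs:
            kind = "implemented but not documented"
        elif handbook[key] != code[key]:
            kind = "values disagree"
        else:
            continue  # both sides agree, nothing to report
        conflicts.append(Conflict(key, handbook.get(key), code.get(key), kind))
    return conflicts
```

The cannon line from the example above would show up as "documented but not implemented". The function returns the list and stops there, which leaves the decision about which side is wrong to a person.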

The path to live

Once a change is built, it goes through a fixed path, and every step is a gate.

Nothing happens without a ticket. I load an issue from Helm, my own tracker, and the session anchors to it. Player bug reports drop straight into the same tracker. The harness refuses to commit work that isn’t tied to a ticket.

commit doesn’t just commit. It stages only the files from this session, never everything at once. It sorts the changes into changelog entries, with player-facing and technical notes kept separate. It runs the quality gate: type-checking, translation-key checks if any player-facing text moved, and a scoped set of backend tests. If anything fails, it stops. Then it writes the commit, tied to the ticket, and moves the ticket along the board.
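
The control flow of that gate is simple: no ticket, no commit, and the first failing check ends the run. A sketch under assumptions, with GateFailed and run_commit_gate as hypothetical names and each check modelled as a function that returns True or False; the real checks (type-checking, translation keys, backend tests) would be wrapped commands.

```python
from typing import Callable, Optional

# A check is a display name plus a function that returns True when it passes.
Check = tuple[str, Callable[[], bool]]


class GateFailed(Exception):
    """Raised when the commit must not proceed."""


def run_commit_gate(ticket_id: Optional[str], checks: list[Check]) -> list[str]:
    """Run checks in order and return the names that passed.

    Raises GateFailed if there is no ticket or as soon as one check fails,
    so later checks never run after a failure.
    """
    if not ticket_id:
        raise GateFailed("no ticket attached: refusing to commit")
    passed = []
    for name, check in checks:
        if not check():
            raise GateFailed(f"{name} failed (passed before it: {passed})")
        passed.append(name)
    return passed
```

Only when run_commit_gate returns would the commit be written and the ticket moved. Raising instead of returning a status makes it hard for a caller to ignore a failure by accident.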

release is the last and strictest gate. It won’t run unless a same-day review of the whole release has passed green. It validates the branch, bumps the version, promotes the changelog, and opens a pull request on GitHub. That pull request is the thing that finally triggers the full CI run.
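
The same-day rule reduces to one comparison. A minimal sketch, assuming the review result is stored as a date plus a pass flag; Review and release_allowed are hypothetical names, and how the real harness records a review is not described here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Review:
    reviewed_on: date
    passed: bool


def release_allowed(review: Optional[Review], today: date) -> bool:
    """Allow a release only with a green review dated today."""
    return review is not None and review.passed and review.reviewed_on == today
```

Passing today in as an argument, rather than reading the clock inside the function, keeps the rule easy to test. A green review from yesterday is rejected on purpose, because the release may have changed since.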

CI runs on GitHub Actions, and it’s the part that takes the time. It type-checks every part of the codebase, runs a batch of static checks, works out which parts of the game the change put at risk and runs those test suites, runs the full backend test suite split across six parallel shards, builds every workspace, and runs the browser smoke tests twice over to catch flaky ones. Only when all of that is green does the release merge. Merging runs one more smoke pass against the main branch, and then it triggers the deploy on Render.
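
Two of those steps are easy to picture in code: choosing the at-risk suites from the changed files, and splitting the full suite into six shards. The sketch below assumes a hand-maintained map from path prefixes to suite names and a simple round-robin split; the paths, suite names and the functions select_suites and shard are illustrative, and the real pipeline may choose and balance differently.

```python
def select_suites(changed_files: list[str], risk_map: dict[str, list[str]]) -> list[str]:
    """Return the suites whose path prefix matches any changed file."""
    suites = set()
    for path in changed_files:
        for prefix, names in risk_map.items():
            if path.startswith(prefix):
                suites.update(names)
    return sorted(suites)


def shard(tests: list[str], shard_count: int = 6) -> list[list[str]]:
    """Split tests into shard_count groups, deterministically and near-evenly."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    ordered = sorted(tests)  # sorting first makes every run split the same way
    return [ordered[i::shard_count] for i in range(shard_count)]
```

Each parallel CI job would then run only the group at its own shard index. Round-robin balances by count, not by runtime, so one slow test file can still make a shard lag behind the others.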

Slower to live, fewer hotfixes

Add it up and a change that used to reach live in seconds now takes up to ten minutes.

I think that’s the right trade, and it took me a while to believe it. Ten minutes feels slow when you’re used to instant. But speed-to-live was never the thing I should have been optimising. The thing that actually hurt was how often I had to drop everything and hotfix a live game with real players in it.

That number has fallen close to zero. The harness catches the mistakes now, at a gate, before any player sees them. Slower releases, far fewer fires, and that matters most late at night.

What still isn’t working

The harness is the part I’m confident about. Two things I haven’t solved.

The first is reception. Post anything AI-built and someone will call it slop. It’s a fair default: a lot of AI-built games are thin. The only answer I’ve got is the game holding up once someone actually plays it. Getting them through the door is the hard part.

The second is a design call of my own. I sped the game up, because a slow early game loses first-day players, and that part worked. But the mid-game runs on other players: alliances forming, wars starting, someone to trade with and someone to fear. That needs a crowded map. With only dozens of daily players the mid-game can feel empty. I sped up the pace before I had the population to fill it.

What the harness really is

Three months in, for a first game, it’s a real thing with real players.

The lesson I’d hand over is the boring one. The AI writing the code was never the hard part. Building the harness that stops it shipping a mistake was, and that harness is mostly a record of every bug I’ve already made, turned into a gate so I can’t make it twice.

If you’re building games with AI, I’d like to know what’s in your harness. What broke once, and what gate did you add so it couldn’t break again?

Build your own harness, and if you’re not sure how, ask your AI agent. I don’t believe you can take one off the shelf, but you can ask your agent to find an existing skill and use it as inspiration.

Inselnova on desktop and mobile: the same world, on desktop and on your phone.

Inselnova is free to play if you want to see what comes out the other side: inselnova.com.