The Setup
We sell autonomous AI agents. So we built one to ship our own engineering.
Outermind's pipeline is the same idea we sell to customers, pointed inward: an AI workforce that turns business intent into shipped product. Instead of replying to email or running operations, our internal agent does software engineering. It watches GitHub for new issues, picks them up, plans the change, writes the code, runs the tests, opens the pull request, and merges it once the reviews pass. A human sets direction and steps in for judgment calls. The agent does the rest.
This page is a tour of that system, with live numbers pulled from our own commit history.
The Numbers
In the last 30 days (window: 2026-04-09 to 2026-05-09), our autonomous build pipeline:
- Merged 200 pull requests to the integration branch.
- 53% of those PRs came from claude/issue-* branches, the agent's signature pattern for one-issue-one-branch work.
- Posted a 1% revert rate (2 reverts across 200 merges).
- Added 465,553 lines and removed 117,078 lines of code.
- Shipped a median PR size of 759 lines, small enough to review at a glance.
These numbers regenerate every time the marketing site builds. No spreadsheet, no quarterly snapshot. The narrative is allowed to age; the numbers are not.
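A minimal sketch of how figures like these could be derived from a list of merged PRs. It is an illustration, not the site's actual build step: the MergedPR record, the summarize function, and the definition of PR size as additions plus deletions are all assumptions, and the sample data is made up.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from statistics import median


@dataclass
class MergedPR:
    """Hypothetical record for one merged pull request."""
    branch: str      # head branch the PR was merged from
    additions: int   # lines added
    deletions: int   # lines removed
    reverted: bool   # True if the merge was later reverted


def summarize(prs: list[MergedPR]) -> dict:
    """Compute headline numbers for a non-empty list of merged PRs."""
    if not prs:
        raise ValueError("need at least one merged PR")
    total = len(prs)
    # Branches matching claude/issue-* are counted as agent work.
    agent = sum(fnmatch(pr.branch, "claude/issue-*") for pr in prs)
    reverts = sum(pr.reverted for pr in prs)
    return {
        "merged": total,
        "agent_share_pct": round(100 * agent / total),
        "revert_rate_pct": round(100 * reverts / total),
        "lines_added": sum(pr.additions for pr in prs),
        "lines_removed": sum(pr.deletions for pr in prs),
        # Assumption: PR size = additions + deletions.
        "median_pr_size": median(pr.additions + pr.deletions for pr in prs),
    }


# Made-up sample data, for illustration only.
sample = [
    MergedPR("claude/issue-101", 400, 100, False),
    MergedPR("claude/issue-102", 50, 10, False),
    MergedPR("feature/manual", 900, 300, True),
    MergedPR("claude/issue-103", 700, 59, False),
]
stats = summarize(sample)
print(stats)
```

The same arithmetic gives the revert rate quoted above: 2 reverts across 200 merges is 100 * 2 / 200 = 1%.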
The Architecture
The pipeline is a state machine that walks every issue from "filed" to "merged":
- Triage. A new GitHub issue is fetched, classified, and routed. The router decides which downstream skills should run and which can be skipped.
- Plan. An agent reads the issue, scans the relevant code, and writes an implementation plan against the live spec tree.
- Implement. Another agent does the work in an isolated git worktree: edits, builds, tests, and commits. It cannot touch any branch outside its assigned worktree.
- Verify. The CI runner replays the same build, lint, type-check, and test commands the agent ran locally. Failures bounce back into a fix stage.
- Review. Code review, security review, spec drift, and docs drift are each their own stage with skip and full-pass variants. Trivial diffs skip; risky diffs get the full team.
- Merge. The agent opens the PR, waits for green, and admin-merges into the daemon integration branch. The daemon branch promotes to qa on a schedule, and qa promotes to main behind a human gate.
Every stage writes a structured summary to .tmp/. The next stage reads it. Nothing is implicit.
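The stage sequence and the .tmp/ handoff can be sketched roughly as follows. This is a simplified model under stated assumptions, not the real daemon: the Stage enum, the next_stage transitions (including re-verifying after a fix), and the issue-N-stage.json file naming are hypothetical.

```python
import json
from enum import Enum
from pathlib import Path
from typing import Optional


class Stage(Enum):
    TRIAGE = "triage"
    PLAN = "plan"
    IMPLEMENT = "implement"
    VERIFY = "verify"
    FIX = "fix"
    REVIEW = "review"
    MERGE = "merge"


# The path an issue takes when nothing fails.
HAPPY_PATH = [Stage.TRIAGE, Stage.PLAN, Stage.IMPLEMENT,
              Stage.VERIFY, Stage.REVIEW, Stage.MERGE]


def next_stage(stage: Stage, passed: bool = True) -> Optional[Stage]:
    """Return the following stage, or None once MERGE is done."""
    if stage is Stage.VERIFY and not passed:
        return Stage.FIX      # failures bounce into a fix stage
    if stage is Stage.FIX:
        return Stage.VERIFY   # assumption: a fix is always re-verified
    idx = HAPPY_PATH.index(stage)
    return HAPPY_PATH[idx + 1] if idx + 1 < len(HAPPY_PATH) else None


def summary_path(tmp_dir: Path, issue: int, stage: Stage) -> Path:
    # Hypothetical naming scheme for per-stage summaries.
    return tmp_dir / f"issue-{issue}-{stage.value}.json"


def write_summary(tmp_dir: Path, issue: int, stage: Stage, summary: dict) -> Path:
    """Persist a stage's structured summary so the next stage can read it."""
    tmp_dir.mkdir(parents=True, exist_ok=True)
    path = summary_path(tmp_dir, issue, stage)
    path.write_text(json.dumps(summary, indent=2))
    return path


def read_summary(tmp_dir: Path, issue: int, stage: Stage) -> dict:
    """Load the summary a previous stage wrote."""
    return json.loads(summary_path(tmp_dir, issue, stage).read_text())
```

Writing each handoff to disk keeps the contract between stages explicit: a stage can be retried or inspected using only the files its predecessor left behind.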
The Skill Catalog
The agent's behavior is not a single mega-prompt. It is a catalog of small, named skills that load on demand based on the work in front of it.
- Stage skills cover plan, implement, code review, security review, spec update, docs update, public-site update, e2e testing, migration check, and fix. Each stage has its own subset of variants: skip, minimal, standard, agent-team, single-agent, comment-doc-only, inline-review, pipeline-code. The variants are how we trade speed for thoroughness without rewriting the whole pipeline. Today the catalog runs 21 stage-skill variants across 10 pipeline stages.
- Orchestration commands sit above the stages and run the daemon itself: launch, start, dashboard, retry, stop. Five commands keep the whole system running.
- Domain skills layer on top: tenant security audits, database migration conventions, navigation conventions, regional deployment, and roughly seventy more, each tuned to a specific corner of the codebase.
A skill is a markdown file with rules and a description. The agent picks the right one based on what it sees in the diff. New skills land like normal code, with PRs and reviews. The catalog grows as the codebase does.
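To make the idea of picking skills from the diff concrete, here is a deliberately simplified sketch. It swaps in plain path-glob matching for whatever the agent actually does when it reads skill descriptions; the Skill record, the path_globs field, the skill names, and the file paths are all hypothetical.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class Skill:
    """Hypothetical in-memory view of one skill file."""
    name: str
    description: str
    # Assumption: a skill declares which paths it cares about.
    path_globs: list[str] = field(default_factory=list)


def select_skills(catalog: list[Skill], changed_paths: list[str]) -> list[str]:
    """Return the names of skills whose globs match any changed path."""
    return [
        skill.name
        for skill in catalog
        if any(fnmatch(path, glob)
               for path in changed_paths
               for glob in skill.path_globs)
    ]


# Illustrative catalog; names and globs are invented for this example.
catalog = [
    Skill("migration-conventions", "Rules for database migrations",
          ["db/migrations/*"]),
    Skill("navigation-conventions", "Rules for navigation changes",
          ["web/nav/*"]),
    Skill("tenant-security-audit", "Checks for tenant isolation",
          ["api/tenants/*", "db/migrations/*"]),
]

print(select_skills(catalog, ["db/migrations/0007_add_index.sql"]))
```

Loading only the matching skills keeps each run's instructions short and specific to the change in front of it.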
What This Unlocks
The pipeline is the reason a small team ships like a much larger one. Every part of it compounds.
- The skill catalog is opinionated and codebase-specific. It encodes mistakes we already made and feedback we already absorbed, so the agent gets sharper every week without anyone retraining a model.
- The state machine is reified in code: queue, poller, daemon, dashboard, ci-runner, github-client. Issues flow through it on their own, around the clock, while we sleep.
- The verification harness is the same code path that runs in CI. There is no drift between "the agent thinks it passed" and "CI thinks it passed", which is why we trust the pipeline to merge its own work.
- The git history is a public artifact of the system working. The numbers above are not a slide deck; they are the repository talking.
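The "same code path" point about verification can be illustrated with a small sketch: one function and one command table that both the agent and CI would call. The run_checks function and the CHECKS table are assumptions, and the commands are placeholders that only print a message; a real table would name the project's actual build, lint, type-check, and test commands.

```python
import subprocess
import sys


def run_checks(checks: dict[str, list[str]], cwd: str = ".") -> dict[str, bool]:
    """Run each named check and record whether it exited with status 0."""
    results = {}
    for name, argv in checks.items():
        proc = subprocess.run(argv, cwd=cwd, capture_output=True, text=True)
        results[name] = proc.returncode == 0
    return results


# Placeholder commands standing in for the real toolchain.
CHECKS = {
    "build": [sys.executable, "-c", "print('build ok')"],
    "lint": [sys.executable, "-c", "print('lint ok')"],
    "type-check": [sys.executable, "-c", "print('types ok')"],
    "test": [sys.executable, "-c", "print('tests ok')"],
}

# The agent (locally) and the CI runner would both make this same call.
results = run_checks(CHECKS)
all_green = all(results.values())
print(results, all_green)
```

Because both callers share one table and one function, a check cannot pass for the agent and fail in CI simply because the two ran different commands.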
The result is leverage. A handful of people set direction, review the interesting calls, and let the pipeline carry the rest. Features that would normally need a roadmap quarter land in days. We get to spend our attention on the parts of the product only humans can decide.
Page generated 2026-05-09 from the live commit history of origin/daemon.