LEDGER-013 · Declassified · Claude / Cowork · Experiment · an R&D experiment

A team of AI agents that runs a process on its own

An autonomous company, orchestrated by agents

Set a goal, and a 'company' of AI agents — each with a job — divides the work, hands it off between them, and produces the result with little steering from a person.

The problem

One AI agent can do a task — but getting a team of agents to run a whole process together tends to fall apart in practice.

The solution

A system that runs a 'company' of role-specific agents toward a goal, with the coordination logic that keeps them working together instead of drifting apart.

Claude · Multi-agent orchestration · Python
Before and after

  • Coordinating many agents. Before: a research topic, impressive in demos. After: something you can actually run end to end.
  • Keeping agents on task. Before: they drift, loop, or lose the goal. After: coordination logic holds the group to one objective.
  • Human involvement. Before: constant supervision, step by step. After: set the goal, let it run, check in.
  • What it tells you. Before: a paper to read about what might be possible. After: a live measure of what agents can do right now.

The delta

This is the experiment crossing from a demo you watch to a system you can run. The real question it probes: how far can a team of agents coordinate before a person has to step in? That line keeps moving as the underlying AI gets better — so the project doubles as a hands-on gauge of where this technology is actually heading. It is a working probe, not a finished product.

What I built

A system that runs an autonomous "company" of AI agents toward a single goal. Instead of one all-purpose assistant, it sets up a group of specialized agents — each handling one kind of work — and coordinates them like an org chart pointed at an objective.

  • Roles. Each agent has a defined job rather than trying to do everything. Splitting the work this way keeps any one agent from getting overloaded or confused.
  • Hand-offs. Work moves from agent to agent — the output of one becomes the input of the next — so a process can actually flow through the group.
  • Coordination logic. This is the part that keeps the group from falling apart: the rules that hold every agent to the same goal so the team doesn't go in circles or lose the thread. ("Multi-agent orchestration" is just the technical name for getting several agents to work together toward one outcome.)
  • Minimal steering. You set the objective and let it run, stepping in only where a person is genuinely needed rather than supervising every move.
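
The pieces above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the `Agent` class, the `run_company` function, and the three stub roles are hypothetical names, and each `work` function stands in for what would be a model call in a real system.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Agent:
    """A role-specific agent. `work` is a stand-in for a model call."""
    role: str
    work: Callable[[str, str], str]  # (goal, input) -> output


def run_company(goal: str, agents: List[Agent], brief: str) -> List[str]:
    """Pass work down the line: each agent's output is the next agent's input."""
    trail = []
    current = brief
    for agent in agents:
        # Every agent receives the same goal alongside the hand-off.
        current = agent.work(goal, current)
        trail.append(f"{agent.role}: {current}")
    return trail


# Hypothetical stub roles; real agents would call a model here.
researcher = Agent("researcher", lambda goal, text: f"notes on [{text}]")
writer = Agent("writer", lambda goal, text: f"draft from [{text}]")
reviewer = Agent("reviewer", lambda goal, text: f"approved [{text}]")

trail = run_company("publish a summary", [researcher, writer, reviewer], "topic X")
for line in trail:
    print(line)
```

Passing the same `goal` to every agent is one simple way to keep the objective in view at each hand-off. A real coordination layer would need more than this, but the shape is the same: defined roles, a chain of hand-offs, and one objective set at the start.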

It is a fork-and-extend experiment — I took an existing open project and pushed it toward the edge of what autonomous coordination can currently do. It is honest R&D, not a packaged product.

Why it matters

If a team of agents can reliably run a real process on its own, the leverage is enormous — you describe an outcome and a coordinated group of specialists carries it out. That is the promise this experiment is built to test, hands-on rather than in theory.

It is also a measuring stick. The amount of human steering this needs keeps dropping as the underlying models get more capable, so running it shows where the frontier of agent coordination sits right now and where it is heading next. Knowing that firsthand, instead of guessing from a demo reel, is the whole point.

The hard part

The hard part of multi-agent work isn't getting one agent to do a task; it's getting a group of them to stay coordinated without unraveling. Left alone, agent teams tend to go in circles, second-guess each other, or quietly lose the goal a few hand-offs in. So the focus here is the coordination layer: the rules for who does what, how work passes between agents, and how the group stays pointed at one objective. How much human steering that layer still needs is the open variable: it drops as the models get better, so the experiment is also a way to keep measuring that frontier firsthand.
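
To make those failure modes concrete, here is a hedged sketch of one possible guard, assuming agents are plain functions and that an output starting with "DONE" signals completion. The `coordinate` function, the hand-off budget, and the exact-repeat loop check are illustrative assumptions, not the rules this project actually uses.

```python
from typing import Callable, List, Tuple

Step = Callable[[str, str], str]  # (goal, input) -> output


def coordinate(goal: str, steps: List[Tuple[str, Step]], brief: str,
               max_handoffs: int = 10) -> Tuple[str, str]:
    """Cycle through the roles until one reports DONE, the team repeats
    itself, or the hand-off budget runs out. Returns (status, last_output)."""
    seen = set()
    current = brief
    for i in range(max_handoffs):
        _role, step = steps[i % len(steps)]
        current = step(goal, current)
        if current.startswith("DONE"):
            return "finished", current
        if current in seen:
            # An exact repeat means the team is going in circles.
            return "escalate: loop detected", current
        seen.add(current)
    return "escalate: budget exhausted", current


# Hypothetical stubs: a healthy pair, and a pair that bounces back and forth.
healthy = [("drafter", lambda goal, text: f"draft of {goal}"),
           ("finisher", lambda goal, text: f"DONE: {text}")]
stuck = [("planner", lambda goal, text: "plan A"),
         ("critic", lambda goal, text: "revise plan")]

ok = coordinate("ship report", healthy, "start")
looped = coordinate("ship report", stuck, "start")
print(ok)
print(looped)
```

Both escalation statuses are the points where a person would be asked to step in. Exact-match loop detection is deliberately crude: a real system would probably need a fuzzier notion of "the same output".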

The bottom line

This is research you can run rather than just read — a working system that shows how far a team of agents can coordinate on its own right now. As the models improve, the amount of human steering it needs keeps falling, which makes it a useful early read on where agentic systems are heading.

Where the edges are

Still being worked out: how much of a real process a team of agents can carry start to finish before a person needs to take the wheel — the answer shifts every time the models improve.
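
One way to put a number on that edge, assuming each run is logged as a sequence of steps marked either "agent" or "human", is to count how far the team got before the first intervention and what share of steps ran unaided. The log format and the `steering_report` function are hypothetical; this is a sketch of the idea, not the project's measurement method.

```python
from typing import Dict, List, Union


def steering_report(log: List[str]) -> Dict[str, Union[int, float]]:
    """Summarize a run log whose entries are "agent" or "human"."""
    total = len(log)
    human = log.count("human")
    # Index of the first human step, or the full length if nobody stepped in.
    first = log.index("human") if "human" in log else total
    return {
        "steps": total,
        "human_steps": human,
        "autonomy_ratio": (total - human) / total if total else 0.0,
        "steps_before_first_intervention": first,
    }


# Made-up example log: three agent steps, one human correction, one more agent step.
report = steering_report(["agent", "agent", "agent", "human", "agent"])
print(report)
```

Tracking these two figures across model versions would show whether the line is moving, which is the comparison the experiment is built around.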
