Running AI coding agents one at a time does not scale. Running them in parallel without controls produces output nobody can trust.
Abracapocus is built to solve both problems at once: concurrent execution under enforced safety constraints, with verifiable evidence attached to every task.
This is how it works — and why each part matters.
Asynchronous Execution
- Parallel execution inside a phase. Independent tasks run concurrently, so elapsed time tracks the longest chain of dependent tasks rather than the total number of model calls.
- Dependency-based ordering. Tasks wait for verified outputs from prerequisite tasks before they begin. Operators declare the dependencies; the runtime enforces them.
- Automatic serialization for shared resources. Tasks that would collide on the same resource execute in sequence. Concurrency limits are configurable by resource type.
- Start-and-poll execution over MCP. Long-running tasks return immediately with an identifier. Operators, human or agent, poll durable state instead of holding open connections.
- Classified timeouts. A timeout with no changes counts as no progress and does not consume retry budget. A timeout with partial changes preserves those changes for verification.
- Clean cancellation. Terminating a run kills the full process tree, preserves partial evidence, and records the run as interrupted rather than failed.
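The ordering and serialization rules above can be sketched with Python's asyncio. This is only an illustration under stated assumptions, not Abracapocus's actual implementation: the `Task` fields, the `run_phase` function, and the `limits` mapping are hypothetical names, and `asyncio.sleep` stands in for real agent work.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    """Hypothetical task contract: a name, prerequisites, and an optional shared resource."""
    name: str
    depends_on: list = field(default_factory=list)
    resource: Optional[str] = None  # tasks naming the same resource contend for its slots
    duration: float = 0.01          # stand-in for real agent work


async def run_phase(tasks, limits=None):
    """Run one phase concurrently and return task names in completion order.

    Assumes every name in depends_on refers to a task in this phase and that
    the dependency graph has no cycles (a cycle would wait forever).
    """
    limits = limits or {}
    done = {t.name: asyncio.Event() for t in tasks}
    # One semaphore per resource; the assumed default limit of 1 means strict serialization.
    slots = {
        t.resource: asyncio.Semaphore(limits.get(t.resource, 1))
        for t in tasks
        if t.resource is not None
    }
    finished = []

    async def run(task):
        # Block until every prerequisite has signalled completion.
        for dep in task.depends_on:
            await done[dep].wait()
        if task.resource is None:
            await asyncio.sleep(task.duration)
        else:
            # Tasks sharing a resource take turns holding its slot.
            async with slots[task.resource]:
                await asyncio.sleep(task.duration)
        finished.append(task.name)
        done[task.name].set()

    await asyncio.gather(*(run(t) for t in tasks))
    return finished
```

For example, `asyncio.run(run_phase([Task("schema", resource="db"), Task("migrate", depends_on=["schema"], resource="db"), Task("docs")]))` always completes `schema` before `migrate`, while `docs` runs alongside them. Passing `limits={"db": 2}` would let two tasks hold the hypothetical `db` resource at once.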
Architectural Consequences
- Scheduling stays declarative. Concurrency rules live inside task contracts instead of a separate orchestration layer. There is no external DAG engine to maintain.
- Evidence is part of execution. Every task produces a durable record: what it attempted, what changed, what verification found, and why the task passed or failed.
- Phase advancement depends on evidence. A phase advances only when its evidence satisfies a configurable acceptance rule. The default is simple: every task accepted, none blocked.
- Contracts fail early. Task contracts are statically validated before any backend executes. Invalid contracts fail at the cheapest possible point.
- Model tier follows task size. Small, bounded tasks run on cheaper models. Expensive models are reserved for work that genuinely requires them.
- Retries stay narrow. A blocked task isolates failure to its own execution path instead of collapsing the entire phase. Retries reuse the original evidence as context.
- Evidence survives interruption. Timed-out and cancelled executions still leave behind structured records instead of disappearing into terminal logs.
- Audit export is unified. Run reports, task evidence, and event streams export together as a single artifact rather than being reconstructed from scattered logs.
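Two of the points above, static contract validation and the default acceptance rule, are mechanical enough to sketch in a few lines. The following is a minimal illustration under assumptions: the `TaskEvidence` record, the `Status` values, and both function names are hypothetical, and contracts are reduced to a mapping from task name to prerequisite names.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    ACCEPTED = "accepted"
    BLOCKED = "blocked"
    INTERRUPTED = "interrupted"


@dataclass(frozen=True)
class TaskEvidence:
    """Hypothetical evidence record: what ran, what changed, and the verdict."""
    task: str
    status: Status
    changed_files: tuple = ()
    reason: str = ""


def validate_contracts(contracts):
    """Check contracts before anything executes; return a list of problems.

    `contracts` maps each task name to the names of its prerequisites.
    """
    errors = []
    for name, deps in contracts.items():
        for dep in deps:
            if dep not in contracts:
                errors.append(f"{name}: unknown dependency {dep!r}")

    state = {}  # task name -> "visiting" or "done"

    def reaches_cycle(name):
        # Depth-first search; meeting a task that is still "visiting" means a cycle.
        if state.get(name) == "done":
            return False
        if state.get(name) == "visiting":
            return True
        state[name] = "visiting"
        found = any(reaches_cycle(d) for d in contracts[name] if d in contracts)
        state[name] = "done"
        return found

    for name in contracts:
        if reaches_cycle(name):
            errors.append(f"{name}: dependency cycle reachable from this task")
    return errors


def phase_may_advance(expected, evidence):
    """Default rule from the text: every expected task accepted, none blocked."""
    by_task = {e.task: e for e in evidence}
    return all(
        name in by_task and by_task[name].status is Status.ACCEPTED
        for name in expected
    )
```

Here a phase with a missing, blocked, or interrupted task does not advance, and a contract set that names an unknown prerequisite is rejected before any backend is invoked. A configurable acceptance rule could replace `phase_may_advance` with any other predicate over the same evidence.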
What This Makes Possible
- Parallel AI coding work that can run unattended within declared safety boundaries.
- Lower wall-clock time without lower accountability.
- Greater use of cheaper models because the work is shaped to fit them.
- Audit trails generated automatically as a side effect of execution.
- The same operational surface for a human operator or an AI agent acting on the operator’s behalf.
The Underlying Principle
Most systems make AI coding more reliable by paying for reliability directly: containers for every task, premium models for every call, custom retry logic for every failure mode.
Abracapocus pushes that effort into the structure of the work instead.
Well-shaped task contracts let smaller models succeed. Non-overlapping outputs remove much of the need for container isolation. Structured evidence turns retries from judgment calls into mechanical decisions.
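To make "mechanical decisions" concrete, here is a small sketch of a retry rule driven only by recorded evidence, following the timeout classification described earlier. All names (`AttemptRecord`, `Outcome`, `Action`, `decide`) are hypothetical, and the exact policy is an assumption rather than the system's documented behavior.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"
    FAILED = "failed"
    TIMED_OUT = "timed_out"


class Action(Enum):
    ACCEPT = "accept"
    VERIFY_PARTIAL = "verify_partial"
    RETRY = "retry"
    BLOCK = "block"


@dataclass(frozen=True)
class AttemptRecord:
    """Hypothetical evidence for one attempt: its outcome and the files it changed."""
    outcome: Outcome
    changed_files: tuple = ()


def decide(record, retries_used, retry_budget):
    """Return (next action, updated retry count) from evidence alone."""
    if record.outcome is Outcome.PASSED:
        return Action.ACCEPT, retries_used
    if record.outcome is Outcome.TIMED_OUT:
        if record.changed_files:
            # Partial changes are kept and sent to verification.
            return Action.VERIFY_PARTIAL, retries_used
        # No changes means no progress: retry without consuming budget.
        # (A real system would presumably cap these separately; that cap is omitted here.)
        return Action.RETRY, retries_used
    # Verified failure: a retry consumes budget until it runs out.
    if retries_used < retry_budget:
        return Action.RETRY, retries_used + 1
    return Action.BLOCK, retries_used
```

Because the function reads only the record and two counters, the same inputs always yield the same decision, whether a human or an agent is operating the run.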
The system scales on the design of the work — which is the dimension that actually scales.