The Agentic Operating System — workshop build
A live multi-agent ops shell, designed and built with 40 engineers in one room.
AIMED · public workshop · ~40 engineers
What the team was actually solving
Most teams meeting agentic AI for the first time get stuck on one of three blockers: tool design, orchestration choice, and the gap between a working demo and a system that survives Monday morning. The AIMED workshop format compresses the answers into one day of hands-on building.
Where the old process broke
1. Engineers had used Claude and Copilot but had never built an agent loop end-to-end
2. Multi-agent orchestration is widely discussed but rarely walked through in code
3. Workshops that just demo finished agents leave attendees without the muscle memory to ship their own
4. A real artefact that survives the day is the difference between a workshop and a memorable one
The AI / technical solution we shipped
A day-long live build of "the Agentic Operating System" — a multi-agent shell with a supervisor (planning, decomposition), handoff agents (parallel reads, sequenced writes), shared tool registry via MCP, and observability wired in from line one. Every attendee leaves with a running shell on their own laptop, the source, and the patterns to extend it.
How the system is wired
Diagram: the specialist roles
Diagram: the workshop stack, the same primitives every attendee built on
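In code, that wiring has a small shape. Below is a compressed, illustrative sketch (every name is a placeholder, not the workshop source): the supervisor plans and decomposes, reads are handed off in parallel because they are side-effect free, and writes run strictly in sequence.

```python
# Illustrative sketch only: the shape of supervisor + handoff orchestration,
# not the workshop repo. run_agent stands in for a full model + tool loop.
import asyncio

async def run_agent(role: str, task: str) -> str:
    """Placeholder for a full agent loop (model calls plus tool calls)."""
    await asyncio.sleep(0)
    return f"[{role}] {task}: done"

async def supervisor(goal: str) -> list[str]:
    # Plan: decompose the goal into side-effect-free reads and stateful writes.
    reads = [f"gather context on {goal}", f"search the codebase for {goal}"]
    writes = [f"draft changes for {goal}", f"apply changes for {goal}"]

    # Handoff, parallel reads: safe to fan out, no shared state is mutated.
    read_results = await asyncio.gather(*(run_agent("reader", t) for t in reads))

    # Handoff, sequenced writes: each write sees the effects of the last.
    write_results = []
    for task in writes:
        write_results.append(await run_agent("writer", task))

    return [*read_results, *write_results]

if __name__ == "__main__":
    for line in asyncio.run(supervisor("the rename refactor")):
        print(line)
```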
How attendees integrated the shell into their own work
The shell is generic — file reads, shell commands, web search. By the afternoon, attendees were forking it to wire in their own tools: GitHub API tools, Jira tools, internal codebase searchers, and so on.
- Files MCP: read/write scoped to a workshop sandbox directory (a minimal version is sketched after this list)
- Shell MCP: scoped to a small allowlist of commands
- Search MCP: web + GitHub code search wrapped behind one tool
- Each attendee added one tool of their own by the end of the day
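For a sense of scale, a Files-style MCP server is small. Here is a minimal sketch using the Python MCP SDK's FastMCP helper; the sandbox path and tool names are illustrative, not the workshop repo's.

```python
# Minimal Files MCP server sketch; sandbox path and tool names are assumptions.
from pathlib import Path

from mcp.server.fastmcp import FastMCP

SANDBOX = Path("./workshop-sandbox").resolve()  # assumed sandbox location
mcp = FastMCP("files")

def _scoped(rel_path: str) -> Path:
    """Resolve a path and refuse anything that escapes the sandbox."""
    p = (SANDBOX / rel_path).resolve()
    if not p.is_relative_to(SANDBOX):
        raise ValueError(f"path escapes sandbox: {rel_path}")
    return p

@mcp.tool()
def read_file(path: str) -> str:
    """Read a UTF-8 text file inside the workshop sandbox."""
    return _scoped(path).read_text(encoding="utf-8")

@mcp.tool()
def write_file(path: str, content: str) -> str:
    """Write a UTF-8 text file inside the workshop sandbox."""
    target = _scoped(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return f"wrote {len(content)} bytes to {path}"

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```

The sandbox check is the point: the server, not the prompt, is what keeps the agent inside the workshop directory.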
Patterns we drilled (production-grade, not workshop-grade)
Scope tools, not models
The shell's tool registry is what we tightened, not the model prompts. Lesson: the model is the cheap part to swap; the tools are the contract.
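As a hedged sketch of what that looks like (all names are assumptions, not the shell's API): one shared registry, and each specialist sees only its allowlisted slice.

```python
# Illustrative: scope the tool registry per agent, not the model. The model
# can be swapped freely; the scoped registry is the contract.
from typing import Callable

def read_file(path: str) -> str: return f"<contents of {path}>"
def write_file(path: str, content: str) -> str: return f"wrote {path}"
def run_command(cmd: str) -> str: return f"ran {cmd}"
def web_search(query: str) -> str: return f"results for {query}"

REGISTRY: dict[str, Callable] = {
    "read_file": read_file,
    "write_file": write_file,
    "run_command": run_command,
    "web_search": web_search,
}

# Narrow scopes per specialist; tightening happens here, not in the prompts.
AGENT_SCOPES: dict[str, set[str]] = {
    "researcher": {"read_file", "web_search"},
    "writer": {"read_file", "write_file"},
    "operator": {"run_command"},
}

def tools_for(agent: str) -> dict[str, Callable]:
    allowed = AGENT_SCOPES[agent]
    return {name: fn for name, fn in REGISTRY.items() if name in allowed}

print(sorted(tools_for("researcher")))  # ['read_file', 'web_search']
```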
Audit-first design
Every tool call is traced. Every cost is attributed. Workshop or not, this is the only way to debug a multi-agent system.
Cache the prefix
Keep the system prompt and tool registry byte-stable and the prefix caches across calls. Attendees watched the cache-hit rate climb live as we built.
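A minimal sketch with the Anthropic Python SDK, assuming a byte-stable tool list and system block; the prompt text, tool, and model alias are illustrative. Marking the system block with cache_control caches the whole prefix up to that point.

```python
# Sketch, not workshop source: keep tools + system prompt byte-identical across
# calls and mark the prefix cacheable.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

TOOLS = [{
    "name": "read_file",  # illustrative tool definition
    "description": "Read a file inside the workshop sandbox.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}]

SYSTEM = [{
    "type": "text",
    "text": "You are the supervisor agent of a multi-agent ops shell.",
    "cache_control": {"type": "ephemeral"},  # caches tools + system as one prefix
}]

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # any prompt-caching-capable model
    max_tokens=512,
    tools=TOOLS,
    system=SYSTEM,
    messages=[{"role": "user", "content": "What files are in the sandbox?"}],
)
print(response.usage)  # cache_read_input_tokens rises once the prefix is stable
```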
Fail loudly
Validation failures escalate to the supervisor. Silent fallback is what makes agents flaky.
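Sketched with assumed names (not the shell's API): wrap every tool call so a validation failure raises a structured escalation to the supervisor instead of falling back silently.

```python
# Illustrative sketch of the fail-loudly rule; all names are assumptions.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Escalation(Exception):
    """Raised up to the supervisor; never swallowed into a silent retry."""
    agent: str
    tool: str
    reason: str

def validated_call(agent: str, tool: str, fn: Callable[..., Any], **kwargs) -> Any:
    try:
        result = fn(**kwargs)
    except Exception as exc:
        raise Escalation(agent, tool, f"tool raised: {exc}") from exc
    if result in (None, ""):
        # An empty result is a failure too; escalate rather than guess.
        raise Escalation(agent, tool, "empty result")
    return result
```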
Workshop format
Hour 1 — Setup
uv venv, Anthropic SDK, repo cloned. Every laptop calling Claude before coffee.
Hour 2–3 — First agent
Single-agent loop with a single tool. Hello-world for tool calling (sketched after this schedule).
Hour 4 — MCP
Wrap the file + shell tools as MCP servers. First reusable abstraction.
Hour 5–6 — Supervisor + handoff
The shell takes shape. Multi-agent orchestration in code attendees can read.
Hour 7 — Observability
Wire Langfuse. Per-step trace becomes the debugger of choice for the rest of the day.
Hour 8 — Add-your-own-tool
Attendees fork. Forty different production-shaped extensions in one room.
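The hours 2–3 loop, sketched with the Anthropic Python SDK; the get_time tool and the model alias are illustrative. Call the model, execute any tool_use blocks, feed back tool_result blocks, and repeat until the model stops asking for tools.

```python
# Sketch of the hour 2-3 single-agent loop; tool and model alias are assumptions.
from datetime import datetime, timezone

import anthropic

client = anthropic.Anthropic()

TOOLS = [{
    "name": "get_time",
    "description": "Return the current UTC time as an ISO-8601 string.",
    "input_schema": {"type": "object", "properties": {}},
}]

def get_time() -> str:
    return datetime.now(timezone.utc).isoformat()

messages = [{"role": "user", "content": "What time is it in UTC?"}]
while True:
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=512,
        tools=TOOLS,
        messages=messages,
    )
    if response.stop_reason != "tool_use":
        break  # final answer, no more tool calls requested
    # Echo the assistant turn back, then answer each tool_use with a tool_result.
    messages.append({"role": "assistant", "content": response.content})
    messages.append({
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": block.id, "content": get_time()}
            for block in response.content
            if block.type == "tool_use"
        ],
    })

print(next(b.text for b in response.content if b.type == "text"))
```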
Observability from line one
The most consistent piece of feedback from the day: wiring Langfuse before the first multi-agent call (not after) changed how attendees debugged. Once the per-step trace was visible, problems that had taken minutes to find took seconds (a minimal wiring sketch follows the list below).
- Per-step trace with the workshop attendee's name as session prefix
- Cost-per-attendee surface — running cost visible all day
- Cache-hit rate climbed from 0% at hour 1 to ~72% by hour 6 as the system prompt stabilised
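A minimal wiring sketch, assuming the Langfuse Python SDK's @observe decorator (v2-style import); only the session-naming convention comes from the list above, the rest is illustrative.

```python
# Hedged sketch of the hour-7 wiring, not the workshop source.
from langfuse.decorators import langfuse_context, observe

@observe(name="agent-step")
def run_agent_step(attendee: str, step: str) -> str:
    # Attendee name as the session prefix, so every trace is findable per laptop.
    langfuse_context.update_current_trace(session_id=f"{attendee}/agentic-os")
    result = f"ran {step}"  # placeholder for the model + tool calls
    return result

run_agent_step("ada", "plan the refactor")
```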
What the attendees walked away with
A running shell on their own laptop. The source on GitHub. Three working MCP servers. A supervisor + handoff pattern they can extend. And — most valuable — the muscle memory of having built this end-to-end.
What we'd tell another team building this
1. Workshop attendees retain ten times more from one full build than from ten finished demos. Build it together; do not show it pre-built.
2. Langfuse-from-line-one was the highest-leverage decision in the curriculum. Without a per-step trace, hour 5 onwards becomes guesswork; with it, debugging is direct.
3. Three specialists with narrow scopes beat one big agent loaded with everything — confirmed live by an A/B at hour 6, with the audience watching the eval-score deltas.
4. The "add-your-own-tool" hour is what makes the workshop generative. The variety of extensions in one room is what attendees remember three months later.
How the workshop pattern extends
The Agentic OS shell is now the curriculum spine of the corporate training programs we deliver. Customers come in with their own integrations to bolt on, and the same 8-hour structure slots neatly into one full day of an 8-day intensive.
- Same pattern delivered as Day 5 of the corporate 8-day intensive (IT track)
- The Agentic OS GitHub repo is the public reference implementation
- Workshop graduates often return with their own MCP servers and extensions to share
Want this workshop for your team?
The same format runs on-site for engineering teams of 8–14. We customise the integrations to your stack — by the end of the day your team has shipped a working agentic shell in your codebase.
Read what we publish on this
Why I am replacing supervisor patterns with handoffs
Supervisors looked clean on paper and shipped slow in production. Handoffs read messier in the code but recover better when an agent loses the plot. Two real systems and where supervisors still earn their keep.
MCP: Why every team's first MCP server should be "list-files"
Smallest useful server. Hardest one to mess up. Teaches the protocol without distracting domain logic. The 60-line server we hand to teams during training.
Production: The agent observability stack we ship to every client
Traces, spans, evals, cost-per-completed-task, and the one dashboard panel that catches 80% of regressions. Vendor-agnostic — covers Langfuse, Honeycomb, and rolling your own.
Solutions & topics worth reading next
Agentic AI Consulting
Designed, built, and handed off — production agentic systems for enterprise teams.
MCP Integration
Custom Model Context Protocol servers that turn your systems into agent tools.
AI Systems Engineering Training
Eight-day corporate training programs that take dev teams from AI-assisted coding to production agentic systems.
Multi-Agent Workflows
Supervisor + handoff orchestration for portfolios of agents that need to cooperate without arguing.
Agentic AI
Designing, building, and shipping production agents.
Model Context Protocol (MCP)
The open protocol that gives agents tools.
Multi-Agent Systems
Orchestrating many agents without losing the plot.
AI Engineering
The discipline of shipping AI systems, not demos.
More implementation proof
Multi-agent research synthesis — open PoC for swarm vs supervisor
An open-source experiment comparing orchestration patterns on a real research task.
Enterprise Engagement: PR review pipeline cuts senior-engineer time 4×
Multi-agent CI workflow for a 180-engineer monorepo.