Workshop BuildWorkshop / Public Build May 20261 day · 8 hours hands-on

The Agentic Operating System — workshop build

A live multi-agent ops shell, designed and built with 40 engineers in one room.

AIMED · public workshop · ~40 engineers

Multi-Agent SystemsAI Workflow OrchestrationMCP IntegrationsAI Engineering Architecture
40
engineers shipped a running multi-agent shell on their own laptops
3
MCP servers per attendee, written from scratch
8 hrs
concept to working artefact
Business problem

What the team was actually solving

Most teams meeting agentic AI for the first time get stuck on one of three blockers: tool design, orchestration choice, and the gap between a working demo and a system that survives Monday morning. The AIMED workshop format compresses the answers into one day of hands-on building.

Existing workflow

Where the old process broke

  • 1Engineers had used Claude and Copilot but never built an agent loop end-to-end
  • 2Multi-agent orchestration is widely-discussed but rarely walked through in code
  • 3Workshops that just demo finished agents leave attendees without the muscle memory to ship their own
  • 4A real artefact that survives the day is the difference between a workshop and a memorable workshop
Proposed solution

The AI / technical solution we shipped

A day-long live build of "the Agentic Operating System" — a multi-agent shell with a supervisor (planning, decomposition), handoff agents (parallel reads, sequenced writes), shared tool registry via MCP, and observability wired in from line one. Every attendee leaves with a running shell on their own laptop, the source, and the patterns to extend it.

System architecture

How the system is wired

Agentic OS — the shell we built live
User intentnatural languageSupervisorplan · decomposeHandoffsspecialists runIntegratormerge · validateActiontool · response
Agent workflow

The specialist roles

The three specialists
Readerfiles · search · webPlannerdecompositionDoershell · write · commit
Technology

Workshop stack — same primitives every attendee built on

Reasoning modelsClaude Sonnet 4.6 + Haiku 4.5 (free Claude Code tier worked)Tool layerThree MCP servers built from scratch: files · shell · searchOrchestrationPython supervisor + handoff context passingObservabilityLangfuse traces from the first agent callRuntimeEach laptop — Python 3.12 + uv + the cloned workshop repo
Integration

How attendees integrated the shell into their own work

The shell is generic — file reads, shell commands, web search. By the afternoon, attendees were forking it to wire in their own tools: GitHub API tools, Jira tools, internal codebase searchers, and so on.

  • Files MCP: read/write scoped to a workshop sandbox dir
  • Shell MCP: scoped to a small allowlist of commands
  • Search MCP: web + GitHub code search wrapped behind one tool
  • Each attendee added one tool of their own by the end of the day
Security & scalability

Patterns we drilled (production-grade, not workshop-grade)

Scope tools, not models

The shell's tool registry is what we tightened, not the model prompts. Lesson: the model is the cheap part to swap; the tools are the contract.

Audit-first design

Every tool call traces. Every cost attributes. Workshop or not, this is the only way to debug multi-agent.

Cache the prefix

Stable system prompt + tool registry caches across calls. Workshop attendees saw the cache-hit rate climb live as we built.

Fail loudly

Validation failures escalate to the supervisor. Silent fallback is what makes agents flaky.

Methodology

Workshop format

01

Hour 1 — Setup

uv venv, Anthropic SDK, repo cloned. Every laptop calling Claude before coffee.

02

Hour 2–3 — First agent

Single-agent loop with a single tool. Hello-world for tool calling.

03

Hour 4 — MCP

Wrap the file + shell tools as MCP servers. First reusable abstraction.

04

Hour 5–6 — Supervisor + handoff

The shell takes shape. Multi-agent orchestration in code attendees can read.

05

Hour 7 — Observability

Wire Langfuse. Per-step trace becomes the debugger of choice for the rest of the day.

06

Hour 8 — Add-your-own-tool

Attendees fork. Forty different production-shaped extensions in one room.

Observability

Observability from line one

The single most-fed-back lesson from the day was that wiring Langfuse before the first multi-agent call (not after) changed how attendees debugged. Once the per-step trace was visible, problems that took minutes to find took seconds.

  • Per-step trace with the workshop attendee's name as session prefix
  • Cost-per-attendee surface — running cost visible all day
  • Cache-hit rate climbed from 0% at hour 1 to ~72% by hour 6 as the system prompt stabilised
Automation impact

What the attendees walked away with

A running shell on their own laptop. The source on GitHub. Three working MCP servers. A supervisor + handoff pattern they can extend. And — most valuable — the muscle memory of having built this end-to-end.

40
engineers shipped a running multi-agent shell on their own laptops
3
MCP servers per attendee, written from scratch
8 hrs
concept to working artefact
Lessons learned

What we'd tell another team building this

  • 01Workshop attendees retain ten times more from one full build than from ten finished demos. Build it together; do not show it pre-built.
  • 02Langfuse-from-line-one was the highest-leverage decision in the curriculum. Without per-step trace, hour 5 onwards becomes guesswork; with it, debugging is direct.
  • 03Three specialists with narrow scopes beat one big agent loaded with everything — confirmed live by an A/B at hour 6, with the audience watching the eval-score deltas.
  • 04The "add-your-own-tool" hour is what makes the workshop generative. The variety of extensions in one room is what the attendees remember three months later.
What's next

How the workshop pattern extends

The Agentic OS shell is now the curriculum spine of the corporate training programs we deliver. Customers come in with their own integrations to bolt on, and the same 8-hour structure compresses neatly into one full day of an 8-day intensive.

  • Same pattern delivered as Day 5 of the corporate 8-day intensive (IT track)
  • The Agentic OS GitHub repo is the public reference implementation
  • Workshop graduates often return with their own MCP servers and extensions to share

Want this workshop for your team?

The same format runs on-site for engineering teams of 8–14. We customise the integrations to your stack — by the end of the day your team has shipped a working agentic shell in your codebase.