Topic Pillar

Agentic AI. Systems that decide, act, and recover — built to ship.

Agentic AI is the next layer above generative AI: systems that pick tools, take actions, and recover from failure without a human in the loop on every step. This hub gathers the architecture patterns, cost-control techniques, and production lessons that make agents work outside the demo.

35 cluster pages · 16 posts · 3 notes · 14 updates · 2 events

What "agentic" actually means

An agent is not a chatbot with extra prompts. It is a system that picks a tool, takes an action against the real world, observes the result, and decides what to do next — with exit conditions, retries, and a budget. The difference between a prototype and a production agent is almost entirely in the boring parts: scope, observability, evaluation, and tool design.

When agents are the right tool

Use an agent when the work is multi-step, each step depends on earlier outputs, and at least one step needs reasoning that scripts cannot encode. Skip agents for deterministic ETL, single-call classification, and any pipeline that runs the same five steps every time — those are scripts, and scripts are cheaper and more reliable.

The patterns that actually work in production

Pre-agentic data fetching, supervisor-vs-handoff orchestration, descriptive tool names, "when to use" descriptions on every tool, exit conditions on every loop, prompt caching as a first-class metric, evaluation datasets that go beyond the happy path, observability per step. These are the patterns we drill in training and ship in consulting.
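
Two of those patterns — descriptive tool names and "when to use" descriptions — are cheap to show. Here is what a registry entry might look like in the JSON-schema shape most tool-calling APIs accept (field names follow the common convention; the tool itself and its fields are hypothetical):

```python
# A tool-registry entry in the JSON-schema shape most tool-calling APIs use.
# The description is a prompt: it tells the model WHEN to call the tool,
# not just what it does. The name and fields are illustrative.
search_orders_tool = {
    "name": "search_orders_by_customer_email",  # precise — not "search" or "query_db"
    "description": (
        "Look up a customer's recent orders. "
        "Use when the user asks about order status, delivery, or refunds "
        "and has provided an email address. "
        "Do NOT use for product catalog questions — use search_products instead."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "email": {"type": "string", "description": "Customer email, exact match"},
            "limit": {"type": "integer", "description": "Max orders to return, default 5"},
        },
        "required": ["email"],
    },
}
```

The name encodes the argument and the domain, the description names both the trigger and the anti-trigger, and the schema documents every field — the three edits that most often fix wrong tool selection without touching the agent at all.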

16 blog posts

Deep dives on Agentic AI

Tool Design

Tool descriptions are prompts. Fix the registry, not the agent.

When an agent picks the wrong tool, the registry is broken — not the agent. Three rules I now apply before debugging anything in a multi-tool system: precise names, "when to use" triggers, and a curated load list. Anthropic's new tool-selection telemetry finally puts numbers on what changes accuracy.

May 13, 2026 · 5 min
Read the post
Production

The cheapest LLM call is the one you do not make — GitHub's 19-62% token cut, decoded

GitHub published an instrumented analysis of their agentic CI workflows and reported 19-62% token-cost reductions. The savings are the headline. The technique — pre-agentic data fetching and tool-registry hygiene — is the story most teams will miss.

May 11, 2026 · 5 min
Read the post
Architecture

Claude Opus 4.7's 1M context: when to RAG and when to just stuff it

A reliable million-token context window is real now, but it does not retire RAG — it changes the calculus. Cost, latency, recency, and the prompt-cache angle nobody is talking about.

May 6, 2026 · 6 min
Read the post
MCP

MCP 1.0 is here. What changes for the servers you already wrote

The protocol stabilised. Most working servers will keep working. Three places the new spec actually requires changes — auth profile, server registry, streaming-response semantics — with diffs from a real migration.

May 1, 2026 · 5 min
Read the post
Multi-Agent

Why I am replacing supervisor patterns with handoffs

Supervisors looked clean on paper and shipped slow in production. Handoffs read messier in the code but recover better when an agent loses the plot. Two real systems and where supervisors still earn their keep.

Apr 26, 2026 · 6 min
Read the post
Production

Prompt caching is not optional anymore — measuring a 47% cost drop

A walkthrough from a client engagement: identifying stable prefixes, restructuring the system prompt for cacheability, and the telemetry that proved caching was actually working.

Apr 19, 2026 · 4 min
Read the post
Tool Design

Tool descriptions are prompts. Stop treating them like docstrings

A docstring tells a developer what a function does. A tool description tells a model when to call it. Different audience, different writing. Six concrete edits that lifted tool-call accuracy.

Apr 8, 2026 · 4 min
Read the post
Production

The agent observability stack we ship to every client

Traces, spans, evals, cost-per-completed-task, and the one dashboard panel that catches 80% of regressions. Vendor-agnostic — covers Langfuse, Honeycomb, and rolling your own.

Mar 28, 2026 · 7 min
Read the post
Architecture

Three patterns I broke in 2025 — and what I do instead now

Self-correction loops without budgets, single-agent solutions to multi-domain problems, and using JSON mode to force structure I should have built into the schema. An honest review.

Mar 14, 2026 · 8 min
Read the post
Multi-Agent

Haiku 4.5 made our router 5x cheaper. The trade-off matters

Replacing Sonnet with Haiku in the dispatcher role cut our orchestration cost dramatically. It also cost us in two specific places I did not predict.

Feb 22, 2026 · 5 min
Read the post
MCP

Why every team's first MCP server should be "list-files"

Smallest useful server. Hardest one to mess up. Teaches the protocol without domain logic getting in the way. The 60-line server we hand to teams during training.

Feb 4, 2026 · 3 min
Read the post
Production

Eval datasets: stop testing your agents on the happy path

If your eval set is the demos you showed the client, you are testing the wrong thing. How we build evals from production failures and the minimum viable suite to ship.

Jan 19, 2026 · 6 min
Read the post
Prompt Engineering

I was wrong about JSON mode. Here is what changed my mind

For two years I told teams to avoid forced JSON outputs and use structured tool calls. That was right then and partially wrong now — schema enforcement got better, latency penalties got smaller.

Dec 12, 2025 · 4 min
Read the post
Architecture

Why your agent keeps failing after 3 steps

The exit condition problem nobody talks about. Most agents are built for the happy path — where every tool call succeeds and the task completes cleanly. Real production agents are different.

Nov 8, 2025 · 4 min
Read the post
Tool Design

The one rule for designing agent tools that actually work

One tool, one purpose. Every tool that does two things will fail you on the third call. I have watched this pattern fail in every team I have trained — and the fix is the same refactor.

Oct 17, 2025 · 3 min
Read the post
Architecture

RAG vs CAG: how to actually decide

A decision framework from real implementations. RAG retrieves at query time; CAG loads knowledge into the cached prompt up front. Knowing which to use — and when to combine both — determines whether your agent finds the right answer at the right cost.

Sep 21, 2025 · 5 min
Read the post
14 ship-news updates

Latest in Agentic AI

Claude

Anthropic ships tool-use telemetry — every selection is scored and logged at the model boundary

May 13, 2026 · via Anthropic
Tools

Claude Code adds parallel sub-agent execution — multi-file refactors land in a single turn

May 13, 2026 · via Anthropic
MCP

MCP remote-server registry crosses 500 listed servers — a curated production-ready tier emerges

May 12, 2026 · via modelcontextprotocol.io
Architecture

GitHub cuts agentic CI workflow costs 19-62% by pruning tools and moving data-fetch outside the LLM loop

May 11, 2026 · via GitHub Engineering Blog
Claude

Claude Opus 4.7 ships with 1M-token context window in production

May 7, 2026 · via Anthropic
Tools

Claude Code adds project memory — persistent context that survives across CLI sessions

May 5, 2026 · via Anthropic
MCP

MCP 1.0 ratified — official SDKs in Python, TypeScript, Go, Rust, Java, .NET

May 2, 2026 · via modelcontextprotocol.io
Architecture

Anthropic publishes "Effective Tool Design" — official guidance for production agents

Apr 28, 2026 · via Anthropic
Claude

Sonnet 4.6 update: cheaper tokens, sharper tool calls, fewer retry loops

Apr 24, 2026 · via Anthropic
Enterprise solutions

How Agentic AI ships in our engagements

The pages below are the buyer-focused, conversion-grade versions of this topic — deliverables, methodology, ROI, security considerations, and CTAs to scope a real engagement.

Solution

Agentic AI Consulting

Designed, built, and handed off — production agentic systems for enterprise teams.

Explore the Agentic AI Consulting solution
Solution

MCP Integration

Custom Model Context Protocol servers that turn your systems into agent tools.

Explore the MCP Integration solution
Solution

AI Guardrails

Multi-layer safety, policy, and audit controls for agents in regulated environments.

Explore the AI Guardrails solution
Solution

AI Systems Engineering Training

Eight-day corporate training programs that take dev teams from AI-assisted coding to production agentic systems.

Explore the AI Systems Engineering Training solution
Solution

Enterprise AI Architecture

Reference architectures for organisations standing up an AI platform — not one agent, but the foundation for many.

Explore the Enterprise AI Architecture solution
Solution

AI Observability

Tracing, eval, cache-hit telemetry, and cost attribution for production agents.

Explore the AI Observability solution
Solution

Multi-Agent Workflows

Supervisor + handoff orchestration for portfolios of agents that need to cooperate without arguing.

Explore the Multi-Agent Workflows solution
Solution

AI Automation for Enterprises

Operational agents that replace manual workflows — triage, support, ERP integration, content pipelines.

Explore the AI Automation for Enterprises solution
Frequently asked

Agentic AI — the questions teams actually ask