All posts
Production Published 13 min

Agentjacking is real: poisoned Sentry errors can hijack Cursor, Claude Code, and Codex without touching your repo

Tenet Threat Labs injected a fake stack trace through a public Sentry DSN and watched 100+ coding agents execute attacker commands during normal triage. No git write access required. The agent treats the error as ground truth. Here is how I harden observability MCP feeds, scope triage prompts, and block auto-exec on untrusted telemetry.

Jigar JoshiJigar JoshiAgentic AI Architect and Consultant
In this post (8 sections)

Introduction

On June 17, Tenet Threat Labs published Agentjacking: a demonstration that a fake Sentry error report can redirect coding agents into executing attacker shell commands. They tested Cursor, Claude Code, and Codex. More than 100 agents acted on injected errors in their lab setup. Roughly 85% of attempts succeeded. The attack needs no repository write access, no compromised dependency, and no prompt injection in your codebase. It only needs your agent to ingest observability output you told it to trust.

I have been saying since May that the agent supply chain is the attack surface. Agentjacking is the observability branch of that story. When an engineer asks an agent to "investigate this Sentry issue," the stack trace becomes instructions. If the stack trace is attacker-controlled, the investigation becomes execution. This post is the hardening checklist I would run on any engagement where agents read production errors through MCP.

What Agentjacking is (and why it bypasses repo security)

Classic supply-chain attacks compromise code you pull in: a poisoned npm package, a trojanized VS Code extension, a malicious skill file. Agentjacking skips that entirely. The attacker publishes or injects content into a channel your agent already reads: an error reporting endpoint, a log stream, a ticket body, a CI artifact summary. The agent interprets that content as facts about a failure. Embedded instructions in the fake stack trace or error metadata become the next actions.

Tenet's Sentry vector works because many teams expose a DSN in client-side code or public repos. An attacker who knows the DSN can submit crafted error events. When your on-call workflow pipes Sentry issues into a coding agent ("pull the latest critical error and fix it"), the poisoned event arrives looking identical to a real production failure. Approval UX does not help if the agent believes it is remediating an incident.

Agentjacking vs classic agent supply-chain attacks
VectorTouches your git repo?Typical entryWhat the agent trusts
Poisoned npm / extensionOften yesDependency installCode in the tree
Malicious SKILL.mdSometimesSkill loadInstruction file
Agentjacking (observability)NoSentry / logs / ticketsError text as ground truth
Prompt injection in PRYes (content)Diff or commentRepository text

How the Sentry injection attack chain works

The chain Tenet documented is short enough to fit in a standup, which is why it scares me.

  1. 01
    Attacker learns or guesses a public DSN
    Client-side Sentry configs, leaked env files, or public frontend bundles often expose project DSNs. DSNs are not secret keys in the way API keys are, but they are write endpoints for your error stream.
  2. 02
    Attacker submits a crafted error event
    The stack trace and exception message contain shell commands or instructions framed as "remediation steps." Tenet embedded directives that looked like debugging guidance.
  3. 03
    Agent ingests the issue through MCP or API
    Your workflow asks the agent to triage open Sentry issues, reproduce locally, or apply a hotfix. The poisoned issue is indistinguishable from a real one at the text layer.
  4. 04
    Agent executes attacker commands
    With Auto-review, headless mode, or permissive allowlists, shell tools run because the agent believes it is fixing production. Tenet reported high success across Cursor, Claude Code, and Codex in their lab.

Tenet open-sourced agent-jackstop drop-in configs to harden agents against untrusted telemetry. I treat that repo as a starting point, not a substitute for org policy. The configs help; the policy is what survives employee turnover.

Why classifier gates alone do not save you

If you read my Auto-review and pre-push review guide, you know I like classifier gates for throughput. Agentjacking is the counterexample. The requested action looks legitimate in context: curl a diagnostic endpoint, run a cleanup script, fetch a "patch" URL. The classifier sees an agent responding to an incident, not an attacker.

Auto-review reduces prompt fatigue on read-only paths. It is not a guarantee that commands sourced from external error text are safe. Pair it with block_instructions that pause any shell invocation whose arguments came from an unverified observability payload, and with human approval for writes during incident mode.

Surfaces most exposed to Agentjacking

  • MCP servers that wrap Sentry, Datadog, PagerDuty, or Jira issue bodies.
  • Headless triage loops: "every hour, pull critical Sentry issues and propose fixes."
  • Auto-review or auto mode during on-call hours when engineers want speed.
  • Shared DSNs across staging and prod where staging is easier to poison.

The hardening checklist I run after Agentjacking

  1. 01
    Inventory every observability-to-agent path
    List MCP tools, webhooks, and cron jobs where error text enters an agent context. If you cannot draw the path on a whiteboard in two minutes, you do not control it yet.
  2. 02
    Treat observability output as untrusted input
    Same discipline as user-generated content in a RAG corpus. Sanitize, scope, and never pass raw stack traces straight into a shell tool without a human-named approver for execute mode.
  3. 03
    Audit DSN and webhook exposure
    Grep public repos and frontend bundles for Sentry DSNs. Rotate if exposed. Restrict ingest to known environments where the platform allows it.
  4. 04
    Scope MCP read tools narrowly
    An agent that triages production does not need write tools on the same turn. Split read-triage agents from write-fix agents with an explicit handoff and human gate.
  5. 05
    Add block_instructions for incident-sourced commands
    In permissions.json or equivalent, block curl/wget/bash when the parent prompt references external issue IDs unless a human explicitly names the command. Cursor SDK local.autoReview supports this pattern.
  6. 06
    Log and alert on agent shell from triage workflows
    Wire the agent observability stack so any shell invocation during a Sentry-triage session generates a structured event. Review weekly.

Example: split triage from remediation

The pattern I recommend is two agents, not one hero agent. Agent A reads Sentry through MCP and outputs a structured summary: file, line, hypothesis, suggested diff. Agent B only runs after a human approves the summary. Agent A never gets shell write tools.

Triage agent (read-only MCP):
  tools: sentry.list_issues, sentry.get_event, repo.read_file
  output: JSON { hypothesis, proposed_patch, confidence }

Remediation agent (human-gated):
  tools: repo.write, shell.test_only
  input: approved JSON from triage
  rule: no shell args parsed from raw stack trace strings

How Agentjacking fits the broader supply chain

Layer four in my supply chain table was "tool definitions and the data they return." Agentjacking is what happens when you forget that return data is input. A Sentry MCP tool returns text. That text is as executable as a SKILL.md if your agent architecture treats it as instructions.

If you are rolling out MCP Enterprise-Managed Authorization to fix OAuth fatigue, do not stop there. EMA controls who can reach the connector. It does not validate the semantic content of each issue. Content trust is still your problem.

Common mistakes I expect this quarter

  • Assuming repo branch protection saves you. Agentjacking never needed a commit.
  • Giving triage agents the same tool bundle as feature agents because "it is faster."
  • Treating public DSNs as low risk because they are "meant to be client-side."
  • Relying on Auto-review alone during incident response.
  • Skipping red-team exercises on observability feeds because "we only read internal Sentry."

Conclusion

Agentjacking is not theoretical. Tenet demonstrated it across the three coding agents most of my clients run. The fix is not a better model. It is architecture: untrusted input boundaries, split triage and remediation, scoped MCP tools, and explicit blocks on executing commands that originated in error text. Run the checklist once on the Sentry-to-agent path and you will find at least one place where ground truth was assumed. That is the hole.

Sources: Tenet Threat Labs Agentjacking report at https://tenetsecurity.ai/blog/agentjacking-coding-agents-with-fake-sentry-errors/; agent-jackstop configs at https://github.com/tenetsecurity/agent-jackstop.

The weekly take

Agentic AI patterns, delivered Thursdays

What I am shipping, watching, and pruning out of client stacks each week. One email. No fluff.

Shipping an agentic AI project this quarter?
Book a 30-min consult
Frequently asked

Questions readers ask about this post

Share this post
LinkedIn Facebook