Is Cursor Auto-review a security feature?

Cursor describes it as a convenience and throughput feature with a classifier that can make mistakes in both directions. Use it to reduce approval fatigue and steer low-risk work. Do not treat it as a compliance-grade security boundary. Pair it with allowlists for high-risk environments, secret scanning at commit time, and least-privilege MCP scopes.

What is the difference between /review and Bugbot on GitHub?

/review runs Bugbot and Security Review locally before you push. If you then open a PR with the same diff, Bugbot on GitHub or GitLab recognizes the patch ID, skips a duplicate scan, and notes the diff was already reviewed. The goal is one review path, not two redundant ones.

How do I enable Auto-review in headless SDK runs?

Set local.autoReview on Agent.create() or per send() in the TypeScript or Python Cursor SDK. Steer the classifier with permissions.json using autoRun.allow_instructions for call shapes to lean toward allowing and autoRun.block_instructions for patterns that should always pause, such as deletes or credential file paths.

Does pre-push review replace CI security scanning?

No. It front-loads review so agent-generated code is checked before it enters the PR pipeline. CI still matters for integration tests, dependency checks, and environments the local agent cannot see. Think of /review as shifting left, not eliminating CI.

What should I put in block_instructions first?

Destructive filesystem operations, production database writes, credential and MCP config paths, and any tool call that moves money or sends external communications without a human-named approver. Allow read-only inspections of build artifacts and test output by default.

Does Bugbot replace human code review?

No. It accelerates defect discovery on diffs, especially agent-generated ones. Merge authority, product judgment, and architecture calls stay human. /review shifts review left; it does not eliminate it.

When will Cursor CLI get Auto-review parity?

Cursor forum threads in June 2026 note CLI gaps: unrestricted mode auto-approves everything and allowlist mode is too interactive for CI. SDK local.autoReview is the headless path today; watch Cursor release notes for CLI config parity.

Governing Agent Autonomy: Auto-Review & Pre-Push

Introduction

Background agents stopped being a demo sometime in late 2025. By June 2026 they are how a lot of teams ship refactors, tests, and review comments. The bottleneck moved from "can the agent write code" to "can we let it act without approving every shell command." Cursor's answer in June is two-fold: Auto-review as a contextual autonomy dial, and pre-push /review so code quality gates happen before CI.

I map both onto the guardrail box in the anatomy of an AI agent and the broader agentic AI production patterns I teach. Classifier gates sit where actions leave the agent. Bugbot sits where code leaves the laptop. Neither replaces typed tools or an owner who answers when the agent misbehaves at 3 a.m.

What shipped in June 2026 (release overview)

Auto-review is the default run mode for new Cursor users; existing users enable it in Settings > Agents.
Shell, MCP, and Fetch tool calls route through allowlist, sandbox, or classifier in that order.
Cursor SDK (TypeScript and Python) exposes local.autoReview and permissions.json steering for headless runs.
Bugbot on Composer 2.5: ~90 second average review (down from ~5 minutes), ~10% more bugs found, ~22% lower cost.
/review, /review-bugbot, and /review-security run Bugbot locally before push; patch IDs sync with GitHub/GitLab.

Auto-review: autonomy as a dial, not a switch

Cursor's Auto-review blog post frames the problem correctly. Agents should move freely when stakes are low and slow down when an action crosses a meaningful boundary. Auto-review applies to Shell, MCP, and Fetch tool calls. Allowlisted calls run immediately. Sandboxable calls run sandboxed. Everything else goes to a classifier subagent that can allow, suggest a safer path, or surface a standard approval prompt.

Cursor reports that roughly 7% of chats in Auto-review mode hit at least one interruption, versus about 40% of actions blocked under some enterprise allowlist setups. The two figures count different things (chats versus individual actions), so read the comparison as directional, but that is the throughput win. The caveat, which Cursor states in its forum and docs, is that the classifier is non-deterministic: it can miss in both directions. I use Auto-review to reduce prompt fatigue on trusted read paths. I do not use it as the only guardrail before an execute-mode agent touches prod.

How the three-step flow works

Allowlist match: run immediately (good for known read-only inspections).
Sandbox: run with filesystem and network restrictions where the platform supports it.
Classifier: everything else gets contextual review with feedback routed back to the parent agent.
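The ordering above can be sketched as a small routing function. This is a toy model, not Cursor's implementation: the `ToolCall` type, the `ALLOWLIST` entries, and the rule that only shell calls are sandboxable are all assumptions made up for illustration.

```python
import shlex
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    RUN_NOW = "allowlist match: run immediately"
    SANDBOX = "run with filesystem and network restrictions"
    CLASSIFIER = "contextual review"


@dataclass(frozen=True)
class ToolCall:
    kind: str     # "shell", "mcp", or "fetch"
    command: str  # raw call text, e.g. "git status"


# Hypothetical allowlist of argv prefixes for read-only inspections.
ALLOWLIST = (("git", "status"), ("git", "diff"), ("rg",))
# Characters that let a shell line chain, redirect, or substitute commands.
UNSAFE_CHARS = set(";|&<>`$")


def is_allowlisted(call: ToolCall) -> bool:
    if call.kind != "shell" or UNSAFE_CHARS & set(call.command):
        return False
    try:
        argv = tuple(shlex.split(call.command))
    except ValueError:  # unbalanced quotes and similar
        return False
    return any(argv[: len(prefix)] == prefix for prefix in ALLOWLIST)


def can_sandbox(call: ToolCall) -> bool:
    # Assumption for this sketch: only shell calls can be sandboxed.
    return call.kind == "shell"


def route(call: ToolCall) -> Route:
    """Apply the three checks in order and return the first that matches."""
    if is_allowlisted(call):
        return Route.RUN_NOW
    if can_sandbox(call):
        return Route.SANDBOX
    return Route.CLASSIFIER
```

The point of the sketch is the order: a chained command such as `git status && rm -rf build` fails the allowlist check and falls through to the sandbox, and an MCP call that matches neither step ends up at the classifier.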

The SDK mirror matters for teams running headless agents. The June SDK release exposes local.autoReview and permissions.json steering via autoRun.allow_instructions and autoRun.block_instructions. That is the same pattern I want in CI: not unrestricted yolo mode, not interactive allowlist hell, but a written policy the classifier can apply. If those headless runs authenticate through a Claude subscription seat, read the June 15 Agent SDK billing checklist before you turn autoReview on in cron. Billing and guardrails changed in the same week.

{
  "autoRun": {
    "allow_instructions": [
      "Read-only inspections of ./dist and test output are fine.",
      "git status, git diff, and ripgrep searches may run without prompt."
    ],
    "block_instructions": [
      "Always pause rm, drop, truncate, and any write under ~/.claude or mcp.json.",
      "Pause outbound curl to non-allowlisted domains."
    ]
  }
}
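Because a policy file like the one above ends up governing unattended runs, it is worth linting before a job starts. The sketch below assumes only the two keys shown above, `autoRun.allow_instructions` and `autoRun.block_instructions`. The `load_policy` helper and its refusal of an empty block list are my own convention, not something the Cursor SDK is known to enforce.

```python
import json


def load_policy(text: str) -> dict[str, list[str]]:
    """Parse permissions.json text and check the autoRun instruction lists."""
    data = json.loads(text)
    auto_run = data.get("autoRun") if isinstance(data, dict) else None
    if not isinstance(auto_run, dict):
        raise ValueError("missing 'autoRun' object")

    policy: dict[str, list[str]] = {}
    for key in ("allow_instructions", "block_instructions"):
        value = auto_run.get(key, [])
        is_clean = isinstance(value, list) and all(
            isinstance(item, str) and item.strip() for item in value
        )
        if not is_clean:
            raise ValueError(f"'{key}' must be a list of non-empty strings")
        policy[key] = value

    # Local convention (an assumption): never run headless without block rules.
    if not policy["block_instructions"]:
        raise ValueError("refusing a policy with no block_instructions")
    return policy
```

Running this as the first step of a CI job fails fast on a policy that was emptied or mistyped, rather than letting it silently loosen the gate.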

How Auto-review differs from a global allowlist

Auto-review vs strict allowlist vs unrestricted

Mode	Throughput	Risk profile	Best for
Unrestricted headless	Highest	Highest; no gate on MCP/fetch	Throwaway sandboxes only
Strict allowlist	Low; frequent prompts	Low if list maintained	Regulated laptops with creds
Auto-review classifier	Medium-high; ~7% chats interrupted	Non-deterministic; convenience not compliance	Daily background agent work
Pre-push /review only	N/A at runtime	Catches code issues pre-CI	Pair with any run mode

Pre-push /review: close the loop before CI

The second half of the June tooling wave is about when review happens. Cursor's Bugbot June update cut average review time from about five minutes to about ninety seconds on Composer 2.5, with more bugs found per run at lower cost. The workflow change is running /review (or /review-bugbot and /review-security) before you open a PR.

If you run /review locally and then push the same diff, Bugbot on GitHub or GitLab recognizes the patch, skips a redundant scan, and leaves a comment noting it already reviewed that diff. That closes a gap I see constantly: the agent wrote the code, nobody reviewed it until CI, and CI only sees what the agent chose to show. Moving review to pre-push makes agent-generated diffs go through the same gate human diffs should.

The workflow I recommend for agent-written PRs

01. Agent completes the task in Cursor
Let Auto-review handle mixed-risk tool calls during the session. Do not disable the gate to "go faster" on credentialed machines.
02. Run /review before git push
Fix Bugbot and Security Review findings in the same session while context is hot.
03. Push and open PR
Bugbot on the host recognizes the patch ID and skips duplicate work.
04. Human owns merge decision
Review gates catch defects; they do not replace ownership. Someone still signs the merge.

Where this sits in the four-part agent anatomy

Auto-review and pre-push review are guardrail-layer tools. They do not replace memory, tools, or the loop. They sit at the boundary where actions leave the agent and touch your systems. I map them onto the anatomy of an AI agent: memory holds context, tools execute, the loop decides, guardrails vet. Classifier gates are guardrails. Bugbot is a guardrail on code quality and security findings before merge.

They also do not fix bad tools. A classifier that approves a call to a tool returning null on failure still loses. The tool contract work in "your agents are not broken, your tools are" comes first. Governance on top of sloppy tools just governs sloppy outcomes faster.

Agent autonomy patterns in 2026: what each is good for

Pattern	Best for	Not a substitute for
Global allowlist	Strict internal tooling with human in loop	Production execute-mode agents at scale
Auto-review classifier	Long local agent runs with mixed-risk tool calls	Hard security boundary or compliance sign-off
Pre-push /review	Catching agent-generated diffs before CI	Runtime guardrails on live tool execution
Sandbox + API key isolation	Headless CI agents with bounded blast radius	Tool schema design and eval suites

How to adopt Auto-review and /review step by step

01. Turn on Auto-review for local agent work
Settings > Agents for IDE users. For SDK scripts, set local.autoReview and write block_instructions for destructive shell patterns (rm, drop, truncate, credential paths).
02. Add /review to the agent workflow before push
Treat it like lint for agent sessions. If Bugbot flags a security issue, fix it before the PR, not in review-comment tennis.
03. Keep secret scanning at the commit boundary
Auto-review gates tool calls; it does not scan what the agent writes into the repo. The supply-chain playbook in "your agent's supply chain is the attack surface" still applies, and MCP-scoped least privilege is the same story I told in "MCP goes stateless": shrink what the agent can reach, then gate what it can do.
04. Measure interruptions, not vibes
Track how often the classifier pauses the high-risk calls you care about versus how often it pauses benign read paths. Tune allow_instructions until read-only work flows and deletes always pause.
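The measurement in the last step reduces to two rates. The sketch below assumes you have a log of gated calls where a human has labeled each one as high-risk or benign after the fact. The `GateEvent` record and its field names are hypothetical, not a Cursor log format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateEvent:
    high_risk: bool  # label assigned by a human reviewer afterwards
    paused: bool     # True if the gate interrupted the call


def gate_rates(events: list[GateEvent]) -> dict[str, float]:
    """Return the share of risky calls that slipped through and of benign calls that were paused."""
    risky = [e for e in events if e.high_risk]
    benign = [e for e in events if not e.high_risk]
    missed = sum(1 for e in risky if not e.paused)
    noisy = sum(1 for e in benign if e.paused)
    return {
        "miss_rate": missed / len(risky) if risky else 0.0,
        "false_pause_rate": noisy / len(benign) if benign else 0.0,
    }
```

A tuning pass is successful when `false_pause_rate` falls without `miss_rate` rising. If loosening allow_instructions pushes `miss_rate` above zero on deletes, the change should be reverted.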

Should you enable Auto-review on production credential machines?

Enable it for throughput on daily agent work if you also run secret scanning at commit time, pin MCP servers, and keep block_instructions aggressive on credential paths. Do not treat the classifier as a compliance control. In high-trust environments I still pair Auto-review with explicit allowlists for production database tools and supply-chain hygiene on extensions and skills.
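As a rough picture of what "secret scanning at commit time" checks, here is a minimal scan over the added lines of a diff. It is only a sketch with two well-known patterns (AWS access key IDs and PEM private key headers). A dedicated scanner covers far more cases and should be what actually runs in the hook.

```python
import re

# Two illustrative patterns only; real scanners ship hundreds.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
}


def scan_added_lines(diff_text: str) -> list[tuple[int, str]]:
    """Return (line number within the diff, pattern name) for each hit on an added line."""
    findings: list[tuple[int, str]] = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only lines the commit adds; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings
```

Feeding it the output of `git diff --cached` from a pre-commit hook and exiting non-zero on any finding keeps the check at the commit boundary, independent of what the classifier allowed during the session.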

Common mistakes

Calling Auto-review "zero trust" because marketing language sounded reassuring.
Skipping tool registry hygiene because the classifier "usually catches" bad calls.
Running unrestricted headless SDK agents in CI because CLI parity for Auto-review was still catching up.
Reviewing only in GitHub after the agent already pushed broken security patterns to a branch.
No eval set for agent outputs, so you cannot tell if review gates improved completion rate or just slowed the loop (evals beyond happy path).

What comes after June 2026 in agent governance

Cursor has been explicit that Auto-review is early and focused on local desktop agents today, with the same ideas expected to spread to more surfaces. Watch for CLI parity, tighter MCP-scoped classifiers, and enterprise policy packs you can check into repo. The direction is clear: agents get more autonomy by default, but autonomy becomes policy-as-code rather than a single global toggle.

Best practices for agent autonomy in production

Write block_instructions before allow_instructions so destructive patterns always pause.
Run /review on every agent-generated diff before push, not only on "big" changes.
Keep MCP scopes least-privilege; Auto-review does not fix over-scoped servers.
Log classifier blocks to your observability stack (agent observability stack).
Name an owner for agent workflows who tunes policy when false positives spike.
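For the logging practice above, one structured line per block is usually enough to chart false positives later. The sketch below uses only the standard library. The field names and the `agent.gate` logger name are assumptions, not a format that Cursor emits.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("agent.gate")  # hypothetical logger name


def log_gate_block(tool: str, command: str, reason: str, session_id: str) -> str:
    """Emit one JSON line describing a blocked tool call and return it."""
    record = {
        "event": "gate_block",
        "ts": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "command": command[:200],  # truncate so long payloads do not flood the log
        "reason": reason,
        "session_id": session_id,
    }
    line = json.dumps(record, sort_keys=True)
    logger.warning(line)
    return line
```

Keeping the record flat and JSON-encoded means whichever log pipeline you already run can index `tool` and `reason` without a custom parser, which is what the policy owner needs when false positives spike.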

Conclusion

June 2026 agent tooling finally treats autonomy as contextual. That is the right abstraction. Auto-review and pre-push review are worth adopting this week if you run background agents daily. They are not worth treating as the whole governance story. Pair them with typed tools, observable runs, and an owner who answers when the agent misbehaves at 3 a.m.

