What is Codex Record and Replay?

A macOS feature in Codex app 26.616 that watches you perform a workflow, then packages the demonstration into a reusable Computer Use skill you can replay in new threads with different inputs such as expense amounts or ticket titles.

Do I need Computer Use enabled for Record and Replay?

Yes. Computer Use must be enabled. Base Computer Use expanded to more regions on June 16, 2026, but Record and Replay initially excluded the EEA, UK, and Switzerland even where base Computer Use is available.

Can I run recorded skills unattended?

Only after inspection and sandbox replay. Treat generated skills like unsigned executable instructions. Review for hardcoded secrets and URLs, parameterize inputs, test in non-production accounts, and add monitoring before scheduling runs.

How is Record and Replay different from Cursor /automate?

Record and Replay captures macOS UI demonstrations into Codex Computer Use skills. Cursor /automate configures cloud automations from plain language with GitHub triggers. Both need guardrails; with Record and Replay, reviewing the generated skill file is an explicit step before replay.

What should I check in a generated skill file?

Hardcoded credentials, unintended keystrokes from the demo, non-parameterized values, destructive UI paths, and URLs pointing to non-production systems. Replay once in sandbox before promoting to scheduled runs.
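A first pass over these checks can be partly mechanical. The sketch below assumes the generated skill is readable as plain text; the regular expressions, the `Finding` type, and `scan_skill` are hypothetical names for illustration, not part of any Codex tooling, and a clean scan does not replace reading the file.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for illustration; tune them to your own hosts and secrets.
PATTERNS = {
    "url": re.compile(r"https?://[^\s<>\"')]+"),
    "credential": re.compile(
        r"(?i)\b(password|passwd|secret|token|api[_-]?key)\b\s*[:=]\s*\S+"
    ),
    "destructive": re.compile(r"(?i)\b(delete|remove|purge|drop|wipe)\b"),
}


@dataclass(frozen=True)
class Finding:
    line_no: int  # 1-based line number in the skill text
    kind: str     # which pattern matched
    text: str     # the matched fragment, for the reviewer to judge


def scan_skill(text: str) -> list[Finding]:
    """Flag lines in a skill file that deserve a human look before replay."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append(Finding(line_no, kind, match.group(0)))
    return findings


if __name__ == "__main__":
    sample = (
        "Open https://finance.example.com/expenses/new\n"
        "Type password: hunter2 into the login field\n"
        "Enter amount 42.50\n"
        "Click Delete draft\n"
    )
    for finding in scan_skill(sample):
        print(f"line {finding.line_no}: {finding.kind}: {finding.text}")
```

A scan like this only narrows where to look. Unintended keystrokes and values that should have been inputs still need a person reading the skill end to end.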

Does Record and Replay replace hand-written MCP tools?

No. UI automation skills complement typed MCP tools for systems without APIs. Prefer MCP tools for production integrations where schemas, auth, and audit logs matter.

Codex Record & Replay: Computer Use Skills Guide

Introduction

The gap in most Computer Use rollouts is not "can the model click the right button." It is "can we encode the workflow without a week of prompt engineering." Record and Replay is OpenAI's answer: show once, replay with variables. I recorded an expense filing flow and a Jira ticket creation flow the week 26.616 shipped. The throughput win is real. So is the supply-chain risk if you trust the generated skill without reading it.

This post sits next to agent supply chain security and governing agent autonomy. Record and Replay creates skills automatically. Skills are executable instructions. That is the same trust boundary as loading an unsigned SKILL.md from the internet.

What Record and Replay does (release overview)

Record a macOS workflow while Codex observes screen and input events.
Package the demo into a Computer Use skill reusable across threads.
Replay with different inputs (amounts, ticket titles, report dates).
Thread handoff between local and remote Codex hosts in the same release.
Bulk actions on automation run history for ops at scale.
Requires Computer Use enabled; EEA/UK/CH excluded initially for Record and Replay.

Record and Replay vs hand-written skills

When to record vs when to write skills by hand

Scenario	Record and Replay	Hand-written skill
Repeating admin UI workflow with stable layout	Strong fit	Overkill unless compliance demands review
Production deploy or infra changes	Do not record blindly	Typed tools and CI scripts instead
Workflow with sensitive credentials on screen	Never record raw	Redact and use vault-injected env vars
Cross-app orchestration with branching logic	Record baseline, then edit skill	Plan branches explicitly in SKILL.md
Regulated audit trail required	Record plus mandatory human review gate	Signed internal skill registry

The inspection checklist before unattended replay

01. Read the generated skill file end to end
Look for hardcoded URLs, account names, and accidental keystrokes you did not intend to teach. Recording captures what you did, including mistakes you corrected mid-demo.
02. Replay once in a sandbox account
Never first-run against prod finance or HR systems. I use the same sandbox discipline as Agentjacking triage splits: read-only or fake data first.
03. Parameterize inputs explicitly
Expense amount, vendor name, ticket priority should be skill inputs, not buried in prose the model might mis-parse.
04. Pair with scheduled monitoring, not blind cron
OpenAI's scheduled monitoring tasks fit condition-driven reruns. Combine "replay skill" with "alert if UI changed" before daily unattended execution.
05. Log every Computer Use run
Screenshots and video from automation history are debugging aids, not audit logs. Export structured events to your observability stack.
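Steps 03 and 05 can be sketched together: validate the inputs before a replay, then emit one structured record per run. This is a minimal sketch under stated assumptions. `SkillInputs`, `parse_inputs`, `run_event`, the allowed priority values, and the event fields are all hypothetical; they do not describe the format Codex uses for skill inputs or run history.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from decimal import Decimal, InvalidOperation

ALLOWED_PRIORITIES = {"low", "medium", "high"}  # assumed values for illustration


@dataclass(frozen=True)
class SkillInputs:
    amount: Decimal
    vendor: str
    priority: str


def parse_inputs(raw: dict) -> SkillInputs:
    """Reject malformed inputs before any UI action is replayed."""
    try:
        amount = Decimal(str(raw["amount"]))
    except (KeyError, InvalidOperation) as exc:
        raise ValueError("amount must be a decimal number") from exc
    # is_finite() rules out NaN and Infinity, which Decimal would otherwise accept.
    if not amount.is_finite() or amount <= 0:
        raise ValueError("amount must be a positive, finite number")
    vendor = str(raw.get("vendor", "")).strip()
    if not vendor:
        raise ValueError("vendor is required")
    priority = str(raw.get("priority", "")).strip().lower()
    if priority not in ALLOWED_PRIORITIES:
        raise ValueError(f"priority must be one of {sorted(ALLOWED_PRIORITIES)}")
    return SkillInputs(amount, vendor, priority)


def run_event(skill: str, inputs: SkillInputs, outcome: str) -> str:
    """Build one JSON line describing a run, ready to ship to a log pipeline."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "skill": skill,
        # Decimal is not JSON-serializable, so the amount is stored as a string.
        "inputs": {**asdict(inputs), "amount": str(inputs.amount)},
        "outcome": outcome,
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    inputs = parse_inputs({"amount": "42.50", "vendor": "Acme", "priority": "High"})
    print(run_event("file-expense", inputs, "ok"))
```

Keeping the inputs in a typed structure means a bad value fails before the first click, and the same structure feeds the event record so the log shows exactly what was replayed.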

How Record and Replay fits multi-vendor agent stacks

Many teams run Codex beside Cursor and Claude Code. Record and Replay is Codex-specific, but the skill trust model is not. Apply the same provenance rules I use for NVIDIA Verified Agent Skills: no unsigned skills in production paths, internal registry for anything unattended.

If the recorded workflow touches MCP connectors, remember MCP EMA governs who reaches the connector, not what the skill does with it after login.

Common mistakes with recorded Computer Use skills

Scheduling replay daily without detecting UI layout changes.
Recording workflows that flash secrets or customer PII on screen.
Assuming regional Computer Use availability matches Record and Replay availability.
Skipping skill file review because "OpenAI generated it."
No rollback when replay clicks the wrong destructive button.
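The first and last mistakes share a mitigation: refuse to replay when the controls the skill depends on are not on screen. The sketch below assumes you can obtain the visible control labels from somewhere (an accessibility dump, for example); how you get them is out of scope here. `missing_controls` and `replay_if_stable` are hypothetical helpers, not a Codex API.

```python
from typing import Callable, Iterable


def _normalize(label: str) -> str:
    """Collapse whitespace and case so cosmetic changes do not count as drift."""
    return " ".join(label.split()).lower()


def missing_controls(required: Iterable[str], visible: Iterable[str]) -> set[str]:
    """Return the required labels that are not among the visible ones."""
    seen = {_normalize(v) for v in visible}
    return {r for r in required if _normalize(r) not in seen}


def replay_if_stable(
    required: Iterable[str],
    visible: Iterable[str],
    replay: Callable[[], None],
) -> set[str]:
    """Run `replay` only when every required control is present.

    Returns the set of missing labels; an empty set means the replay ran.
    """
    missing = missing_controls(required, visible)
    if not missing:
        replay()
    return missing


if __name__ == "__main__":
    required = ["Amount", "Vendor", "Submit"]
    visible = ["amount", "Vendor", "Send", "Cancel"]  # "Submit" was renamed
    missing = replay_if_stable(required, visible, lambda: print("replaying"))
    if missing:
        print("UI changed, skipping replay. Missing:", sorted(missing))
```

A label check is coarse and will not catch a button that kept its name but changed its behavior, so it belongs in front of a sandbox replay and a rollback plan, not in place of them.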

Conclusion

Record and Replay is the fastest path I have seen from demo to repeatable Computer Use skill. It is also a new supply-chain surface. Record in sandbox, inspect the skill like code review, replay with parameters, then schedule with monitoring. Skip inspection and you did not automate the workflow. You automated whatever the model remembered from one noisy demo.
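That sequence can be enforced as a gate instead of a habit. The following is a minimal sketch, assuming you track review state yourself; `SkillReview`, `blockers`, and `can_schedule` are hypothetical names, and nothing here reflects how Codex stores or schedules skills.

```python
from dataclasses import dataclass


@dataclass
class SkillReview:
    inspected: bool = False
    sandbox_replay_passed: bool = False
    inputs_parameterized: bool = False
    monitoring_configured: bool = False


# Each field is paired with the message shown when that step is still open.
_CHECKS = [
    ("inspected", "skill file has not been reviewed"),
    ("sandbox_replay_passed", "no successful sandbox replay"),
    ("inputs_parameterized", "inputs are not explicit parameters"),
    ("monitoring_configured", "no monitoring attached to the schedule"),
]


def blockers(review: SkillReview) -> list[str]:
    """List every unmet step, in the order the steps should be done."""
    return [message for field, message in _CHECKS if not getattr(review, field)]


def can_schedule(review: SkillReview) -> bool:
    """A skill may run unattended only when nothing is blocking it."""
    return not blockers(review)


if __name__ == "__main__":
    review = SkillReview(inspected=True, sandbox_replay_passed=True)
    print(can_schedule(review))
    for message in blockers(review):
        print("blocked:", message)
```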

Sources: OpenAI Codex Record and Replay documentation at https://developers.openai.com/codex/record-and-replay; Codex app 26.616 release notes.
