Demos written by your AI assistant.
Hand your agent the CaptureBeam skill bundle and an API key. It knows how to read the schema, scan a URL for real ARIA targets, author a YAML, render it, and repair one failed or rough segment without redrafting the whole demo. Works with Claude Code Skills, Anthropic / OpenAI agent loops, and any generic agent runtime that supports tool calls.
In an agent IDE? If your client speaks MCP (Claude Code, Cursor, Codex, Claude Desktop, Windsurf), the MCP server is the lower-friction path — no skill files to download, no raw HTTP. The curl-level loop below is for everything else.
Get the skill bundle
A SKILL.md, a system prompt, four worked examples, and a README. All publicly downloadable.
- SKILL.md: The skill itself. Drop into ~/.claude/skills/capturebeam/SKILL.md for Claude Code, or use as context in any agent runtime.
- system-prompt.md: Drop-in system prompt for OpenAI / Anthropic / generic agent loops. Pass as the system message.
- examples/01-minimal.yaml: Three steps, no styling. The bare-minimum shape.
- examples/02-onboarding.yaml: A typical 7-step welcome flow. Captions, presets, cameraControl.
- examples/03-feature-tour.yaml: A 9:16 vertical clip with hover, cameraControl, and waitBeforeMs. Social-ready.
- examples/04-storybook-component.yaml: A 1:1 square clip exercising a single Storybook component.
- README: How to install the bundle.
mkdir -p ~/.claude/skills/capturebeam/examples
cd ~/.claude/skills/capturebeam
curl -L https://capturebeam.com/agents/SKILL.md -o SKILL.md
curl -L https://capturebeam.com/agents/system-prompt.md -o system-prompt.md
for n in 01-minimal 02-onboarding 03-feature-tour 04-storybook-component; do
  curl -L "https://capturebeam.com/agents/examples/$n.yaml" \
    -o "examples/$n.yaml"
done
After install, ask Claude Code “render an onboarding demo for https://your-app.com” and the skill kicks in automatically.
The agent loop
- Read the schema. Fetch GET /api/v1/schema for the JSON Schema and quick reference. The response includes a docsVersion field; re-fetch SKILL.md when it's higher than the cached version.
- Scan the URL. POST /api/v1/probe with { url }. The response lists every interactive element (role, name, label, placeholder, testId).
- Draft the YAML. Use NL targets like { role: "button", name: "Sign in" }, never CSS selectors. Keep demos short: 5–12 steps, ~30 seconds. Public step types are goto, click, type, scroll, wait, hover, highlight, and cameraControl. Captions are a per-step field.
- Render. POST to /api/v1/renders with raw YAML or a project ID. Poll /api/v1/renders/{id} every 2-3 seconds. The response includes progress ({ current, total, stepType }) for live feedback.
- Self-correct. On failed, re-scan, identify the broken step from the per-step trace, patch only that step or its immediate timing/camera context, then retry. The per-step steps array tells you which step failed, what target was attempted, and any recovery hint.
- Surface the edit URL. Every render response includes editUrl pointing at the dashboard project; give it to the user so they can tweak and re-render.
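The scan, draft, and self-correct steps above can be sketched in Python. This is a minimal sketch, not CaptureBeam's own code: the element dictionaries mirror the fields the scan response is described as listing (role, name, label, placeholder, testId), but the exact response shape, the helper names to_target and best_match, the field preference order, and the use of label or testId as target keys are all assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical helpers. Element dicts are assumed to carry the fields the scan
# response is described as listing: role, name, label, placeholder, testId.
# Using label or testId as a target key is an assumption; check the schema.
TARGET_FIELDS = ("name", "label", "placeholder", "testId")


def to_target(element):
    """Build a target such as {"role": "button", "name": "Sign in"}."""
    target = {"role": element["role"]}
    for field in TARGET_FIELDS:
        value = element.get(field)
        if value:
            target[field] = value
            break  # one identifying field is enough for this sketch
    return target


def best_match(wanted_text, role, elements):
    """Return a target for the same-role element whose text is closest to wanted_text."""
    def score(element):
        text = next((element.get(f) for f in TARGET_FIELDS if element.get(f)), "")
        return SequenceMatcher(None, wanted_text.lower(), text.lower()).ratio()

    candidates = [e for e in elements if e.get("role") == role]
    if not candidates:
        return None  # nothing with that role on the page; the step needs a rethink
    return to_target(max(candidates, key=score))
```

If a step targeting a "Sign in" button fails and the re-scan only shows "Log in" and "Create account" buttons, best_match("Sign in", "button", elements) picks the "Log in" element, which the agent can then patch into the failed step alone.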
Drop-in system prompt
Paste this into your agent's system prompt — or pull the current version directly from /agents/system-prompt.md.
You can produce demo videos using CaptureBeam.
Endpoints (Bearer auth with $CAPTUREBEAM_KEY):
GET https://capturebeam.com/api/v1/schema JSON Schema + quickRef + docsVersion
POST https://capturebeam.com/api/v1/probe { url } -> scan the page; returns interactive elements
POST https://capturebeam.com/api/v1/renders { yaml } or { projectId } -> { id, editUrl, ... }
GET https://capturebeam.com/api/v1/renders/{id} status, progress, steps, videoUrl, editUrl, ...
POST https://capturebeam.com/api/v1/renders/{id}/share { permission, password?, expiresInHours?, cloneable? } -> { shareUrl, ... }
Authoring loop:
1. Read /api/v1/schema once. Cache the docsVersion; re-fetch SKILL.md
when a render response carries a higher docsVersion.
2. Scan the URL the user wants to demo. Use the response to write
valid targets — { role: "button", name: "Sign in" } over CSS.
3. Draft a short YAML (5-12 steps, ~30 seconds). One narrative beat
per step. Add caption blocks where they help.
Step types: goto, click, type, scroll, wait, hover, highlight,
cameraControl. Captions are a per-step field, not type: caption.
Never emit pause, keyPress, dragTo, underline, cameraPan,
assertVisible, or assertHidden.
4. POST the YAML to /api/v1/renders. Poll every 2-3s; surface
progress.current / progress.total to the user while running.
5. On 'failed', inspect steps[] for the first failed item. Use index,
type, error, resolvedSelector, and recovery; re-scan; patch only
that step or its nearby timing/camera context; retry.
Always surface editUrl in your final message so the user can tweak the
demo and re-render from the dashboard. When the user wants a stable public
watch link, call /api/v1/renders/{id}/share after the render succeeds and
surface shareUrl. For password-protected shares, never include the password
in the public link; tell the user to send it separately.
Render config (under `render`):
quality: 1080p (default) | 1440p | 4k (3-5× slower)
aspect: 16:9 (default) | 9:16 (vertical / social) | 1:1 (square)
preset: midnight (default) | iris | mint | sunset | paper
speed: slow (0.5×) | normal (default) | fast (2×) | very-fast (5×)
Scales every typing delay, settle wait, and camera
transition. Cursor speed comes along automatically.
autoZoom: true (default) — auto-zoom click/type/hover/highlight/
cameraControl based on the resolved target rect.
Set false for pixel-stable framing throughout the demo.
Per-step cameraZoom: number 1–10 to override the auto-zoom on that
step. Centered on the step's target if any, else screen center.
cameraZoom always wins, even when autoZoom is false.
Complete worked example
From scratch: one shell script that submits a YAML draft, polls the render, and prints the video URL. Export CAPTUREBEAM_KEY before running it and replace the YAML body with your own. The script needs curl and jq.
#!/usr/bin/env bash
set -euo pipefail
KEY=${CAPTUREBEAM_KEY:?}
BASE=${CAPTUREBEAM_BASE:-https://capturebeam.com}
YAML='title: Sign-up tour
subtitle: New user from scratch
render:
  quality: "1080p"
  aspect: "16:9"
  preset: midnight
  speed: normal
  autoZoom: true
steps:
  - type: goto
    url: https://app.example.com/signup
  - type: wait
    networkIdle: true
    ms: 1500
  - type: type
    target: { role: "textbox", placeholder: "Email" }
    text: "[email protected]"
    caption: { text: "Enter your email" }
  - type: type
    target: { role: "textbox", placeholder: "Password" }
    text: "supersecret"
  - type: highlight
    target: { role: "button", name: "Create account" }
    durationMs: 700
  - type: click
    target: { role: "button", name: "Create account" }
    caption: { text: "And you are in." }'
JOB=$(curl -sS -X POST "$BASE/api/v1/renders" \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -nc --arg y "$YAML" '{yaml: $y}')" | jq -r .id)
echo "Render queued: $JOB"
while :; do
  R=$(curl -sS "$BASE/api/v1/renders/$JOB" -H "Authorization: Bearer $KEY")
  STATUS=$(echo "$R" | jq -r .status)
  case "$STATUS" in
    succeeded) echo "$R" | jq -r .videoUrl; break ;;
    failed) echo "$R" | jq -r .error >&2; exit 1 ;;
    *) sleep 3 ;;
  esac
done
Common patterns
Self-correcting agent
When a render fails on a missing target, the agent should scan the URL, find the closest matching element, and retry. The poll response includes a per-step steps array — index, type, status (ok/skipped/failed), optional error, and the resolved selector/recovery hint that was tried. That's enough for the agent to fix exactly the broken step instead of redrafting the whole YAML. If the video succeeds but one beat is weak, patch only that beat's timing, caption, or camera fields and re-render.
# Pseudo-code for a self-correcting agent.
# author_yaml_from_repo, post_renders, poll, post_probe, best_match, and
# patch_step are placeholders for your own helpers.
def render_with_repair(deployed_url, max_attempts=3):
    yaml = author_yaml_from_repo()
    for attempt in range(max_attempts):
        job = post_renders(yaml)
        result = poll(job)  # blocks until done
        steps = result.steps or []
        if result.status == "succeeded" \
                and not any(s.status == "failed" for s in steps):
            return result.videoUrl
        # Find the first failed step and ask: "what did the runner try?"
        first_fail = next((s for s in steps if s.status == "failed"), None)
        if first_fail is None:
            # The render failed without a per-step trace; nothing to patch.
            raise RuntimeError(result.error)
        print("step", first_fail.index, "failed:", first_fail.error)
        print("selector:", first_fail.resolvedSelector)
        print("recovery:", first_fail.recovery)
        # Scan the page that step ran on, find a closest match by name.
        page = post_probe(deployed_url)
        fixed_target = best_match(first_fail, page.elements)
        # Preserve working steps; patch only the failed segment.
        yaml = patch_step(yaml, first_fail.index, fixed_target)
    raise RuntimeError("render still failing after %d attempts" % max_attempts)
Per-PR demo bot
On every PR that touches a UI route, generate a demo video and comment with the embed. The agent reads the diff, identifies the changed page, and renders a flow that exercises the new code.
Onboarding videos from docs
For every "Getting started" markdown page in your docs, render a 30s walkthrough that demonstrates the steps. Re-render the whole set on a nightly cron — broken videos surface UI changes before users do.
API surface used by agents
- GET /api/v1/schema: JSON Schema + quick reference. No auth required.
- POST /api/v1/probe: discover elements at a URL. Bearer auth.
- POST /api/v1/renders: submit a render. Bearer + active subscription.
- GET /api/v1/renders/{id}: poll a render. Bearer auth.
Full reference + status codes: /docs/api.
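For agent runtimes that call the API from Python rather than curl, the four endpoints above can be wrapped with the standard library alone. This is a rough sketch under stated assumptions: build_request and call are hypothetical helper names, and error handling and status-code checks are left out, so consult /docs/api for the authoritative behaviour.

```python
import json
import urllib.request

BASE = "https://capturebeam.com"  # same default base URL as the shell example


def build_request(method, path, key=None, body=None):
    """Build (but do not send) a request for one of the endpoints listed above."""
    headers = {}
    data = None
    if key is not None:
        # Bearer auth; the schema endpoint is described as not needing it.
        headers["Authorization"] = f"Bearer {key}"
    if body is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(BASE + path, data=data, headers=headers, method=method)


def call(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A scan would then look like call(build_request("POST", "/api/v1/probe", key, {"url": "https://your-app.com"})), and a poll like call(build_request("GET", "/api/v1/renders/" + render_id, key)).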
Limits
- Concurrency: 3 renders in flight per account. The 4th returns 429 — agents should back off and retry.
- Scan: Bearer-authed but not subscription-gated, so an agent can scan a page before the user has subscribed.
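One way an agent might honour the concurrency limit is to retry a submission with a growing delay whenever it sees a 429. The sketch below is an illustration only: post_render is a hypothetical caller-supplied function returning (http_status, parsed_body), and the doubling delay schedule is an assumption, since the limits above only say to back off and retry.

```python
import time


def submit_with_backoff(post_render, yaml_text, max_tries=5, base_delay=2.0, sleep=time.sleep):
    """Retry a render submission while the API answers 429.

    post_render is a hypothetical function that takes the YAML text and
    returns (http_status, parsed_body). The sleep argument is injectable
    so the loop can be exercised without real waiting.
    """
    delay = base_delay
    status, body = None, None
    for attempt in range(max_tries):
        status, body = post_render(yaml_text)
        if status != 429:
            return status, body
        if attempt < max_tries - 1:
            sleep(delay)
            delay *= 2  # assumed schedule: 2s, 4s, 8s, ...
    return status, body  # still 429 after max_tries; let the caller decide
```

Returning the last 429 instead of raising leaves the decision with the caller, which can tell the user that three renders are already in flight.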