Remotion + Claude Code: Generate Product Demo Videos from a Prompt (2026)

How to scaffold a Remotion project, build a composition from a brief, add captions, and render — using VideoKit's 17 commands and 5 skills.

You can turn a one-line brief into a rendered product demo video with Claude Code and Remotion in four moves: scaffold a Remotion project, generate a composition from the brief, add captions from an SRT or transcript, and render locally or on Lambda. Because the video is React, every variant is a parameter and every render is reproducible — which makes this one of the few creative workflows where Claude Code adds compounding value instead of one-off help. VideoKit (17 commands, 5 skills, 3 read-only agents, 12,602 tokens) covers the whole pipeline. This guide walks every step.

Remotion is also the least-covered framework in the Claude Code skill ecosystem — no major free repo goes beyond Remotion's own rule files. That gap is partly why we built VideoKit, but the four-step workflow below works whether you use our kit or wire your own skills.

Why does Remotion fit Claude Code better than traditional video editors?

Traditional video editing is a timeline you drag around by hand. That is a terrible interface for an AI agent, which cannot see your timeline and cannot express "move the caption 200ms earlier" as code it can verify. Remotion flips the model: a video is a React component tree, with <Composition>, <Sequence>, and <Series> defining what appears when.

That means:

  1. The video is code. Claude can write it, read it, diff it, and refactor it like any component.
  2. Every render is reproducible. Same props, same output, every time — no timeline drift.
  3. Every variant is a parameter. Change a prop, get a different video. This is what makes batch and data-driven video feasible.
  4. The QA surface is clear. You can assert frame counts, prop values, and composition dimensions — things you can test.

So "make me a demo video" becomes "write me a <Composition>," which is squarely in Claude Code's wheelhouse. The four steps below map directly onto that.

| Traditional Editor | Remotion + Claude Code |
| --- | --- |
| Timeline drag-and-drop | Component tree in TypeScript |
| One-off export | Reproducible render from props |
| Manual variant creation | Batch over a CSV/JSON matrix |
| No automated QA | Assert frame counts, dimensions, timing |
| Hard to diff/version | Full git diff on every change |
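
The "assert frame counts, dimensions, timing" claim can be made concrete with a small check. This is a minimal sketch, not VideoKit's QA agent: `CompositionMeta`, `Expectation`, and `checkComposition` are hypothetical names, and the values mirror a 30-second, 30 fps, 1920x1080 demo.

```typescript
// Hypothetical shape mirroring the metadata a Remotion <Composition> declares.
interface CompositionMeta {
  id: string;
  durationInFrames: number;
  fps: number;
  width: number;
  height: number;
}

// What the brief asked for, expressed in seconds and pixels.
interface Expectation {
  seconds: number;
  fps: number;
  width: number;
  height: number;
}

// Returns a list of human-readable problems; an empty list means the check passed.
function checkComposition(meta: CompositionMeta, want: Expectation): string[] {
  const problems: string[] = [];
  const expectedFrames = Math.round(want.seconds * want.fps);
  if (meta.fps !== want.fps) {
    problems.push(`fps is ${meta.fps}, expected ${want.fps}`);
  }
  if (meta.durationInFrames !== expectedFrames) {
    problems.push(`durationInFrames is ${meta.durationInFrames}, expected ${expectedFrames}`);
  }
  if (meta.width !== want.width || meta.height !== want.height) {
    problems.push(`size is ${meta.width}x${meta.height}, expected ${want.width}x${want.height}`);
  }
  return problems;
}

const demo: CompositionMeta = {
  id: "ProductDemo",
  durationInFrames: 900,
  fps: 30,
  width: 1920,
  height: 1080,
};
console.log(checkComposition(demo, { seconds: 30, fps: 30, width: 1920, height: 1080 }));
```

Because the check works on plain data, it can run in a unit test or a pre-render script without rendering a single frame.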

What is in VideoKit and what does it cost?

VideoKit ships with 17 commands, 5 skills, 3 read-only specialist agents, and a measured token budget of 12,602 tokens — the smallest footprint of the five kits. The flagship command is /video clone: give it a reference video and it recreates the style as a Remotion composition, then verifies the match. Other daily commands:

  • /video make — full composition from a written brief
  • /video demo — product-demo-specific scaffold with screen recording support
  • /video caption — captions from audio or an existing SRT
  • /video data — data-viz compositions (bar races, counters, chart animations)
  • /video social — aspect-ratio cutdowns for every platform (9:16, 1:1, 4:5)

The three read-only agents are director (parses brief, sets format/duration/tone), QA (renders sample frames, scores against storyboard and brand), and brand-keeper (enforces colors, fonts, motion rules on every component). Agents in v2 are read-only specialists — they audit, verify, and report findings. They do not block a pipeline waiting for approval; the command ends with a rendered output or a concrete error, not a reviewer handshake.

Pricing for a single kit is $14.99/month or $99 one-time. The one-time price buys a perpetual license for the version shipped at purchase and does not include future updates. The All-Access plan at $49.99/month covers all five kits. All plans come with a 14-day refund window and 3-device coverage.

What is the content-provenance rule and why does it matter?

Before any asset enters a VideoKit project, provenance matters. The kit only handles user-owned, royalty-free, or original Remotion compositions. It never processes copyrighted source video. The video-ingest skill validates provenance before touching a file: it checks license metadata, asks you to attest ownership or paste a royalty-free license URL, and logs that attestation into the project manifest. If provenance cannot be established for an imported clip, image, font, or audio track, the skill halts and returns a provenance error instead of rendering.
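
If you wire your own pipeline, the gate described above might look something like the sketch below. Every name in it (`Asset`, `ManifestEntry`, `ProvenanceError`, `admitAsset`) is hypothetical; it illustrates the attest-log-or-halt pattern and is not the video-ingest skill's actual code. It also skips the license-metadata inspection step, which depends on the file format.

```typescript
// All names here are hypothetical; this is not VideoKit's video-ingest skill.
type AssetKind = "clip" | "image" | "font" | "audio";

interface Asset {
  path: string;
  kind: AssetKind;
  licenseUrl?: string;         // royalty-free license URL, if any
  ownershipAttested?: boolean; // the user confirmed they own the asset
}

interface ManifestEntry {
  path: string;
  kind: AssetKind;
  basis: "owned" | "royalty-free";
  evidence: string;
  loggedAt: string; // ISO timestamp
}

class ProvenanceError extends Error {
  constructor(asset: Asset) {
    super(`Provenance not established for ${asset.kind} "${asset.path}"`);
    this.name = "ProvenanceError";
  }
}

// Logs an attestation for the asset, or throws so nothing downstream renders it.
function admitAsset(asset: Asset, manifest: ManifestEntry[]): ManifestEntry {
  const loggedAt = new Date().toISOString();
  let entry: ManifestEntry;
  if (asset.ownershipAttested) {
    entry = { path: asset.path, kind: asset.kind, basis: "owned", evidence: "user attestation", loggedAt };
  } else if (asset.licenseUrl && /^https?:\/\//.test(asset.licenseUrl)) {
    entry = { path: asset.path, kind: asset.kind, basis: "royalty-free", evidence: asset.licenseUrl, loggedAt };
  } else {
    throw new ProvenanceError(asset);
  }
  manifest.push(entry);
  return entry;
}

const manifest: ManifestEntry[] = [];
admitAsset({ path: "assets/voiceover.wav", kind: "audio", ownershipAttested: true }, manifest);
console.log(manifest.length); // one attestation logged
```

Throwing instead of returning a flag means a missing attestation cannot be ignored by accident: the caller has to handle the error before any render step runs.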

Build your own pipeline the same way. Generate or own everything you render. Refuse anything you cannot license. This matters more now that AI-generated content is showing up in commercial products at scale — a provenance log is cheap insurance.

Step 1: How do you scaffold a Remotion project with Claude Code?

The first move is standing up a Remotion project: Root.tsx, registerRoot, tsconfig, and the config file. In VideoKit this is handled by /video make or /video demo, which call the video-scaffold skill (roughly 1,200 tokens) as the first step. The director agent parses your brief and sets format, duration, and tone before any code is written.

# What the scaffold produces (Remotion's standard shape)
src/
├── Root.tsx          # registers compositions
├── index.ts          # registerRoot(Root)
└── compositions/
    └── Demo.tsx      # your first composition
remotion.config.ts    # concurrency, codec, image format
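
For orientation, a hand-written remotion.config.ts covering the three settings named in the tree might look like the sketch below. The values are placeholders, not what VideoKit generates, and the method names are those of Remotion v4's `@remotion/cli/config` module, so check them against the version you have installed.

```typescript
// remotion.config.ts: a sketch with placeholder values, not VideoKit's generated file.
import { Config } from "@remotion/cli/config";

Config.setVideoImageFormat("jpeg"); // per-frame image format used while rendering
Config.setCodec("h264");            // default output codec
Config.setConcurrency(4);           // placeholder; tune to the cores you can spare
Config.setOverwriteOutput(true);    // replace an existing output file on re-render
```

These settings apply to CLI renders; flags passed on the command line take precedence over the config file.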

A brand pass runs alongside: a BRAND.md and MOTION.md are written from your brief (colors, fonts, logo, motion rules), and from that point on the brand-keeper agent enforces them on every component the command generates. This is the difference between "an AI made a video" and "an AI made a video that looks like your product."

Install VideoKit with:

ck auth <your-key>
ck install video
# Token ledger prints: 12,602 tokens across 17 commands + 5 skills

Or via the plugin marketplace: /plugin marketplace add Madni-Aghadi/claudekit-video.

Step 2: How does Claude Code build a Remotion composition from a brief?

Once the project is scaffolded, the brief becomes a real <Composition>. The internal flow is: director agent parses the brief → scriptwriter drafts a script that hits the target runtime to the second → storyboard turns the script into frame-by-frame beats → composer builds the actual .tsx.

A registered Remotion composition is just a React component plus its metadata:

// Root.tsx: registers the composition the compose step produces from the storyboard
import { Composition } from "remotion";
import { DemoVideo } from "./DemoVideo";
 
export const RemotionRoot = () => (
  <Composition
    id="ProductDemo"
    component={DemoVideo}
    durationInFrames={900}   // 30s at 30fps
    fps={30}
    width={1920}
    height={1080}
    defaultProps={{ headline: "Ship faster", accent: "#8b5cf6" }}
  />
);

For motion, spring physics and easing are tuned so entrances feel natural rather than robotic. The QA agent then renders sample frames at key beats — title card, main demo, CTA — and scores them against the storyboard and brand spec. If the score misses on a frame, it loops with the composer (max 3 iterations). Commands end with a verified, rendered output — not a reviewer waiting for your approval.
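
The score-and-retry loop can be sketched in a few lines. This is an illustration under stated assumptions, not VideoKit's implementation: `Draft`, `QaResult`, and `refineUntilPass` are hypothetical names, the score is assumed to be a number compared against a threshold, and "max 3 iterations" is read as at most three compose attempts.

```typescript
// Hypothetical types: `Draft` stands in for a generated composition.
interface Draft {
  revision: number;
  source: string;
}

interface QaResult {
  draft: Draft;
  score: number;
  iterations: number;
  passed: boolean;
}

// Compose, score, and retry until the score clears the threshold or attempts run out.
function refineUntilPass(
  compose: (previous: Draft | null, lastScore: number | null) => Draft,
  score: (draft: Draft) => number,
  threshold: number,
  maxIterations = 3,
): QaResult {
  if (maxIterations < 1) {
    throw new RangeError("maxIterations must be at least 1");
  }
  let draft = compose(null, null);
  let current = score(draft);
  let iterations = 1;
  while (current < threshold && iterations < maxIterations) {
    draft = compose(draft, current); // feed the last draft and score back in
    current = score(draft);
    iterations++;
  }
  // Always returns a result: a pass, or a concrete failure after the last attempt.
  return { draft, score: current, iterations, passed: current >= threshold };
}

// Toy composer and scorer: each revision scores 0.4 higher than the last.
const result = refineUntilPass(
  (previous) => ({ revision: (previous ? previous.revision : 0) + 1, source: "<Composition />" }),
  (draft) => draft.revision * 0.4,
  0.75,
);
console.log(result.iterations, result.passed); // prints: 2 true
```

The hard cap is the important design choice: the loop either converges or reports a failure, so a command never stalls waiting for a score that will not improve.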

The /video clone command takes a slightly different path: it analyzes the reference video's visual rhythm, color palette, font choices, and transition timing, then writes a Remotion composition that matches the style. The QA agent overlays frames from both to score the match. We built this because the most common brief is "make it look like this."

Step 3: How do you add captions to a Remotion video?

If your demo has narration, captions come next. VideoKit's /video caption command handles two paths, both provenance-gated first:

  1. From audio: The video-whisper skill produces a word-level SRT and JSON from your owned or royalty-free audio file.
  2. From an existing SRT: The video-srt skill parses and normalizes it into timed caption data.

Then video-caption-style burns styled, positioned, animated captions into the composition. An optional word-level karaoke mode highlights each word as it is spoken — synced from Whisper timestamps — for the short-form look that performs on social.

# Caption pipeline (conceptual)
# 1. Provenance check
video-ingest audio.wav           # owned/royalty-free? attestation logged
 
# 2. Transcribe
video-whisper audio.wav -> caps.srt   # word-level SRT + JSON
 
# 3. Style and burn
video-caption-style caps.srt -> <Captions/>   # brand timing from MOTION.md
 
# 4. Export sidecar
video-srt-export caps.vtt        # accessibility sidecar
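
To show what "timed caption data" can mean in practice, here is a minimal sketch that converts SRT text into frame-based cues a composition could consume. `CaptionCue`, `srtTimeToSeconds`, and `parseSrt` are hypothetical names, not the video-srt skill's API, and the parser handles only well-formed cue blocks (index line, timing line, text).

```typescript
interface CaptionCue {
  index: number;
  startFrame: number;
  endFrame: number;
  text: string;
}

// "00:00:01,500" -> 1.5 (seconds). Accepts "," or "." before the milliseconds.
function srtTimeToSeconds(stamp: string): number {
  const match = /^(\d{2,}):(\d{2}):(\d{2})[,.](\d{3})$/.exec(stamp.trim());
  if (!match) {
    throw new Error(`Bad SRT timestamp: ${stamp}`);
  }
  const [, h, m, s, ms] = match;
  return Number(h) * 3600 + Number(m) * 60 + Number(s) + Number(ms) / 1000;
}

// Splits on blank lines, then maps each cue's timestamps to frame numbers.
function parseSrt(srt: string, fps: number): CaptionCue[] {
  return srt
    .replace(/\r\n/g, "\n")
    .trim()
    .split(/\n{2,}/)
    .map((block) => {
      const lines = block.split("\n");
      if (lines.length < 3) {
        throw new Error(`Malformed SRT cue: ${block}`);
      }
      const [start, end] = lines[1].split("-->");
      return {
        index: Number(lines[0]),
        startFrame: Math.round(srtTimeToSeconds(start) * fps),
        endFrame: Math.round(srtTimeToSeconds(end) * fps),
        text: lines.slice(2).join("\n"),
      };
    });
}

const sample = [
  "1",
  "00:00:01,500 --> 00:00:04,000",
  "Ship faster",
  "",
  "2",
  "00:00:04,000 --> 00:00:06,200",
  "with one command",
].join("\n");

console.log(parseSrt(sample, 30));
```

Converting to frames up front keeps the composition free of time arithmetic: each cue can be shown while the current frame lies between `startFrame` and `endFrame`.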

The QA agent runs after caption styling, scoring legibility, sync accuracy, and safe-zone placement. Captions that bleed outside the 80% safe zone on a 9:16 cut fail the check automatically.
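
A safe-zone check of this kind reduces to rectangle containment. The sketch below assumes the "80% safe zone" is a centered rectangle covering 80% of each dimension (a 10% margin on every side); the source does not define it, so treat that geometry, and the names `Box`, `safeZone`, and `isInside`, as assumptions.

```typescript
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Centered rectangle covering `fraction` of each dimension (an assumed definition).
function safeZone(frameWidth: number, frameHeight: number, fraction = 0.8): Box {
  const x = Math.round((frameWidth * (1 - fraction)) / 2);
  const y = Math.round((frameHeight * (1 - fraction)) / 2);
  return { x, y, width: frameWidth - 2 * x, height: frameHeight - 2 * y };
}

// True when `inner` lies entirely within `outer`, edges included.
function isInside(inner: Box, outer: Box): boolean {
  return (
    inner.x >= outer.x &&
    inner.y >= outer.y &&
    inner.x + inner.width <= outer.x + outer.width &&
    inner.y + inner.height <= outer.y + outer.height
  );
}

// A 9:16 cut at 1080x1920 gives a zone of 864x1536 starting at (108, 192).
const zone = safeZone(1080, 1920);
const caption: Box = { x: 140, y: 1500, width: 800, height: 160 };
console.log(isInside(caption, zone)); // prints: true
```

Moving the same caption down to `y: 1650` pushes its bottom edge to 1810, past the zone's lower bound of 1728, so the check would fail.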

Step 4: How do you render a Remotion video with Claude Code?

The final step is the render. VideoKit's /video render (or the render phase embedded in /video make) picks the target, but only after a pre-render gate validates timing and checks sample frames.

Three render modes are available:

  1. Local: npx remotion render with your chosen codec, CRF, and frame range. Good for a single video; no cloud setup needed.
  2. Lambda: deploys to @remotion/lambda for cloud fan-out. Better for long compositions or when you need fast turnaround.
  3. Batch: fans one composition over a CSV or JSON matrix, rendering one variant per row. This is how you get 50 personalized videos from one template.

# Single local render
npx remotion render ProductDemo out/demo.mp4 --codec=h264
 
# Or batch over a dataset (one row -> one rendered variant)
# VideoKit sets this up via /video data with a matrix prop
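
If you are building the batch step yourself, one approach is to turn each data row into its own `npx remotion render` invocation, passing the row as JSON through the CLI's `--props` flag. The sketch below only builds the argument lists; `Row`, `RenderJob`, and `buildRenderJobs` are hypothetical names, the output naming scheme is an assumption, and this is not how /video data is implemented.

```typescript
type Row = Record<string, string | number>;

interface RenderJob {
  outFile: string;
  args: string[]; // arguments to pass to `npx`
}

// One job per row: same composition, different props, different output file.
function buildRenderJobs(compositionId: string, rows: Row[], outDir = "out"): RenderJob[] {
  return rows.map((row, i) => {
    const suffix = ("000" + (i + 1)).slice(-3); // 001, 002, ...
    const outFile = `${outDir}/${compositionId}-${suffix}.mp4`;
    return {
      outFile,
      args: [
        "remotion",
        "render",
        compositionId,
        outFile,
        "--codec=h264",
        `--props=${JSON.stringify(row)}`,
      ],
    };
  });
}

const jobs = buildRenderJobs("ProductDemo", [
  { headline: "Ship faster", accent: "#8b5cf6" },
  { headline: "Ship safer", accent: "#10b981" },
]);
for (const job of jobs) {
  console.log("npx", job.args.join(" "));
}
```

Keeping the arguments as an array rather than a single string lets you hand them to a process spawner without shell quoting problems around the JSON. For large matrices, writing each row to a JSON file and passing its path to `--props` avoids command-line length limits.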

Then platform specs apply per-target codecs and bitrates, and thumbnail frames are emitted as poster images.

A real 30-second launch explainer — 16:9 master plus a vertical cutdown — runs roughly 25 minutes wall-clock and about 46k tokens for the full build, per our measured estimates. Every VideoKit skill publishes its token cost at install time. The methodology behind those figures is in our token-cost dataset.

| Render Mode | Best For | Approx. Wall-Clock |
| --- | --- | --- |
| Local | Single video, iteration | 5-15 min |
| Lambda | Long videos, fast turnaround | 2-8 min |
| Batch (Lambda) | 10-200+ variants from one template | 10-30 min |

What can you build beyond a single demo video?

The four-step flow pays for itself once the composition exists, because a Remotion composition is parameterized by design.

Repurpose to every platform. The /video social command runs a responsive layout pass, then generates 9:16, 1:1, and 16:9 variants in parallel. Per-platform CTAs are added automatically, and the QA agent checks that nothing focal gets cropped outside the safe zone. One composition, every platform, one command.
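
The dimension side of a cutdown is simple arithmetic. As a rough sketch (not the /video social implementation), the hypothetical helper `sizeForRatio` below derives pixel dimensions from an aspect ratio string, assuming a 1080 px short side and rounding to even numbers, which H.264 output with the usual 4:2:0 pixel format generally requires.

```typescript
interface Size {
  width: number;
  height: number;
}

// "9:16" with a 1080 px short side -> 1080x1920.
function sizeForRatio(ratio: string, shortSide = 1080): Size {
  const parts = ratio.split(":").map(Number);
  if (parts.length !== 2 || parts.some((n) => !isFinite(n) || n <= 0)) {
    throw new Error(`Bad aspect ratio: ${ratio}`);
  }
  const [w, h] = parts;
  const scale = shortSide / Math.min(w, h);
  const even = (n: number) => Math.round(n / 2) * 2; // nearest even integer
  return { width: even(w * scale), height: even(h * scale) };
}

for (const ratio of ["16:9", "9:16", "1:1", "4:5"]) {
  const { width, height } = sizeForRatio(ratio);
  console.log(`${ratio} -> ${width}x${height}`);
}
```

Each resulting size can be registered as its own composition or passed as width and height overrides at render time; the layout pass still has to decide what moves where inside the new frame.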

Fan out over a dataset. Point the batch renderer at a CSV and each row renders its own variant: a 50-customer year-in-review from one 20-second template, or a 200-variant product demo personalized by industry. The creative and QA work happens once; each additional render is a prop change. We documented a 200-render data-driven campaign in detail in "data-driven video: 200 Remotion renders".

Data visualization compositions. The /video data command builds animated charts — bar races, counter stats, line progressions — from a data file. The numbers become the animation, which keeps the composition honest and the renders reproducible. This is useful for quarterly wrap-up videos, benchmark comparisons, and product metric announcements.
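
"The numbers become the animation" means the displayed value is a pure function of the current frame and the data. A minimal sketch of a counter stat, using the hypothetical helper `counterAtFrame` and plain linear progress (an assumption; the kit may well use easing):

```typescript
// Linear count-up from 0 to `target` over `durationInFrames`, clamped at both ends.
function counterAtFrame(frame: number, durationInFrames: number, target: number): number {
  const progress = Math.min(Math.max(frame / durationInFrames, 0), 1);
  return Math.round(target * progress);
}

// A 2-second count-up to 1,200 at 30 fps, sampled at a few frames.
for (const frame of [0, 15, 30, 60, 90]) {
  console.log(frame, counterAtFrame(frame, 60, 1200));
}
```

Inside a Remotion component the frame would come from `useCurrentFrame()`, and Remotion's own `interpolate` helper with clamped extrapolation covers the same ground. Since the output depends only on the frame and the data, re-rendering with the same file yields the same video.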

This is the compounding benefit of "every variant is a parameter": the creative investment scales into a fleet. A timeline editor cannot do this at any price.

How does VideoKit compare to building the workflow from scratch?

The honest answer is that the raw Remotion + Claude Code workflow works fine without a kit. Remotion ships its own rule files, and Claude Code can scaffold a project from first principles. The tradeoff is time and consistency.

VideoKit's 5 skills encode the decisions that take time to get right: spring physics presets that feel natural, caption safe-zone rules per aspect ratio, Lambda concurrency caps that keep costs predictable, and a provenance-check step that runs before every asset ingestion. You could write all of that yourself. Most people don't, or they write it once and it drifts.

The token cost comparison is also real. We measured the free alternatives in "real cost of free skills": skills without a knowledge boundary load context indiscriminately and routinely spend 3-5x the tokens on equivalent tasks. VideoKit's 12,602-token budget is the measured install size, so you know the footprint before you commit.

| Approach | Setup Time | Token Predictability | Provenance Policy | Batch Support |
| --- | --- | --- | --- | --- |
| Raw Remotion + Claude | 2-4 hours | Unpredictable | DIY | DIY |
| Free skill repos | 30-60 min | Variable | None | Rare |
| VideoKit | 5 min (ck install video) | Measured (12,602 tokens) | Built-in | Yes (/video data) |

FAQ

Can Claude Code generate a Remotion video from scratch?

Yes. Because Remotion videos are React components, Claude Code can scaffold the project, write the <Composition> from a brief, add captions from an SRT or transcript, and run the render — locally, on Lambda, or as a batch over a dataset. Every render is reproducible and every variant is a prop change, which makes video a better fit for an agent workflow than most creative tasks.

How do you add captions to a Remotion composition with Claude Code?

Two paths: transcribe owned or royalty-free audio to a word-level SRT using Whisper, or import an existing SRT or VTT file. A styling step then burns positioned, animated captions into the composition — with optional word-level karaoke synced to transcript timestamps — and exports a sidecar subtitle file for accessibility. Provenance on the source audio should be validated before transcription.

Does Claude Code handle copyrighted video?

A well-built pipeline refuses to. VideoKit only handles user-owned, royalty-free, or original Remotion compositions. Every ingestion step validates provenance, logs an ownership attestation or royalty-free license URL, and halts with a provenance error if it cannot establish ownership. If you build your own pipeline, adopt the same rule — generate or own everything you render.

How long does it take to generate a product demo video?

For a 30-second explainer with a vertical cutdown, our measured estimate is roughly 25 minutes wall-clock and about 46k tokens for the full build, including QA iterations. Rendering time on top of that depends on length, resolution, and whether you render locally or fan out on Lambda. Batch renders amortize the build cost across every variant.

What is /video clone and when should I use it?

/video clone is VideoKit's flagship command. You give it a reference video and it recreates the visual style as a Remotion composition — color palette, font choices, motion timing, transition style. The QA agent overlays frames from both videos to score the match. Use it when the brief is "make it look like this" rather than describing the style from scratch.

How does VideoKit's batch rendering work?

The /video data command takes a data file (CSV or JSON) where each row defines one rendered variant. The composition's props are bound to the data columns, and the batch renderer fans out across every row, producing one output file per row. Lambda concurrency is capped automatically to keep costs predictable. We tested this at 200 renders in "data-driven video: 200 Remotion renders".


If you want to skip the setup work and start with a production-ready Remotion pipeline — provenance checking, brand enforcement, QA scoring, and batch rendering already wired — VideoKit is the fastest path. It installs in under five minutes (ck install video), prints its token budget on install, and ships with the /video clone flagship for when you need to match an existing style rather than describe one from scratch.