<aside> 🎬
Video prompts aren't image prompts with motion. They're a different language — one built around direction, timing, and camera work. This guide breaks down the fundamentals so you can write prompts that produce intentional, high-quality video output.
</aside>
Image prompts describe a still frame. Video prompts describe a scene in motion. That shift changes everything.
When you write a video prompt, you're not painting a picture — you're giving direction. What moves? How does the camera behave? Where does the viewer's eye land? The best video prompts answer these questions clearly and concisely.
<aside> ✅
Do: Camera: Slow dolly in, shallow depth of field. Action: Condensation drips down the glass. Setting: Gin bottle on a marble countertop.
</aside>
<aside> ❌
Don't: "A beautiful, stunning gin bottle on a nice countertop with warm vibes and gorgeous lighting in a fancy bar setting."
</aside>
Same subject, same scene — but the first prompt tells the model exactly what to do. The second leaves everything up to guesswork.
<aside> 🎯
When possible, keep each generation to 1–2 main motions, for example one camera move and one subject action. The fewer competing instructions, the cleaner and more predictable the output.
</aside>
A camera pan plus a subject walking plus a background element animating will compete for the model's attention. The result? Muddled, unpredictable output.
This isn't a hard limit, but isolating one dominant motion per clip tends to give better results. Let each generation do one thing well.
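As a rough illustration of the guideline above, the sketch below counts the motions planned for a clip and flags the clip when there are more than two. The function name `check_motion_budget`, the `MAX_MOTIONS` constant, and the idea of listing motions as plain strings are all assumptions made for this example; they are not part of any video model's API.

```python
# Assumption: the threshold mirrors the "1-2" guideline in this guide.
# It is a soft recommendation, not a limit enforced by any model.
MAX_MOTIONS = 2


def check_motion_budget(motions):
    """Return a warning string if a clip lists too many motions, else None."""
    if len(motions) > MAX_MOTIONS:
        return (
            f"{len(motions)} motions in one clip; "
            "consider splitting them into separate generations."
        )
    return None


# The busy example from the text: three motions competing in one clip.
busy = ["camera pan", "subject walking", "background element animating"]

# A focused clip: one camera move and one subject action.
focused = ["slow dolly in", "condensation drips down the glass"]

print(check_motion_budget(busy))
print(check_motion_budget(focused))
```

A check like this is only a reminder to split a busy shot into several generations; deciding which motion is the dominant one still takes judgment.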
Every strong video prompt is built from layered components. Think of them as building blocks — each one adds clarity without cluttering the others.
| Component | What it does | Example |
|---|---|---|
| Direction | Sets the scene in one clear sentence | "A bottle of gin rotates slowly on a marble countertop, warm afternoon light." |
| Camera | Defines angle, movement, and lens | "Slow 360° orbit, eye-level, 85mm lens." |
| Action | Describes one dominant subject motion | "Condensation drips down the glass surface." |
| Setting | Location and environment | "Dimly lit cocktail bar with warm wood tones." |
| Styling | Theme and overall feel | "Warm golden tones, soft bokeh, gentle lens flare." |
| Effects | Adds lighting, color grade, atmosphere | "Golden hour lighting, soft bokeh background." |
| Models (optional) | Describes human subjects (on-screen talent) in the scene | "African American female with braids, Swedish male with large build." |
| Audio (optional) | Layers sound design or ambient tone | "Subtle ice clink, ambient bar murmur." |
Effective pairings:
<aside> 💡
Sectioning prompts is key. Writing each component on its own line helps the model read each instruction as a separate piece rather than as a single wall of text.
</aside>
Video Direction + Camera + Action + Effects + Audio (Optional)
Video Direction: [Short description of what you visualize]
Camera: [Capture settings — angle, movement, lens]
Action: [Subject behavior and motion]
Effects: [Lighting, color grade, atmosphere]
Audio (Optional): [Sound design, music, ambient]
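If you assemble prompts in code, the template above could be filled in with something like the following minimal sketch. The `VideoPrompt` class and its `render` method are hypothetical names introduced for this example, and the field values are taken from the component table earlier in this guide. The sketch writes each component on its own line and leaves out the audio line when no audio is given.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoPrompt:
    """Hypothetical container for the sectioned prompt template."""

    direction: str
    camera: str
    action: str
    effects: str
    audio: Optional[str] = None  # optional component

    def render(self) -> str:
        """Return the prompt with one labeled component per line."""
        lines = [
            f"Video Direction: {self.direction}",
            f"Camera: {self.camera}",
            f"Action: {self.action}",
            f"Effects: {self.effects}",
        ]
        if self.audio:
            lines.append(f"Audio: {self.audio}")
        return "\n".join(lines)


prompt = VideoPrompt(
    direction="A bottle of gin rotates slowly on a marble countertop, warm afternoon light.",
    camera="Slow 360° orbit, eye-level, 85mm lens.",
    action="Condensation drips down the glass surface.",
    effects="Golden hour lighting, soft bokeh background.",
    audio="Subtle ice clink, ambient bar murmur.",
)

rendered = prompt.render()
print(rendered)
```

Keeping the components as separate fields makes it easy to swap one line, such as the camera move, while the rest of the prompt stays fixed between generations.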