Text to Video API
Describe It. Generate It.

Transform natural language descriptions into cinema-quality video clips. One API call. No post-production. Powered by Seedance 2.0.

Start Generating →

How Text-to-Video Generation Works

Text-to-video generation uses a diffusion-based AI model to convert written descriptions into video frames. You provide a text prompt describing the scene, camera angle, lighting, and action. The model interprets your description and synthesizes a coherent video clip with realistic motion, physics, and temporal consistency.

Seedance 2.0 excels at understanding complex prompts with multiple elements: camera movements (dolly, pan, zoom), environmental conditions (rain, fog, golden hour), and subject actions (walking, pouring, rotating). The model produces output at up to 2K resolution with up to 10 seconds of fluid motion per clip.

Through US Video API, you access Seedance 2.0 via a simple REST endpoint. No ML infrastructure, no model hosting, no GPU management. Just HTTP requests and video files.

Code Examples

Python

import os
import time
import requests

# Assumed: the API key is stored in an environment variable (name is illustrative)
API_KEY = os.environ["USVIDEOAPI_KEY"]

# Step 1: Submit generation job
job = requests.post(
    "https://usvideoapi.com/v1/videos",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "prompt": "Aerial drone shot of a coastal highway at sunset, "
                  "waves crashing against cliffs, golden light",
        "resolution": "1080p",
        "duration": 5,
    },
).json()

# Step 2: Poll until the job leaves the "pending" state
while job["status"] == "pending":
    time.sleep(5)
    job = requests.get(
        f"https://usvideoapi.com/v1/videos/{job['id']}",
        headers={"Authorization": f"Bearer {API_KEY}"},
    ).json()

# Step 3: Download video (check job["status"] before assuming success)
print(job["video_url"])  # Direct MP4 link

cURL

curl -X POST https://usvideoapi.com/v1/videos \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Close-up of coffee being poured into a ceramic mug, steam rising, soft morning light",
    "resolution": "720p",
    "duration": 5
  }'

# Response: {"id": "job_a1b2c3", "status": "pending", "price": "$1.25"}

Node.js

const response = await fetch("https://usvideoapi.com/v1/videos", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    prompt: "Timelapse of a flower blooming, macro lens, studio lighting",
    resolution: "720p",
    duration: 5,
  }),
});

const job = await response.json();
console.log(job.id); // "job_x7y8z9"

Prompt Engineering Tips

The quality of your text-to-video output depends heavily on how you write your prompts. Here are practical tips for getting the best results from Seedance 2.0:

Be specific about camera movement

Instead of "a city at night," write "slow dolly forward through a neon-lit Tokyo alley at night, rain-slicked streets reflecting lights." Seedance 2.0 responds well to cinematographic direction: dolly, pan, tilt, crane, tracking shot, static wide, handheld.

Describe lighting explicitly

Lighting makes or breaks a shot. Specify: "golden hour backlight," "overcast diffused light," "dramatic side-lit chiaroscuro," or "warm tungsten interior." The model uses lighting cues to set mood and realism.

Include motion descriptions

Static scenes work, but motion sells. Add action: "steam rises from the mug," "hair blows in the wind," "leaves fall in slow motion." Describe the subject's movement and the camera's movement separately for best results.
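The tips above compose mechanically: one clause each for camera, subject, lighting, and motion. A minimal sketch (the `build_prompt` helper and its field order are illustrative, not part of the API):

```python
def build_prompt(camera: str, subject: str, lighting: str, motion: str) -> str:
    """Join the four prompt components, keeping camera movement and
    subject motion as separate clauses, as recommended above."""
    return ", ".join(part.strip() for part in (camera, subject, lighting, motion))

prompt = build_prompt(
    camera="slow dolly forward",
    subject="a neon-lit Tokyo alley at night",
    lighting="rain-slicked streets reflecting lights",
    motion="steam rising from vents",
)
```

Keeping the components as separate arguments also makes it easy to A/B test one axis (say, lighting) while holding the rest of the prompt constant.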

Use negative prompts sparingly

If you want to avoid certain artifacts, you can include a negative_prompt parameter: "blurry, distorted, watermark." But most of the time, a well-written positive prompt produces clean output without needing negation.
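As a sketch, `negative_prompt` slots into the same request body as the other parameters shown in the code examples above:

```python
# Request body with an optional negative_prompt, mirroring the earlier examples
payload = {
    "prompt": "Close-up of coffee being poured into a ceramic mug, steam rising",
    "negative_prompt": "blurry, distorted, watermark",  # artifacts to steer away from
    "resolution": "720p",
    "duration": 5,
}
```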

Resolution vs. speed tradeoff

For rapid iteration and prompt testing, use 480p ($0.10/sec, under 30s generation). When you have a prompt you are happy with, re-generate at 1080p ($0.50/sec) for production quality. This workflow keeps costs low during development.
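This tradeoff is easy to budget for. A small sketch using the per-second rates quoted on this page (the 720p rate is inferred from the $1.25 five-second example in the cURL response above):

```python
# Per-second rates from this page; 720p inferred from the cURL example's price
RATES = {"480p": 0.10, "720p": 0.25, "1080p": 0.50}

def estimate_cost(resolution: str, duration_sec: int) -> float:
    """Cost of one clip at the given resolution and length, in dollars."""
    return round(RATES[resolution] * duration_sec, 2)

# Iterate cheaply, then render once at production quality:
drafts = 10 * estimate_cost("480p", 5)   # ten 5-second test runs: $5.00
final = estimate_cost("1080p", 5)        # one production render: $2.50
```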

Pricing for Text-to-Video

Text-to-video generation is billed per second of output at your chosen resolution:

480p: $0.10 per second
720p: $0.25 per second
1080p: $0.50 per second

No subscriptions. No minimums. Prepaid balance — add funds and start generating. Volume discounts available for accounts spending $500+/month.

Quality Showcase

Seedance 2.0 is ByteDance's flagship video generation model. It excels at:

Cinematographic camera control: dolly, pan, tilt, zoom, crane, and tracking shots
Environmental and lighting conditions: rain, fog, golden hour
Coherent subject motion with realistic physics and temporal consistency
Output at up to 2K resolution and up to 10 seconds per clip

Visit our homepage demo section to see real API output — unedited, no cherry-picking.


Turn your words into cinematic video

Register, add funds, and generate your first text-to-video clip in under 60 seconds.

Get Your API Key →

Written by Eric J.

UT Austin McCombs MIS alumnus. AI video researcher with a deep appreciation for music, visual art, and the intersection of technology and creative expression.