Transform natural language descriptions into cinema-quality video clips. One API call. No post-production. Powered by Seedance 2.0.
Start Generating →
Text-to-video generation uses a diffusion-based AI model to convert written descriptions into video frames. You provide a text prompt describing the scene, camera angle, lighting, and action. The model interprets your description and synthesizes a coherent video clip with realistic motion, physics, and temporal consistency.
Seedance 2.0 excels at understanding complex prompts with multiple elements: camera movements (dolly, pan, zoom), environmental conditions (rain, fog, golden hour), and subject actions (walking, pouring, rotating). The model produces output at up to 2K resolution with up to 10 seconds of fluid motion per clip.
Through US Video API, you access Seedance 2.0 via a simple REST endpoint. No ML infrastructure, no model hosting, no GPU management. Just HTTP requests and video files.
```python
import time

import requests

API_KEY = "your-api-key"  # from your account dashboard

# Step 1: Submit generation job
job = requests.post(
    "https://usvideoapi.com/v1/videos",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "prompt": "Aerial drone shot of a coastal highway at sunset, "
                  "waves crashing against cliffs, golden light",
        "resolution": "1080p",
        "duration": 5,
    },
).json()

# Step 2: Poll until the job leaves the "pending" state
while job["status"] == "pending":
    time.sleep(5)
    job = requests.get(
        f"https://usvideoapi.com/v1/videos/{job['id']}",
        headers={"Authorization": f"Bearer {API_KEY}"},
    ).json()

# Step 3: Download video
print(job["video_url"])  # Direct MP4 link
```
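The quickstart loop above assumes every job finishes successfully. A slightly more defensive sketch separates polling into a helper with a timeout; note that the `"completed"` and non-`"pending"` failure statuses here are assumptions about the API's terminal states, not documented values, and `fetch_job` is an injected callable so the logic can be exercised without network access:

```python
import time

def wait_for_video(job, fetch_job, interval=5, timeout=300):
    """Poll until the job leaves "pending", or give up after `timeout` seconds.

    fetch_job: callable taking a job id and returning the latest job dict
    (e.g. a GET to /v1/videos/{id}, as in the quickstart).
    """
    deadline = time.monotonic() + timeout
    while job["status"] == "pending":
        if time.monotonic() > deadline:
            raise TimeoutError(f"job {job['id']} still pending after {timeout}s")
        time.sleep(interval)
        job = fetch_job(job["id"])
    if job["status"] != "completed":
        # "completed" is an assumed terminal status; check the API's actual values
        raise RuntimeError(f"job {job['id']} ended with status {job['status']!r}")
    return job["video_url"]
```

Injecting `fetch_job` keeps the retry logic testable offline; in production it would wrap the authenticated GET request from the quickstart.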
```shell
curl -X POST https://usvideoapi.com/v1/videos \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Close-up of coffee being poured into a ceramic mug, steam rising, soft morning light",
    "resolution": "720p",
    "duration": 5
  }'

# Response: {"id": "job_a1b2c3", "status": "pending", "price": "$1.25"}
```
```javascript
const response = await fetch("https://usvideoapi.com/v1/videos", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    prompt: "Timelapse of a flower blooming, macro lens, studio lighting",
    resolution: "720p",
    duration: 5,
  }),
});
const job = await response.json();
console.log(job.id); // "job_x7y8z9"
```
The quality of your text-to-video output depends heavily on how you write your prompts. Here are practical tips for getting the best results from Seedance 2.0:
Instead of "a city at night," write "slow dolly forward through a neon-lit Tokyo alley at night, rain-slicked streets reflecting lights." Seedance 2.0 responds well to cinematographic direction: dolly, pan, tilt, crane, tracking shot, static wide, handheld.
Lighting makes or breaks a shot. Specify: "golden hour backlight," "overcast diffused light," "dramatic side-lit chiaroscuro," or "warm tungsten interior." The model uses lighting cues to set mood and realism.
Static scenes work, but motion sells. Add action: "steam rises from the mug," "hair blows in the wind," "leaves fall in slow motion." Describe the subject's movement and the camera's movement separately for best results.
If you want to avoid certain artifacts, you can include a negative_prompt parameter: "blurry, distorted, watermark." But most of the time, a well-written positive prompt produces clean output without needing negation.
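A request with `negative_prompt` looks like the basic submission with one extra field. The helper below only builds the JSON body — post it to `/v1/videos` as in the quickstart. Omitting the field entirely when unused (rather than sending `null`) is an assumption on my part, but a safe default:

```python
def build_job(prompt, resolution="720p", duration=5, negative_prompt=None):
    """Build the JSON body for a /v1/videos submission."""
    body = {"prompt": prompt, "resolution": resolution, "duration": duration}
    if negative_prompt:
        # Only include the field when set; sending null may be rejected
        body["negative_prompt"] = negative_prompt
    return body

body = build_job(
    "Timelapse of a flower blooming, macro lens, studio lighting",
    negative_prompt="blurry, distorted, watermark",
)
# requests.post("https://usvideoapi.com/v1/videos",
#               headers={"Authorization": f"Bearer {API_KEY}"}, json=body)
```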
For rapid iteration and prompt testing, use 480p ($0.10/sec, under 30s generation). When you have a prompt you are happy with, re-generate at 1080p ($0.50/sec) for production quality. This workflow keeps costs low during development.
Text-to-video generation is billed per second of output at your chosen resolution: $0.10/sec at 480p, $0.25/sec at 720p, and $0.50/sec at 1080p.
No subscriptions. No minimums. Prepaid balance — add funds and start generating. Volume discounts available for accounts spending $500+/month.
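Since billing is a flat rate times output length, estimating a job's cost is simple arithmetic. The rates below come from this page ($0.10/sec at 480p, $0.50/sec at 1080p, and $0.25/sec at 720p implied by the $1.25 five-second example above); the 2K rate is not listed here, so it is omitted:

```python
RATES_PER_SEC = {"480p": 0.10, "720p": 0.25, "1080p": 0.50}  # USD, from this page

def estimate_cost(resolution, duration, clips=1):
    """Estimated prepaid-balance cost for `clips` clips of `duration` seconds."""
    return round(RATES_PER_SEC[resolution] * duration * clips, 2)

# The draft-at-480p workflow in numbers:
drafts = estimate_cost("480p", 5, clips=10)  # ten 5s iterations: $5.00
final = estimate_cost("1080p", 5)            # one production render: $2.50
```

Ten rounds of prompt iteration plus the final render cost less than two 1080p generations — which is why drafting at 480p pays off.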
Seedance 2.0 is ByteDance's flagship video generation model. It excels at complex multi-element prompts, cinematographic camera control, realistic motion and physics, and temporal consistency across frames.
Visit our homepage demo section to see real API output — unedited, no cherry-picking.

Register, add funds, and generate your first text-to-video clip in under 60 seconds.
Get Your API Key →