Kling 3.0
Cinematic Kling 3.0 video. Add a start frame to animate an image (image-to-video), or just prompt for text-to-video. Multi-shot prompts, native audio, and 720p / 1080p / 4K.
Kuaishou's video model. The strongest motion in the open API ecosystem — fluid human bodies, action choreography, dramatic camera moves. v3.0 Pro is the flagship.
Kling's flagship. Text-to-video, or attach a start frame to animate a still (image-to-video). 720p/1080p/4K.
Pick this for finals. Use kling-v2.5-turbo-pro-t2v for cheaper drafts.
- Best-in-class human and animal motion fidelity — runners, dancers, fighters look anatomically right.
- Action choreography: martial arts, sports, combat scenes hold together over the full clip.
- Camera moves: smooth dollies, crane shots, orbits without warping the subject.
- Native multi-shot mode in v3 (the API exposes it as multi_prompt).
- Long context for prompts — accepts dense scene descriptions without losing the lead subject.
- Text rendering inside the frame is unreliable; avoid signs and other readable writing.
- Hands and fingers can still distort on fast motion.
- Output durations capped at ~10s per single clip (use multi-shot to extend).
- Quality varies a lot between tiers; Standard is often blurry compared to Pro.
- Action sequences, fight choreography, sports highlights
- Anime / OVA-style narrative shots (pairs well with the Anime Styles panel)
- Music video moments — performers, dancers, crowd shots
- Camera-heavy cinematic establishing shots
- Tight close-ups with text (logos, signs, captions)
- Long single-shot videos beyond ~10s — use multi-shot instead
- Lead with the subject + action, then describe the camera move, then the lighting and mood.
- Be explicit about motion verbs — "runs", "jumps", "pivots", "orbits" — Kling rewards specificity.
- For consistent characters across shots, repeat the same description verbatim each shot.
- Append trailing tags such as "cinematic", "35mm anamorphic", or "golden hour"; Kling respects them.
- Avoid abstract direction ("epic", "amazing") — replace with concrete cinematography.
- Duration: 5s is the sweet spot. 10s costs a lot more and often introduces drift.
- Aspect ratio: 16:9 is what Kling was trained on; 9:16 works but loses some quality.
- Pro vs Standard: always pick Pro for finals. Use Standard only for cheap drafts.
anime: cel-shaded, 90s OVA aesthetic, hand-drawn line art, soft halation, 35mm grain
noir: high contrast, chiaroscuro lighting, anamorphic lens, smoke and rain, desaturated teal-orange
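The prompt ordering recommended in the tips (subject and action first, then the camera move, then lighting and mood, then trailing tags) can be sketched as a small helper. This is only an illustration: `build_prompt` and its argument names are hypothetical and not part of any Kling API, and the comma-joined layout is an assumption about how the pieces might be combined.

```python
def build_prompt(subject_action, camera, lighting_mood, style_tags=()):
    """Join prompt pieces in the order: subject + action, camera, lighting/mood, tags.

    Hypothetical helper for illustration; it only builds a string.
    """
    parts = [subject_action.strip(), camera.strip(), lighting_mood.strip()]
    parts.extend(tag.strip() for tag in style_tags)
    # Drop empty pieces so optional sections do not leave stray commas.
    return ", ".join(p for p in parts if p)


prompt = build_prompt(
    "a boxer pivots and throws a left hook",    # explicit motion verbs
    "camera orbits slowly around the ring",     # concrete camera move
    "hard overhead light, smoky arena",         # lighting and mood
    ("cinematic", "35mm anamorphic"),           # trailing tags
)
print(prompt)
```

Keeping each piece as a separate argument makes it easy to swap the camera move or the tags between drafts while the subject description stays fixed.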
- start_image_url (string): Optional. Add an image to animate it (image-to-video). Leave empty for text-to-video.
- end_image_url (string): Optional. Guides the final frame (image-to-video only).
- elements (array): Inject characters / objects via reference images or video clips. Reference in prompt as @Element1, @Element2.
- multi_prompt (array): Sequence of shots, each with its own prompt and duration. When set, overrides the single prompt.
- duration (string): Options: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15. Default: 5.
- resolution (string): Options: Standard, Pro, 4K. Default: Pro.
- aspect_ratio (string): Options: 16:9, 9:16, 1:1. Default: 16:9.
- generate_audio (boolean): Native audio with Chinese + English voice output. Default: true.
- shot_type (string): 'intelligent' = model picks shot structure; 'customize' = you control via multi_prompt. Options: customize, intelligent. Default: customize.
- negative_prompt (string): Default: blur, distort, and low quality.
- cfg_scale (number): Higher = stricter prompt adherence. Default: 0.5. Range: 0 to 1.
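As a rough sketch of how these parameters fit together, the snippet below builds and validates a request body as a plain dictionary. Several things here are assumptions rather than documented behavior: the helper name `build_request`, the top-level `prompt` field name, the `prompt` / `duration` keys inside each `multi_prompt` entry, and the choice to omit the top-level `duration` when `multi_prompt` is set. No request is sent; how the payload is submitted depends on the API host.

```python
ALLOWED_DURATIONS = {str(n) for n in range(3, 16)}   # "3" through "15", as strings
ALLOWED_RESOLUTIONS = {"Standard", "Pro", "4K"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def build_request(prompt=None, *, multi_prompt=None, duration="5",
                  resolution="Pro", aspect_ratio="16:9",
                  generate_audio=True, cfg_scale=0.5, start_image_url=None):
    """Return a request payload dict (hypothetical helper, nothing is sent)."""
    if not prompt and not multi_prompt:
        raise ValueError("provide either prompt or multi_prompt")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if not 0 <= cfg_scale <= 1:
        raise ValueError("cfg_scale must be between 0 and 1")

    payload = {
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "generate_audio": generate_audio,
        "cfg_scale": cfg_scale,
    }
    if multi_prompt:
        # Assumed shot shape: {"prompt": str, "duration": str}.
        for shot in multi_prompt:
            if "prompt" not in shot or "duration" not in shot:
                raise ValueError("each shot needs 'prompt' and 'duration'")
        payload["shot_type"] = "customize"
        payload["multi_prompt"] = list(multi_prompt)
    else:
        if duration not in ALLOWED_DURATIONS:
            raise ValueError(f"unsupported duration: {duration!r}")
        payload["prompt"] = prompt
        payload["duration"] = duration
    if start_image_url:
        # A start frame switches the request to image-to-video.
        payload["start_image_url"] = start_image_url
    return payload


# Repeat the character description verbatim in every shot for consistency.
HERO = "a woman in a red jacket with short silver hair"
shots = [
    {"prompt": f"{HERO} sprints across a rooftop, camera tracks alongside", "duration": "5"},
    {"prompt": f"{HERO} jumps the gap between buildings, low-angle crane shot", "duration": "5"},
]
request = build_request(multi_prompt=shots, resolution="Pro")
print(request["shot_type"], len(request["multi_prompt"]))
```

Validating the enumerated values locally catches typos such as `"1080p"` for `resolution` before a render is paid for.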