Video / vidu-q3 / vidu-q3-text-to-video

Vidu Q3

Vidu Q3 generates video with native audio. Use it for text-to-video, add a start frame to animate a still (image-to-video), or supply 1–7 reference images for reference-to-video. Pricing: 360p and 540p at the flat rate; 720p and 1080p at 2.2×.

~129 cr per run
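The resolution pricing can be sketched as a small calculation. This is a minimal sketch, assuming "flat" means the unmultiplied base rate and that the 2.2× factor applies to the whole per-run price. The name `estimate_credits` and the `base_credits` argument are hypothetical, and the page does not say which settings the ~129 cr figure corresponds to.

```python
# Hypothetical helper, not part of any official SDK.
HIGH_RES_MULTIPLIER = 2.2  # from the listing: 720p/1080p at 2.2x
BASE_RESOLUTIONS = {"360p", "540p"}
HIGH_RESOLUTIONS = {"720p", "1080p"}


def estimate_credits(base_credits: float, resolution: str) -> float:
    """Scale a base per-run price by the resolution tier."""
    if resolution in BASE_RESOLUTIONS:
        return base_credits
    if resolution in HIGH_RESOLUTIONS:
        return base_credits * HIGH_RES_MULTIPLIER
    raise ValueError(f"unsupported resolution: {resolution!r}")
```

Pass in whatever base price applies to your chosen duration; the helper only applies the resolution tier.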

Vidu guide (Verified)

Shengshu Tech's video model. Reference-to-video pioneer — designed around blending multiple input images into a consistent moving subject.

Strengths

Best multi-image reference-to-video in the field: multiple references (1–7 on this model) blend cleanly.
Subject consistency across shots when references are well-chosen.
Strong start-and-end-frame mode (vidu-q3-pro-first-last-frames).

Weaknesses

Pure text-to-video quality is mid-pack.
Camera motion is less controllable than Runway / Kling.

Best for

Character / costume consistency across multiple shots
Product variations — same item shot from different angles
Start+end frame animations where you control both poles

Prompting tips

Provide reference images that match the desired LIGHTING and ANGLE — Vidu blends literally.
Describe each @image in the prompt ("@image1 from above", "@image2 close-up") for precise blending.
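The tagging tip above can be sketched as a small helper that keeps each description aligned with its reference image. This is a minimal sketch: `build_reference_prompt` is a hypothetical name, the "References:" sentence layout is only one possible phrasing, and the code assumes `@imageN` tags are 1-based and follow the order of `reference_image_urls`, which the page implies but does not state.

```python
def build_reference_prompt(scene, references):
    """Return (prompt, reference_image_urls) from (url, description) pairs.

    Assumption: @image1 refers to the first URL, @image2 to the second, etc.
    """
    if not 1 <= len(references) <= 7:
        raise ValueError("expected 1 to 7 reference images")
    urls = [url for url, _ in references]
    tags = [
        f"@image{i} {description}"
        for i, (_, description) in enumerate(references, start=1)
    ]
    joined = "; ".join(tags)
    prompt = f"{scene} References: {joined}."
    return prompt, urls


prompt, urls = build_reference_prompt(
    "A knight walks through a misty forest.",
    [
        ("https://example.com/knight-top.png", "from above"),
        ("https://example.com/knight-face.png", "close-up"),
    ],
)
```

Building the prompt and the URL list from the same pairs avoids the tags drifting out of sync with the images when references are reordered.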

Parameters

image_url (string)
    Optional. Add to animate a still (image-to-video).
end_image_url (string)
    Optional. Guides the final frame (image-to-video).
reference_image_urls (array)
    Optional. 1–7 images → reference-to-video for character/scene consistency.
duration (int)
    No description.
    default: 5
    range: 4–8
aspect_ratio (string)
    No description.
    options: 16:9, 9:16, 1:1
    default: 16:9
resolution (string)
    No description.
    options: 360p, 540p, 720p, 1080p
    default: 540p
audio (boolean)
    No description.
    default: true
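
The parameter list above can be turned into a client-side check before a request is sent. This is a minimal sketch, not the service's API: `build_payload` is a hypothetical function, the `prompt` key name is an assumption (the page only says a text prompt is needed), and the limits are copied from the list above.

```python
# Hypothetical request builder; parameter names come from the list above,
# but the "prompt" key and this function are assumptions.
ASPECT_RATIOS = ("16:9", "9:16", "1:1")
RESOLUTIONS = ("360p", "540p", "720p", "1080p")


def build_payload(prompt, *, image_url=None, end_image_url=None,
                  reference_image_urls=None, duration=5,
                  aspect_ratio="16:9", resolution="540p", audio=True):
    """Check arguments against the listed defaults and ranges, then return a dict."""
    if not prompt:
        raise ValueError("a text prompt is required")
    if not isinstance(duration, int) or not 4 <= duration <= 8:
        raise ValueError("duration must be an integer from 4 to 8")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")
    if reference_image_urls is not None and not 1 <= len(reference_image_urls) <= 7:
        raise ValueError("reference_image_urls must contain 1 to 7 images")

    payload = {
        "prompt": prompt,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "audio": audio,
    }
    # Optional image inputs are included only when provided.
    optional = {
        "image_url": image_url,
        "end_image_url": end_image_url,
        "reference_image_urls": reference_image_urls,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Calling `build_payload("a cat stretching", resolution="720p")` returns a dict with the listed defaults filled in and no image fields.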

You'll need

A text prompt
Start frame (optional)
Reference images (optional)
End frame (optional)
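
Which optional inputs are supplied determines the generation mode described on this page. The mapping can be sketched as follows; `infer_mode` is a hypothetical helper, and because the page does not say what happens when frames and reference images are combined, this sketch rejects that combination rather than guessing.

```python
def infer_mode(image_url=None, end_image_url=None, reference_image_urls=None):
    """Name the mode implied by the optional inputs (hypothetical helper)."""
    has_frames = bool(image_url or end_image_url)
    has_references = bool(reference_image_urls)
    if has_frames and has_references:
        # Assumption: the page does not document this combination.
        raise ValueError("combine frames and reference images at your own risk")
    if has_references:
        return "reference-to-video"
    if has_frames:
        return "image-to-video"
    return "text-to-video"
```

With no optional inputs the helper reports text-to-video, which matches the only required input being a text prompt.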

More from vidu-q3

Vidu Q3 Turbo

vidu-q3-turbo-text-to-video