Image → Video · vidu-q2 · vidu-q2-reference
Vidu Q2 Reference
Vidu Q2 Reference Video generates cinematic clips from a text prompt guided by multiple reference images. Each image refines the model's understanding of subject, environment, and visual tone, which helps keep appearance and motion consistent across frames.
Vidu guide
Vidu is Shengshu Tech's video model and a reference-to-video pioneer, designed around blending multiple input images into a consistent moving subject.
Strengths
- Best multi-image reference-to-video in the field — 1–9 references blend cleanly.
- Subject consistency across shots when references are well-chosen.
- Strong start-and-end-frame mode (vidu-q3-pro-first-last-frames).
Weaknesses
- Pure text-to-video quality is mid-pack.
- Camera motion is less controllable than Runway / Kling.
Best for
- Character / costume consistency across multiple shots
- Product variations — same item shot from different angles
- Start+end frame animations where you control both poles
Prompting tips
- Provide reference images that match the desired LIGHTING and ANGLE — Vidu blends literally.
- Describe each @image in the prompt ("@image1 from above", "@image2 close-up") for precise blending.
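The second tip can be sketched as a small helper that attaches one note to each numbered reference. This is only an illustration: `build_reference_prompt` is a hypothetical name, and the clause layout (notes appended after the scene, separated by commas) is an assumption rather than a documented Vidu format. The upper bound of 9 comes from the "1–9 references" note in the guide.

```python
MAX_REFERENCES = 9  # taken from the guide's "1 to 9 references" note


def build_reference_prompt(scene: str, image_notes: list[str]) -> str:
    """Append one '@imageN <note>' clause per reference image to a scene description.

    image_notes[0] describes the first uploaded image, image_notes[1] the second,
    and so on. The layout of the final string is an assumption for illustration.
    """
    if not 1 <= len(image_notes) <= MAX_REFERENCES:
        raise ValueError(f"expected 1 to {MAX_REFERENCES} image notes, got {len(image_notes)}")
    clauses = [f"@image{i} {note.strip()}" for i, note in enumerate(image_notes, start=1)]
    return f"{scene.strip()} " + ", ".join(clauses)


if __name__ == "__main__":
    print(build_reference_prompt(
        "A knight walks through a market.",
        ["from above", "close-up"],
    ))
```

Keeping the notes in the same order as the uploaded images makes it harder to mislabel a reference when the list is edited later.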
Parameters
- images_list (array): Upload or provide image URLs. Used for image-to-video generation.
- resolution (string): The resolution of the generated video. Options: 360p, 540p, 720p, 1080p. Default: 720p.
- aspect_ratio (string): Aspect ratio of the output video. Options: 16:9, 9:16, 4:3, 3:4, 1:1. Default: 16:9.
- duration (int): The duration of the generated video in seconds. Default: 5. Range: 2 to 8.
- movement_amplitude (string): The movement amplitude of objects in the frame. Options: auto, small, medium, large. Default: auto.
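The parameters above can be collected into a request payload and checked against the listed options before anything is sent. The sketch below only builds and validates a dictionary; it does not call any API, and `build_payload` is a hypothetical helper name. The `prompt` key is also an assumption, since the parameter list does not name a prompt field even though the model is described as prompt-driven.

```python
RESOLUTIONS = {"360p", "540p", "720p", "1080p"}
ASPECT_RATIOS = {"16:9", "9:16", "4:3", "3:4", "1:1"}
MOVEMENT_AMPLITUDES = {"auto", "small", "medium", "large"}
MIN_DURATION, MAX_DURATION = 2, 8  # seconds, from the listed range


def build_payload(
    images_list: list[str],
    prompt: str,  # assumed field name; not present in the parameter list
    resolution: str = "720p",
    aspect_ratio: str = "16:9",
    duration: int = 5,
    movement_amplitude: str = "auto",
) -> dict:
    """Return a parameter dict, raising ValueError on values outside the listed options."""
    if not images_list:
        raise ValueError("images_list needs at least one image URL")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if not isinstance(duration, int) or not MIN_DURATION <= duration <= MAX_DURATION:
        raise ValueError(f"duration must be an int from {MIN_DURATION} to {MAX_DURATION}")
    if movement_amplitude not in MOVEMENT_AMPLITUDES:
        raise ValueError(f"movement_amplitude must be one of {sorted(MOVEMENT_AMPLITUDES)}")
    return {
        "images_list": list(images_list),
        "prompt": prompt,
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "movement_amplitude": movement_amplitude,
    }


if __name__ == "__main__":
    payload = build_payload(
        ["https://example.com/front.png", "https://example.com/side.png"],
        "@image1 from above, @image2 close-up",
        duration=6,
    )
    print(payload)
```

Validating locally catches an out-of-range duration or a mistyped aspect ratio before a generation request is spent on it.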
More from vidu-q2
- Vidu Q2 Text To Image (vidu-q2-text-to-image)
- Vidu Q2 Reference To Image (vidu-q2-reference-to-image)
- Vidu Q2 Pro Text To Video (vidu-q2-pro-text-to-video)
- Vidu Q2 Turbo Text To Video (vidu-q2-turbo-text-to-video)
- Vidu Q2 Pro Image To Video (vidu-q2-pro-image-to-video)
- Vidu Q2 Pro Start End Video (vidu-q2-pro-start-end-video)