Vidu Q2 Pro Text To Video
Vidu Q2 Pro Text-to-Video generates cinematic, prompt-faithful clips from text alone with strong temporal consistency and rich detail at up to 1080p. Pick this when you need polished output without a reference frame.
Vidu guide
Shengshu Tech's video model. Reference-to-video pioneer — designed around blending multiple input images into a consistent moving subject.
Strengths
- Best multi-image reference-to-video in the field — 1–9 references blend cleanly.
- Subject consistency across shots when references are well-chosen.
- Strong start-and-end-frame mode (vidu-q3-pro-first-last-frames).
Weaknesses
- Pure text-to-video quality is mid-pack.
- Camera motion is less controllable than Runway / Kling.
Best for
- Character / costume consistency across multiple shots
- Product variations — same item shot from different angles
- Start+end frame animations where you control both the first and the last frame
Prompting tips
- Provide reference images that match the desired LIGHTING and ANGLE — Vidu blends literally.
- Describe each @image in the prompt ("@image1 from above", "@image2 close-up") for precise blending.
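As a rough illustration of the second tip, the sketch below assembles a prompt fragment that tags each reference image with its own description. The function names `describe_references` and `build_prompt` are hypothetical helpers, not part of any Vidu SDK, and the 1 to 9 limit is taken from the reference count mentioned under Strengths.

```python
from typing import Sequence


def describe_references(descriptions: Sequence[str]) -> str:
    """Tag each description with @image1, @image2, ... in order.

    Hypothetical helper: only the "@imageN <description>" convention
    comes from the prompting tip; everything else is an assumption.
    """
    if not 1 <= len(descriptions) <= 9:
        raise ValueError("expected between 1 and 9 reference descriptions")
    return ", ".join(
        f"@image{index} {text}"
        for index, text in enumerate(descriptions, start=1)
    )


def build_prompt(scene: str, descriptions: Sequence[str]) -> str:
    """Append the tagged reference descriptions to a scene description."""
    return f"{scene} {describe_references(descriptions)}"


if __name__ == "__main__":
    print(build_prompt("A ceramic mug on a oak table,", ["from above", "close-up"]))
```

Keeping the tags in the same order as the uploaded references makes it easier to check that each description matches the lighting and angle of its image.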
Parameters
- resolution (string): The resolution of the generated video. Options: 720p, 1080p. Default: 720p.
- aspect_ratio (string): Aspect ratio of the output video. Options: 16:9, 9:16, 1:1. Default: 16:9.
- duration (int): The duration of the generated video in seconds. Default: 5. Range: 2 to 8.
- bgm (boolean): Add background music to the output. When enabled, duration must be exactly 4 seconds. Default: false.
- movement_amplitude (string): The movement amplitude of objects in the frame. Options: auto, small, medium, large. Default: auto.
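The constraints in this parameter list can be checked before a request is sent. The following is a minimal sketch, assuming the parameters are submitted as a flat dictionary. `VideoParams`, `build_payload`, and the `prompt` field name are hypothetical and do not come from the page; only the allowed values, defaults, and the bgm rule do.

```python
from dataclasses import asdict, dataclass

# Allowed values as listed in the parameter descriptions.
RESOLUTIONS = ("720p", "1080p")
ASPECT_RATIOS = ("16:9", "9:16", "1:1")
MOVEMENT_AMPLITUDES = ("auto", "small", "medium", "large")


@dataclass
class VideoParams:
    prompt: str  # hypothetical field name; the page does not name it
    resolution: str = "720p"
    aspect_ratio: str = "16:9"
    duration: int = 5
    bgm: bool = False
    movement_amplitude: str = "auto"

    def validate(self) -> None:
        """Raise ValueError if any documented constraint is violated."""
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {RESOLUTIONS}")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
        if self.movement_amplitude not in MOVEMENT_AMPLITUDES:
            raise ValueError(
                f"movement_amplitude must be one of {MOVEMENT_AMPLITUDES}"
            )
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.duration, bool) or not isinstance(self.duration, int):
            raise ValueError("duration must be an integer number of seconds")
        if not 2 <= self.duration <= 8:
            raise ValueError("duration must be between 2 and 8 seconds")
        if self.bgm and self.duration != 4:
            raise ValueError("bgm requires duration to be exactly 4 seconds")


def build_payload(prompt: str, **overrides) -> dict:
    """Merge overrides with the documented defaults and validate them."""
    params = VideoParams(prompt=prompt, **overrides)
    params.validate()
    return asdict(params)


if __name__ == "__main__":
    print(build_payload("A lighthouse at dusk", resolution="1080p"))
    print(build_payload("A lighthouse at dusk", bgm=True, duration=4))
```

Because the default duration is 5, enabling bgm without also setting duration to 4 fails validation in this sketch, which surfaces the documented constraint early rather than at request time.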