Wan2.7 Reference To Video

Alibaba WAN 2.7 Reference-to-Video. Supply reference characters or props to generate new shots.

Wan guide (verified)

Alibaba's open-source video model line (Wan 2.1 → 2.7). Strong prompt adherence; the open-source pedigree means heavy community use and well-documented prompting patterns.

Strengths
  • Best-in-class prompt adherence — does what you ask, not what it thinks you want.
  • Wide variant family covers most needs (T2V, I2V, reference, motion control, lipsync).
  • Wan 2.5 and 2.6 catch up to closed-source quality at lower cost.
  • Wan 2.2 Spicy variants for adult creative work.
Weaknesses
  • Older versions (2.1, 2.2) look dated next to the current flagship.
  • Stylization quality lags behind Kling and Hailuo.
Best for
  • Precise prompt-driven scene construction
  • Hybrid pipelines where Wan does the heavy lifting and another model polishes
Prompting tips
  • Treat Wan like a brief — itemize what's in frame, the action, the camera.
  • Wan does NOT need flowery language; plain descriptive prose works better.
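As an illustration of the brief-style approach in the tips above, here is a minimal sketch that joins itemized scene elements into plain descriptive prose. The helper name `build_brief_prompt` and its three arguments are hypothetical, chosen only to mirror "what's in frame, the action, the camera"; nothing here is part of any Wan API.

```python
def build_brief_prompt(in_frame, action, camera):
    """Join itemized scene elements into short, plain sentences."""
    # Items that are in frame become one comma-separated sentence.
    subject = ", ".join(item.strip() for item in in_frame if item.strip())
    parts = [subject, action.strip(), camera.strip()]
    # Skip empty parts and make sure each sentence ends with one period.
    return " ".join(p.rstrip(".") + "." for p in parts if p)


prompt = build_brief_prompt(
    in_frame=["A woman in a red raincoat", "a wet city street at night"],
    action="She walks toward the camera holding an umbrella",
    camera="Slow dolly back at eye level",
)
print(prompt)
```

Keeping the three parts separate makes it easy to vary one element, such as the camera move, while holding the rest of the brief fixed.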
Parameters
  • num_frames
    int
    Number of frames to generate. Must be between 81 and 241 (inclusive).
    default: 81
    range: 17 to 241
  • num_interpolated_frames
    int
    Number of frames to interpolate between the original frames. A value of 0 means no interpolation.
    default: 0
    range: 0 to 5
  • num_inference_steps
    int
    Number of inference steps for sampling. Higher values give better quality but take longer.
    default: 30
    range: 2 to 50
  • first_frame_url
    string
    URL to the first frame of the video. If provided, the model will use this frame as a reference.
  • resolution
    string
    Resolution of the generated video.
    options: auto, 240p, 360p, 480p, 580p, 720p
    default: auto
  • frames_per_second
    int
    Frames per second of the generated video. Must be between 5 and 30. Ignored if match_input_frames_per_second is true.
    default: 16
  • last_frame_url
    string
    URL to the last frame of the video. If provided, the model will use this frame as a reference.
  • match_input_frames_per_second
    boolean
    If true, the frames per second of the generated video will match the input video. If false, the frames per second will be determined by the frames_per_second parameter.
    default: true
  • video_url
    string
    URL to the source video file. This video will be used as a reference for the reframe task.
  • guidance_scale
    number
    Guidance scale for classifier-free guidance. Higher values encourage the model to generate output closely related to the text prompt.
    default: 5
    range: 1 to 10
  • shift
    number
    Shift parameter for video generation.
    default: 5
    range: 1 to 15
  • video_write_mode
    string
    The write mode of the generated video.
    options: fast, balanced, small
    default: balanced
  • temporal_downsample_factor
    int
    Temporal downsample factor for the video. This is an integer value that determines how many frames to skip in the video. A value of 0 means no downsampling. For each downsample factor, one upsample factor will automatically be applied.
    default: 0
    range: 0 to 5
  • transparency_mode
    string
    The transparency mode to apply to the first and last frames. This controls how the transparent areas of the first and last frames are filled.
    options: content_aware, white, black
    default: content_aware
  • auto_downsample_min_fps
    number
    The minimum frames per second to downsample the video to. This helps determine the auto downsample factor by finding the lowest detail-preserving downsample factor. The default value is appropriate for most videos; if you are using a video with very fast motion,…
    default: 15
    range: 1 to 60
  • zoom_factor
    number
    Zoom factor for the video. When this value is greater than 0, the video is zoomed in by this factor (relative to the canvas size), cutting off the edges of the video. A value of 0 means no zoom.
    default: 0
    range: 0 to 0.9
  • negative_prompt
    string
    Negative prompt for video generation.
    default: letterboxing, borders, black bars, bright colors, overexposed, static, blurred details, subtitles, style, artwork, painting, picture, still, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, malformed limbs, fused fingers, still picture, cluttered background, three legs, many people in the background, walking backwards
  • sampler
    string
    Sampler to use for video generation.
    options: unipc, dpm++, euler
    default: unipc
  • interpolator_model
    string
    The model to use for frame interpolation. Options are 'rife' or 'film'.
    options: rife, film
    default: film
  • acceleration
    string
    Acceleration to use for inference. Options are 'none' or 'regular'. Accelerated inference will very slightly affect output, but will be significantly faster.
    default: regular
  • match_input_num_frames
    boolean
    If true, the number of frames in the generated video will match the number of frames in the input video. If false, the number of frames will be determined by the num_frames parameter.
    default: true
  • enable_prompt_expansion
    boolean
    Whether to enable prompt expansion.
    default: false
  • return_frames_zip
    boolean
    If true, also return a ZIP file containing all generated frames.
    default: false
  • seed
    int
    Random seed for reproducibility. If None, a random seed is chosen.
  • trim_borders
    boolean
    Whether to trim borders from the video.
    default: true
  • aspect_ratio
    string
    Aspect ratio of the generated video.
    options: auto, 16:9, 1:1, 9:16
    default: auto
  • video_quality
    string
    The quality of the generated video.
    options: low, medium, high, maximum
    default: high
  • enable_auto_downsample
    boolean
    If true, the model will automatically temporally downsample the video to an appropriate frame length for the model, then will interpolate it back to the original frame length.
    default: false
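The ranges and option sets listed above lend themselves to a client-side check before a request is sent. The following is a minimal sketch, not the service's own validation: `validate_params` and `approx_duration_seconds` are hypothetical helpers. For `num_frames` it uses the listed range of 17 to 241, although the description gives 81 as the lower bound, so tighten it if smaller values are rejected. The duration helper assumes plain playback with no interpolation or downsampling and with `match_input_frames_per_second` set to false.

```python
# Numeric ranges copied from the parameter list above (inclusive bounds).
# num_frames uses the listed range; its description gives 81 as the minimum.
RANGES = {
    "num_frames": (17, 241),
    "num_interpolated_frames": (0, 5),
    "num_inference_steps": (2, 50),
    "frames_per_second": (5, 30),
    "guidance_scale": (1, 10),
    "shift": (1, 15),
    "temporal_downsample_factor": (0, 5),
    "auto_downsample_min_fps": (1, 60),
    "zoom_factor": (0, 0.9),
}

# Allowed string values copied from the parameter list above.
OPTIONS = {
    "resolution": {"auto", "240p", "360p", "480p", "580p", "720p"},
    "video_write_mode": {"fast", "balanced", "small"},
    "transparency_mode": {"content_aware", "white", "black"},
    "sampler": {"unipc", "dpm++", "euler"},
    "interpolator_model": {"rife", "film"},
    "acceleration": {"none", "regular"},
    "aspect_ratio": {"auto", "16:9", "1:1", "9:16"},
    "video_quality": {"low", "medium", "high", "maximum"},
}


def validate_params(params):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                problems.append(f"{name}={value} is outside {low} to {high}")
        elif name in OPTIONS:
            if value not in OPTIONS[name]:
                allowed = sorted(OPTIONS[name])
                problems.append(f"{name}={value!r} is not one of {allowed}")
    return problems


def approx_duration_seconds(num_frames, frames_per_second):
    """Clip length under the simplifying assumptions stated in the lead-in."""
    return num_frames / frames_per_second


print(validate_params({"num_frames": 300, "sampler": "ddim"}))
print(approx_duration_seconds(81, 16))
```

Parameters that are not in either table (booleans, URLs, `seed`, `negative_prompt`) pass through unchecked in this sketch.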
You'll need
  • A text prompt
  • Start frame
  • End frame (optional)
  • Reference video
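The inputs listed above map onto the documented `first_frame_url`, `last_frame_url`, and `video_url` parameters. A minimal sketch of assembling a request body follows; the `prompt` field name, the `build_request` helper, and the example.com URLs are assumptions for illustration, and no endpoint is called because the page does not document one.

```python
import json


def build_request(prompt, first_frame_url, video_url,
                  last_frame_url=None, **overrides):
    """Assemble a request body from the required and optional inputs."""
    payload = {
        "prompt": prompt,  # assumed field name; not in the parameter list
        "first_frame_url": first_frame_url,
        "video_url": video_url,
    }
    # The end frame is optional, so include it only when supplied.
    if last_frame_url is not None:
        payload["last_frame_url"] = last_frame_url
    # Any other documented parameter can be passed as a keyword override.
    payload.update(overrides)
    return payload


request_body = build_request(
    prompt="A woman in a red raincoat walks toward the camera.",
    first_frame_url="https://example.com/start.png",  # placeholder URL
    video_url="https://example.com/reference.mp4",    # placeholder URL
    num_inference_steps=30,
    seed=42,
)
print(json.dumps(request_body, indent=2))
```

Omitted parameters are left out of the body so that the documented defaults apply.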