Wan 2.7 -- Generate from reference

Submit a Wan 2.7 reference-to-video task

POST
/services/aigc/video-generation/video-synthesis
curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
  -H 'X-DashScope-Async: enable' \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "wan2.7-r2v",
  "input": {
    "prompt": "Video 2 holds Image 1 and plays a soothing American country ballad in a coffee shop, while Video 1 smiles, watches Video 2, and slowly walks towards him",
    "media": [
      {"type": "reference_video", "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260129/hfugmr/wan-r2v-role1.mp4"},
      {"type": "reference_video", "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260129/qigswt/wan-r2v-role2.mp4"},
      {"type": "reference_image", "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260129/qpzxps/wan-r2v-object4.png"}
    ]
  },
  "parameters": {
    "resolution": "720P",
    "duration": 10,
    "prompt_extend": false,
    "watermark": true
  }
}'
{
  "request_id": "<string>",
  "output": {
    "task_id": "<string>",
    "task_status": "PENDING"
  }
}
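The same request can be prepared from Python. The following is a minimal sketch using only the standard library, not an official SDK sample. The helper names build_request and submit_task are hypothetical; the endpoint, headers, and body layout are copied from the curl example above. submit_task performs a real network call and assumes a valid key in the DASHSCOPE_API_KEY environment variable.

```python
import json
import os
import urllib.request

# Endpoint copied from the curl example above.
ENDPOINT = (
    "https://dashscope-intl.aliyuncs.com/api/v1"
    "/services/aigc/video-generation/video-synthesis"
)


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for a reference-to-video task."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "X-DashScope-Async": "enable",  # required to create an asynchronous task
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def submit_task(payload: dict) -> dict:
    """Send the request and return the decoded JSON response (network call)."""
    request = build_request(os.environ["DASHSCOPE_API_KEY"], payload)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping build_request separate from submit_task makes it possible to inspect the headers and body without contacting the service.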
Generate natural, lifelike performance videos from multimodal input (text, image, video) using the Wan 2.7 model (wan2.7-r2v).
  • Character portrayal: Replicate a character's appearance from a reference image or video. Reference videos also replicate voice timbre. Supports single or multi-character performances with up to 5 reference assets.
  • Media array input: Provide reference images, videos, or a first frame via the media array. In the prompt, refer to each asset as Video 1, Image 1, and so on, numbered by its order in the array. Images and videos are numbered separately.
  • Multi-panel storyboard: Describe multi-shot narratives with time segments (e.g., Scene 1 [0-3s]: ...). Provide key shots and the model automatically recognizes the panel logic.
  • Voice cloning: Provide a reference_voice audio file to set the voice timbre. If not specified, audio from the reference video is used by default.
  • Resolution and ratio: Set output quality with resolution (720P/1080P) and aspect ratio with ratio (16:9, 9:16, 1:1, 4:3, 3:4). When a first_frame image is provided, ratio is inferred from the image.
  • Prompt enhancement: Enable prompt_extend to rewrite the prompt with an LLM. This improves results for short prompts but increases processing time.
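The numbering and storyboard conventions described in the list above can be sketched as two small helpers. This is an illustration under stated assumptions, not part of the API: label_media and storyboard_prompt are hypothetical names, the sketch assumes that only reference_video and reference_image entries receive a numbered label and count toward the limit of 5 reference assets, and it assumes that storyboard scenes may be separated by newlines in the prompt.

```python
from typing import Optional

MAX_REFERENCE_ASSETS = 5  # "up to 5 reference assets" per the description above

# Assumption: only these two media types receive a numbered prompt label.
LABEL_PREFIX = {"reference_video": "Video", "reference_image": "Image"}


def label_media(media: list[dict]) -> list[Optional[str]]:
    """Return the prompt label for each media entry, or None if it has none.

    Videos and images are numbered separately, in array order.
    """
    counters = {prefix: 0 for prefix in LABEL_PREFIX.values()}
    labels: list[Optional[str]] = []
    for item in media:
        prefix = LABEL_PREFIX.get(item["type"])
        if prefix is None:
            labels.append(None)  # e.g. a first frame or a voice reference
            continue
        counters[prefix] += 1
        labels.append(f"{prefix} {counters[prefix]}")
    labeled = sum(label is not None for label in labels)
    if labeled > MAX_REFERENCE_ASSETS:
        raise ValueError(
            f"at most {MAX_REFERENCE_ASSETS} reference assets, got {labeled}"
        )
    return labels


def storyboard_prompt(shots: list[tuple[int, int, str]]) -> str:
    """Format (start_s, end_s, description) shots as 'Scene N [a-bs]: ...' lines."""
    return "\n".join(
        f"Scene {index} [{start}-{end}s]: {text}"
        for index, (start, end, text) in enumerate(shots, start=1)
    )
```

For two reference videos followed by one reference image, label_media returns Video 1, Video 2, and Image 1, in that order.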

Authorizations

Authorization: string, header, required

DashScope API Key. Create one in the Qwen Cloud console.

Header Parameters

X-DashScope-Async: enum<string>, required

Must be set to enable to create an asynchronous task.

enable

Body

application/json
model: enum<string>, required

Model name.

wan2.7-r2v
input: object, required

Input data for Wan 2.7 reference-to-video generation.

parameters: object

Generation parameters for Wan 2.7 reference-to-video.
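A client can check the documented values before submitting. The sketch below is an assumption-laden illustration rather than the service's own validation: build_parameters is a hypothetical helper, the allowed resolution and ratio values are taken from the feature list on this page, and dropping ratio when a first frame is supplied is a design choice made here because the page states that ratio is then inferred from the image. Other fields such as duration, prompt_extend, and watermark are passed through unchecked because their allowed ranges are not stated in this section.

```python
from typing import Optional

# Values listed in the feature description on this page.
RESOLUTIONS = {"720P", "1080P"}
RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}


def build_parameters(
    resolution: str = "720P",
    ratio: Optional[str] = None,
    has_first_frame: bool = False,
    **extra,
) -> dict:
    """Build the "parameters" object, rejecting undocumented resolution/ratio values."""
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if ratio is not None and ratio not in RATIOS:
        raise ValueError(f"unsupported ratio: {ratio!r}")
    params = {"resolution": resolution, **extra}
    # Assumption: with a first frame the ratio comes from the image, so omit it.
    if ratio is not None and not has_first_frame:
        params["ratio"] = ratio
    return params
```

Calling build_parameters("720P", duration=10, prompt_extend=False, watermark=True) reproduces the parameters object used in the curl example.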

Response

200-application/json
request_id: string

Unique request identifier.

output: object
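Because the task is created asynchronously, the submit response carries only a task identifier and an initial status. A minimal sketch for reading it follows; extract_task is a hypothetical helper, and the field names request_id, output, task_id, and task_status are taken from the example response on this page. How the task is subsequently queried is not covered in this section, so it is not shown.

```python
import json


def extract_task(response_body: str) -> tuple[str, str]:
    """Return (task_id, task_status) from a submit response body."""
    data = json.loads(response_body)
    output = data.get("output")
    if not isinstance(output, dict) or "task_id" not in output:
        # Surface the request_id, if any, to help with troubleshooting.
        raise ValueError(
            f"unexpected response (request_id={data.get('request_id')!r})"
        )
    return output["task_id"], output.get("task_status", "")
```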