Unlock the Power of Text-to-Video Generation with Aliyun's Wan2.1 Model
1. Workflow Overview

This workflow utilizes Aliyun's Wan2.1 model for Text-to-Video (T2V) generation. It integrates text encoding, video diffusion, and VAE decoding to produce dynamic video content. Key features:
Supports Chinese prompts (e.g., "滑雪的男人" - "a man skiing")
Configurable frame rate (default: 16fps) and resolution (480x768)
Includes negative prompts for quality filtering
2. Core Models
Model Name | Function | Installation |
---|---|---|
Wan2.1-T2V-1.3B | Video diffusion backbone | Manual download ( |
umt5-xxl-enc | Chinese text encoder | Place in |
Wan2.1_VAE | Latent space decoder | Manual download |
3. Key Nodes
LoadWanVideoT5TextEncoder
Loads the Chinese text encoder (umt5-xxl-enc). Use bf16 precision to save VRAM.
WanVideoTextEncode
Processes positive/negative prompts. Example negative prompts filter low-quality content.
WanVideoModelLoader
Loads the main video model with options for fp32/fp16 and VRAM optimization.
WanVideoSampler
Core sampler parameters:
steps: 10 (lower for faster video generation)
cfg_scale: 6 (lower for creative freedom)
sampler: dpm++
VHS_VideoCombine
Combines frames into MP4 video with configurable:
Frame rate (16fps)
Output format (H.264, CRF=19)
Filename prefix (WanVideo2_1_T2V)
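To make the export settings concrete, here is a minimal sketch of a roughly equivalent encode with the ffmpeg command line, using the frame rate, codec, CRF, and filename prefix listed above. It is not how VHS_VideoCombine is implemented. The function name, the frame naming pattern (frame_00001.png), and the output filename are assumptions made for illustration.

```python
from pathlib import Path


def build_ffmpeg_command(frames_dir, output_dir, prefix="WanVideo2_1_T2V",
                         fps=16, crf=19):
    """Return an ffmpeg argument list that encodes numbered PNG frames to H.264 MP4."""
    # Hypothetical frame naming: frame_00001.png, frame_00002.png, ...
    frame_pattern = str(Path(frames_dir) / "frame_%05d.png")
    # Hypothetical output name built from the filename prefix.
    output_path = str(Path(output_dir) / f"{prefix}.mp4")
    return [
        "ffmpeg",
        "-framerate", str(fps),  # input frame rate (16fps in this workflow)
        "-i", frame_pattern,
        "-c:v", "libx264",       # H.264 encoder
        "-crf", str(crf),        # constant rate factor; lower means higher quality
        "-pix_fmt", "yuv420p",   # pixel format most players accept
        output_path,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so no frames are needed to try this.
    print(" ".join(build_ffmpeg_command("frames", "output")))
```

Lowering the CRF value increases quality and file size; raising it does the opposite.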
4. Workflow Structure
Group 1: Text Processing
Input: Chinese prompt
Output: Text embeddings
Key nodes: LoadWanVideoT5TextEncoder → WanVideoTextEncode
Group 2: Video Generation
Input: Text embeds + empty image embeds (480x768)
Output: Latent video data
Key nodes: WanVideoSampler
Group 3: Video Export
Input: Decoded image sequence
Output: MP4 file
Key nodes: WanVideoDecode → VHS_VideoCombine
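The three groups form a simple dependency graph. The sketch below models it with Python's standard graphlib module and derives one valid execution order. The node names come from this guide, but the edges are an assumption inferred from the group descriptions, and any loader for the VAE is left out because the guide does not name one.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes whose output it consumes.
# Edges are inferred from the group descriptions, not read from the workflow file.
WORKFLOW = {
    "LoadWanVideoT5TextEncoder": set(),
    "WanVideoTextEncode": {"LoadWanVideoT5TextEncoder"},
    "WanVideoModelLoader": set(),
    "WanVideoEmptyEmbeds": set(),
    "WanVideoSampler": {
        "WanVideoTextEncode",
        "WanVideoModelLoader",
        "WanVideoEmptyEmbeds",
    },
    "WanVideoDecode": {"WanVideoSampler"},
    "VHS_VideoCombine": {"WanVideoDecode"},
}


def execution_order(graph):
    """Return the nodes in an order where every node follows its inputs."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for step, node in enumerate(execution_order(WORKFLOW), start=1):
        print(step, node)
```

Nodes with no inputs may appear in any relative order; the only guarantee is that each node runs after everything it depends on, so VHS_VideoCombine always comes last.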
5. I/O Specifications
Input Parameters:
Resolution: 480x768 (set in WanVideoEmptyEmbeds)
Seed: Fixed/Random (example: 1057359483639287)
Prompts: Natural Chinese language (avoid complex syntax)
Output:
MP4 video (saved to ComfyUI output folder)
Includes generation metadata
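As a small illustration of the seed and frame-rate settings, the sketch below shows the difference between a fixed and a random seed and how clip length follows from the frame rate. The function names, the random seed range, and the frame count in the example are assumptions; this guide does not state how many frames the workflow generates.

```python
import random

EXAMPLE_SEED = 1057359483639287  # example seed from this guide


def resolve_seed(mode, fixed_seed=EXAMPLE_SEED):
    """Return the fixed seed, or draw a new one when mode is 'random'."""
    if mode == "fixed":
        return fixed_seed  # same seed and settings reproduce the same video
    if mode == "random":
        return random.randrange(2**63)  # assumed range, for illustration only
    raise ValueError(f"unknown seed mode: {mode!r}")


def clip_duration_seconds(num_frames, fps=16):
    """Length of the exported clip in seconds."""
    return num_frames / fps


if __name__ == "__main__":
    print(resolve_seed("fixed"))
    # Hypothetical frame count: 80 frames at 16fps gives a 5.0 second clip.
    print(clip_duration_seconds(80))
```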
6. Notes
⚠️ VRAM Requirements
Minimum 12GB (16GB recommended)
Enable offload_device for optimization
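To see why precision matters for VRAM, here is a back-of-the-envelope estimate of the memory taken by the diffusion model weights alone, using the 1.3B parameter count from the model name. It ignores activations, the text encoder, and the VAE, so actual usage is higher than these figures. The function name is illustrative.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def weight_memory_gb(num_params, precision):
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for precision in ("fp32", "fp16", "bf16"):
        size = weight_memory_gb(1.3e9, precision)
        print(f"{precision}: {size:.1f} GB")
```

For 1.3 billion parameters this gives about 5.2 GB at fp32 and 2.6 GB at fp16 or bf16, which is why half precision halves the weight footprint.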
⚠️ Model Installation
Download Wan2.1 models manually from official sources
Text encoder path: models/wan_t5/umt5-xxl-enc-bf16.safetensors
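A quick way to confirm the manual download landed in the right place is to check for the file before loading the workflow. The sketch below assumes the path above is relative to the ComfyUI installation folder; the function name and the example root folder are illustrative.

```python
from pathlib import Path

# Text encoder path from this guide, assumed relative to the ComfyUI root.
TEXT_ENCODER_PATH = "models/wan_t5/umt5-xxl-enc-bf16.safetensors"


def missing_model_files(comfyui_root, relative_paths=(TEXT_ENCODER_PATH,)):
    """Return the relative paths that do not exist under comfyui_root."""
    root = Path(comfyui_root)
    return [p for p in relative_paths if not (root / p).is_file()]


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point this at the actual installation folder.
    missing = missing_model_files("ComfyUI")
    if missing:
        print("Missing model files:")
        for path in missing:
            print("  " + path)
    else:
        print("All listed model files found.")
```

Other model files can be checked the same way by passing their paths in relative_paths.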
⚠️ Dependencies
Requires ComfyUI-WanVideoWrapper & VideoHelperSuite
Install via ComfyUI Manager