1. Workflow

This workflow uses Wan2.1, the model from Alibaba Cloud (Aliyun), for Text-to-Video (T2V) generation. It chains text encoding, video diffusion, and VAE decoding to turn a text prompt into a video. Key features:

Supports Chinese prompts (e.g., "滑雪的男人" - "a man skiing")
Configurable frame rate (default: 16fps) and resolution (480x768)
Includes negative prompts for quality filtering

2. Core Models

Model Name	Function	Installation
Wan2.1-T2V-1.3B	Video diffusion backbone	Manual download (`.safetensors`)
umt5-xxl-enc	Multilingual text encoder (handles Chinese prompts)	Place in `models/wan_t5`
Wan2.1_VAE	Latent space decoder	Manual download

3. Key Nodes

LoadWanVideoT5TextEncoder
Loads the umt5-xxl-enc text encoder, a multilingual model that handles the Chinese prompts. Use bf16 precision to save VRAM.
WanVideoTextEncode
Encodes the positive and negative prompts. The negative prompt lists traits to steer away from, such as low-quality artifacts.
WanVideoModelLoader
Loads the main video model with options for fp32/fp16 and VRAM optimization.
WanVideoSampler
Core sampler parameters:
- steps: 10 (lower for faster video generation)
- cfg_scale: 6 (lower for creative freedom)
- sampler: dpm++
VHS_VideoCombine
Combines frames into MP4 video with configurable:
- Frame rate (16fps)
- Output format (H.264, CRF=19)
- Filename prefix (WanVideo2_1_T2V)
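
For readers who want to reproduce the export settings outside ComfyUI, here is a minimal sketch of an ffmpeg argument list using the same frame rate, codec, and CRF. It is an assumption that this approximates what VHS_VideoCombine produces; the node may pass other options. The frame file pattern and output filename are hypothetical.

```python
def ffmpeg_h264_args(frame_pattern: str, output_path: str,
                     fps: int = 16, crf: int = 19) -> list[str]:
    """Build an ffmpeg argument list that encodes numbered frames to H.264."""
    return [
        "ffmpeg", "-y",            # overwrite the output file if it exists
        "-framerate", str(fps),    # input frame rate (16 fps in this workflow)
        "-i", frame_pattern,       # hypothetical pattern, e.g. frames/%05d.png
        "-c:v", "libx264",         # H.264 encoder
        "-crf", str(crf),          # quality setting (19 in this workflow)
        "-pix_fmt", "yuv420p",     # widely compatible pixel format
        output_path,
    ]


if __name__ == "__main__":
    # Hypothetical paths; pass the list to subprocess.run() to execute it.
    print(" ".join(ffmpeg_h264_args("frames/%05d.png", "WanVideo2_1_T2V_00001.mp4")))
```

Lower CRF values give higher quality and larger files, so 19 leans toward quality over size.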

4. Workflow Structure

Group 1: Text Processing

Input: Chinese prompt
Output: Text embeddings
Key nodes: LoadWanVideoT5TextEncoder → WanVideoTextEncode

Group 2: Video Generation

Input: Text embeds + empty image embeds (480x768)
Output: Latent video data
Key nodes: WanVideoSampler

Group 3: Video Export

Input: Latent video data, decoded into an image sequence by WanVideoDecode
Output: MP4 file
Key nodes: WanVideoDecode → VHS_VideoCombine
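
The three groups above can be written down as plain data to make the node order explicit. This is only an illustrative sketch: the `Stage` class and `node_order` helper are hypothetical and not part of ComfyUI, and the loader nodes (WanVideoModelLoader, WanVideoEmptyEmbeds) are left out because the group list does not name them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One workflow group: what it consumes, what it produces, which nodes run."""
    name: str
    consumes: str
    produces: str
    nodes: tuple[str, ...]


STAGES = (
    Stage("Text Processing", "Chinese prompt", "text embeddings",
          ("LoadWanVideoT5TextEncoder", "WanVideoTextEncode")),
    Stage("Video Generation", "text embeds + empty image embeds (480x768)",
          "latent video data", ("WanVideoSampler",)),
    # WanVideoDecode turns the latents into frames before they are combined.
    Stage("Video Export", "latent video data", "MP4 file",
          ("WanVideoDecode", "VHS_VideoCombine")),
)


def node_order(stages: tuple[Stage, ...]) -> list[str]:
    """Flatten the groups into the order in which the key nodes execute."""
    return [node for stage in stages for node in stage.nodes]


if __name__ == "__main__":
    print(" -> ".join(node_order(STAGES)))
```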

5. I/O Specifications

Input Parameters:

Resolution: 480x768 (set in WanVideoEmptyEmbeds)
Seed: Fixed/Random (example: 1057359483639287)
Prompts: Natural Chinese language (avoid complex syntax)
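
The difference between a fixed and a random seed can be sketched as follows. The mode names and the upper bound on random seeds are assumptions for illustration, not the wrapper's actual widget values; only the example seed comes from this document.

```python
import random

EXAMPLE_SEED = 1057359483639287  # example value quoted in this document


def resolve_seed(mode: str, fixed_seed: int = EXAMPLE_SEED) -> int:
    """Return the seed to use for one generation run.

    "fixed" reuses the same seed, so identical settings reproduce the same
    video; "random" draws a fresh seed, so each run differs.
    """
    if mode == "fixed":
        return fixed_seed
    if mode == "random":
        return random.randrange(2**63)  # assumed range, not from the source
    raise ValueError(f"unknown seed mode: {mode!r}")
```

Keeping the seed fixed while changing one setting at a time (steps, cfg_scale) makes it easier to see what each parameter does.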

Output:

MP4 video (saved to ComfyUI output folder)
Includes generation metadata

6. Notes

⚠️ VRAM Requirements

Minimum 12GB (16GB recommended)
Enable offload_device to reduce VRAM usage
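
As a rough sanity check on these figures, the memory taken by the diffusion model's weights alone can be estimated from the parameter count and the precision. This is a back-of-the-envelope sketch: the 1.3 billion figure is taken from the model name (Wan2.1-T2V-1.3B), and the estimate leaves out the text encoder, the VAE, and activations, which account for much of the remaining VRAM.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def weight_size_gib(num_params: float, dtype: str) -> float:
    """Estimate the size of model weights in GiB (weights only)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    for dtype in ("fp32", "fp16", "bf16"):
        print(f"{dtype}: {weight_size_gib(1.3e9, dtype):.2f} GiB")
```

Halving the precision from fp32 to fp16 or bf16 halves the weight footprint, which is why the lower-precision options help on 12GB cards.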

⚠️ Model Installation

Download Wan2.1 models manually from official sources
Text encoder path: models/wan_t5/umt5-xxl-enc-bf16.safetensors
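
Before loading the workflow, it can help to confirm that the files are where the nodes expect them. The helper below is a hypothetical sketch, not part of ComfyUI; the ComfyUI root directory is an assumption you would replace with your own install path, and only the text encoder path is taken from this document.

```python
from pathlib import Path


def missing_files(comfy_root: str, relative_paths: list[str]) -> list[str]:
    """Return the relative paths that do not exist under the ComfyUI root."""
    root = Path(comfy_root)
    return [rel for rel in relative_paths if not (root / rel).is_file()]


if __name__ == "__main__":
    # "ComfyUI" is a placeholder for your own install directory.
    expected = ["models/wan_t5/umt5-xxl-enc-bf16.safetensors"]
    for rel in missing_files("ComfyUI", expected):
        print(f"missing: {rel}")
```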

⚠️ Dependencies

Requires ComfyUI-WanVideoWrapper & VideoHelperSuite
Install via ComfyUI Manager

Summary

Generate dynamic videos with text prompts using Aliyun's Wan2.1 model! Learn how to utilize this Text-to-Video workflow with Chinese support, customizable frame rates, and resolutions. Discover the core models, key nodes, and workflow structure.

CustomNodes:

LoadWanVideoT5TextEncoder WanV...