Bringing Figurines to Life: The WAN2.1 I2V Workflow Guide
1. Workflow Overview

This WAN2.1 I2V workflow specializes in transforming static figurine images into animated videos. Key features:
Input: A PNG image (e.g., ComfyUI_temp_sydys_00036_.png) → Output: 3D rotation/transformation video
Supports background preservation and motion control (via the SLGArgs node)
Exports MP4 video (720P, 16 FPS by default)
2. Core Models
| Model Name | Function |
|---|---|
| | Main I2V temporal generation model |
| | Multilingual text encoder (supports Chinese) |
| | Video latent space decoder |
3. Key Nodes
| Node Name | Function | Installation |
|---|---|---|
| | Loads main model + LoRA (figurine style) | Requires |
| | Processes prompts (e.g., "transform into 3D figurine") | Same as above |
| | Controls motion amplitude ( | Same as above |
| | Video rendering (requires | Install via ComfyUI Manager |
4. Workflow Groups
Model Loading Group
Nodes: LoadWanVideoT5TextEncoder, WanVideoVAELoader
Input: Model paths (e.g., umt5-xxl-enc-bf16.safetensors)
Output: Initialized encoders + VAE
Image Processing Group
Node: WanVideoImageClipEncode
Input: 1024x1440 PNG image (transparent background recommended)
Output: Image embeddings
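Before queueing the workflow, it can help to confirm that the input really is a 1024x1440 PNG with an alpha channel. The following is a minimal sketch that reads only the PNG header using the standard library; the function names are hypothetical and are not part of ComfyUI or the WanVideo nodes.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(data: bytes) -> dict:
    """Parse width, height, and alpha presence from the first 26 bytes of a PNG."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    if data[12:16] != b"IHDR":
        raise ValueError("missing IHDR chunk")
    # IHDR payload starts at byte 16: width, height (big-endian uint32),
    # then bit depth and color type (one byte each).
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    # Color types 4 (grey + alpha) and 6 (RGBA) carry an alpha channel.
    # Palette images with a tRNS chunk are not detected by this check.
    return {"width": width, "height": height, "has_alpha": color_type in (4, 6)}


def check_input_image(data: bytes, expected=(1024, 1440)) -> list:
    """Return a list of human-readable problems; an empty list means the image fits."""
    info = read_png_header(data)
    problems = []
    if (info["width"], info["height"]) != expected:
        problems.append(
            f"size is {info['width']}x{info['height']}, expected {expected[0]}x{expected[1]}"
        )
    if not info["has_alpha"]:
        problems.append("no alpha channel (transparent background recommended)")
    return problems
```

Typical use would be `check_input_image(open(path, "rb").read(26))`, since only the first 26 bytes are inspected.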
Video Generation Group
Nodes: WanVideoSampler + WanVideoDecode
Params: Seed 34660692369907, Steps 20
Output: Latent video sequence
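If the workflow is exported in ComfyUI's API format (a JSON mapping from node id to `class_type` and `inputs`), the seed and step count can be set from a script instead of the UI. The sketch below assumes the sampler's inputs are named `seed` and `steps`; verify those names against your own export. The two-node fragment is a hypothetical stand-in for the real workflow.

```python
import copy


def set_sampler_params(workflow: dict, seed: int, steps: int) -> dict:
    """Return a copy of an API-format workflow with seed/steps set on each WanVideoSampler."""
    patched = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    for node in patched.values():
        if node.get("class_type") == "WanVideoSampler":
            # Input names "seed" and "steps" are assumptions, not confirmed by this guide.
            node.setdefault("inputs", {}).update({"seed": seed, "steps": steps})
    return patched


# Hypothetical fragment; a real export contains many more nodes and inputs.
workflow = {
    "10": {"class_type": "WanVideoSampler", "inputs": {"seed": 0, "steps": 30}},
    "11": {"class_type": "WanVideoDecode", "inputs": {}},
}
patched = set_sampler_params(workflow, seed=34660692369907, steps=20)
```

Working on a deep copy keeps the original export reusable, which is convenient when sweeping over several seeds.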
Video Export Group
Node: VHS_VideoCombine
Params: FPS 16, CRF 19
Output: WanVideo2_1_00001.mp4
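To make the export settings concrete, here is a rough equivalent expressed as an ffmpeg command line built in Python. This is not how VHS_VideoCombine is implemented. It only mirrors the documented settings (16 FPS, CRF 19, H.264, YUV420P) and assumes the decoded frames were saved as numbered PNGs under a hypothetical `frames/` directory.

```python
def build_ffmpeg_command(frame_pattern: str, output_path: str,
                         fps: int = 16, crf: int = 19) -> list:
    """Build an ffmpeg argument list for an H.264 / yuv420p encode of numbered frames."""
    return [
        "ffmpeg", "-y",             # overwrite the output file if it exists
        "-framerate", str(fps),     # input frame rate of the image sequence
        "-i", frame_pattern,
        "-c:v", "libx264",          # H.264 encoder
        "-crf", str(crf),           # lower CRF means higher quality and larger files
        "-pix_fmt", "yuv420p",      # widely compatible pixel format
        output_path,
    ]


# The frame path is hypothetical; the output name matches the one in this guide.
cmd = build_ffmpeg_command("frames/frame_%05d.png", "WanVideo2_1_00001.mp4")
print(" ".join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` if ffmpeg is on the PATH.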
5. Inputs & Outputs
Inputs:
Image: 1024x1440 PNG (transparent background)
Prompt: "A girl rotates 360° and transforms into a 3D figurine"
Negative Prompt: "Low quality, background change, distorted limbs"
Output:
720P MP4 video (H.264, YUV420P)
6. Notes
Hardware Requirements:
VRAM ≥16GB (due to bf16 + fp8_e4m3fn optimization)
Launch with the --highvram flag
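A quick pre-flight check of the VRAM requirement might look like the sketch below. It assumes PyTorch is installed (as it is in a normal ComfyUI environment) and compares in decimal gigabytes, because a card sold as "16GB" usually reports slightly less than 16 GiB. Both function names are hypothetical.

```python
def meets_vram_requirement(total_bytes: int, min_gb: float = 16.0) -> bool:
    """Compare reported VRAM against the requirement, using decimal GB (10**9 bytes)."""
    return total_bytes / 10**9 >= min_gb


def detect_vram_bytes():
    """Return total memory of CUDA device 0 in bytes, or None if it cannot be determined."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


vram = detect_vram_bytes()
if vram is None:
    print("Could not detect a CUDA device")
else:
    print("VRAM OK" if meets_vram_requirement(vram) else "Less than 16GB VRAM detected")
```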
Troubleshooting:
Model Path Error: Ensure the .safetensors files are in ComfyUI/models/wanvideo/
Video Flickering: Adjust the WanVideoTeaCache parameter (0.3)
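For the model path error, a small script can report which expected files are missing from the models folder. This is a sketch using only the standard library. `find_missing_models` is a hypothetical helper, and `example-vae.safetensors` is a placeholder name, not a real model file. The demo runs against a temporary directory so it is self-contained; in practice, pass `ComfyUI/models/wanvideo/` and your own file list.

```python
import tempfile
from pathlib import Path


def find_missing_models(models_dir, expected_names):
    """Return the expected .safetensors file names not found under models_dir (recursive)."""
    root = Path(models_dir)
    present = {p.name for p in root.rglob("*.safetensors")} if root.is_dir() else set()
    return [name for name in expected_names if name not in present]


# Self-contained demo: one file present, one placeholder file absent.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "umt5-xxl-enc-bf16.safetensors").touch()
    missing = find_missing_models(
        tmp, ["umt5-xxl-enc-bf16.safetensors", "example-vae.safetensors"]
    )
print(missing)  # ['example-vae.safetensors']
```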