Unlock Seamless Outfit Transitions with WAN2.1 Video Model & GIMM-VFI Interpolation
1. Workflow Overview

Designed for Douyin-style smooth outfit-transition videos, this workflow pairs the WAN2.1 video model with GIMM-VFI frame interpolation to achieve 60fps output. It features image-to-video conversion, dynamic LoRA control, smart block swapping, and dual-format export (MP4/GIF).
2. Core Models
Model Name | Purpose |
---|---|
Wan2_1-I2V-14B-720P_fp8_e4m3fn | Main model (14B params, 720P output) |
WAN2.1 ZOEY Outfit LoRA | Controls outfit transition (strength=1.0) |
gimmvfi_r_arb_lpips_fp32 | Frame interpolation model (3x multiplier) |
3. Critical Nodes
WanVideoSLG (Skip Layer Guidance)
Param: blocks=9 (skips the unconditional guidance pass on transformer block 9)
GIMMVFI_interpolate
Key Param: interpolation_factor=3
Install: Requires ComfyUI-GIMM-VFI
WanVideoBlockSwap
Swaps 10 transformer blocks out to the offload device to reduce VRAM usage during generation
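To make the effect of interpolation_factor=3 concrete, here is a minimal sketch of the frame-count arithmetic. It assumes the interpolator keeps every source frame and inserts (factor - 1) new frames between each adjacent pair; that behavior, the function names, and the 81-frame clip length are illustrative assumptions, not taken from ComfyUI-GIMM-VFI.

```python
def interpolated_frame_count(source_frames: int, factor: int) -> int:
    """Frames after interpolation, assuming (factor - 1) new frames
    are inserted between every adjacent pair of source frames."""
    if source_frames < 2 or factor < 1:
        raise ValueError("need at least 2 source frames and factor >= 1")
    return (source_frames - 1) * factor + 1


def playback_seconds(frame_count: int, fps: float) -> float:
    """Clip duration when frame_count frames are played back at fps."""
    return frame_count / fps


if __name__ == "__main__":
    # 81 source frames is a hypothetical clip length, not a workflow setting.
    frames = interpolated_frame_count(81, 3)
    print(frames, "frames,", round(playback_seconds(frames, 30), 2), "s at 30fps")
```

Under these assumptions the frame count roughly triples, so the playback frame rate must rise by about the same factor to keep the clip duration unchanged.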
4. Pipeline Structure

graph TB
A[Input Image] --> B[Feature Extraction]
B --> C[Block-controlled Generation]
C --> D[3x Interpolation]
D --> E[Dual-format Export]
5. I/O Specification
Input Requirements:
Image Size: 1024×1440 (auto-adjusted to 720P)
Recommended: Upper-body portraits
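The workflow does not spell out how the auto-adjustment to 720P is computed. As a rough sketch, assuming it scales the image down to fit a 1280×720 pixel budget while keeping the aspect ratio and snapping each side down to a multiple of 16 (both assumptions, as is the function name), the arithmetic would look like this:

```python
import math


def fit_to_pixel_budget(width: int, height: int,
                        budget: int = 1280 * 720,
                        multiple: int = 16) -> tuple[int, int]:
    """Scale (width, height) down to fit within `budget` pixels,
    preserving aspect ratio and snapping each side down to `multiple`."""
    # Never upscale: cap the scale factor at 1.0.
    scale = min(1.0, math.sqrt(budget / (width * height)))
    new_w = int(width * scale) // multiple * multiple
    new_h = int(height * scale) // multiple * multiple
    return new_w, new_h


if __name__ == "__main__":
    print(fit_to_pixel_budget(1024, 1440))
```

Pre-resizing the portrait to a size like this before loading it avoids surprises from cropping or rounding inside the graph.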
Output:
30fps MP4 (H.264)
16fps GIF (looping)
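Inside ComfyUI the export is handled by the workflow's own nodes. To reproduce the same two outputs from a folder of rendered frames outside ComfyUI, a sketch using the ffmpeg command line could look like the following; the file names and helper functions are hypothetical, and ffmpeg is assumed to be installed separately.

```python
def mp4_command(frame_pattern: str, out_path: str, fps: int = 30) -> list[str]:
    """Build an ffmpeg command that encodes an image sequence to H.264 MP4."""
    return [
        "ffmpeg", "-y",
        "-framerate", str(fps),   # input frame rate of the image sequence
        "-i", frame_pattern,      # e.g. "frames/%05d.png"
        "-c:v", "libx264",        # H.264 encoder
        "-pix_fmt", "yuv420p",    # widely compatible pixel format
        out_path,
    ]


def gif_command(video_path: str, out_path: str, fps: int = 16) -> list[str]:
    """Build an ffmpeg command that converts a video to an endlessly looping GIF."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-vf", f"fps={fps}",      # resample to the GIF frame rate
        "-loop", "0",             # 0 means loop forever
        out_path,
    ]


if __name__ == "__main__":
    print(" ".join(mp4_command("frames/%05d.png", "outfit.mp4")))
    print(" ".join(gif_command("outfit.mp4", "outfit.gif")))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to perform the actual encode.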
6. Important Notes
⚠️ Mandatory Setup:
git clone https://github.com/Kijai/ComfyUI-WanVideoWrapper
⚠️ VRAM Requirements:
Base Generation: ≥8GB
Interpolation: ≥12GB
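A quick pre-flight check against these thresholds can be sketched as below. It assumes an NVIDIA GPU with `nvidia-smi` on the PATH; the helper names are hypothetical, and because cards often report slightly less than their nominal capacity, the result is a rough guide only.

```python
import subprocess

BASE_GENERATION_GB = 8    # threshold from the notes above
INTERPOLATION_GB = 12     # threshold from the notes above


def query_nvidia_smi() -> str:
    """Return total memory per GPU in MiB, one value per line."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_total_vram_gb(nvidia_smi_output: str) -> float:
    """Convert the first GPU's reported MiB value to GiB."""
    first_line = nvidia_smi_output.strip().splitlines()[0]
    return int(first_line.strip()) / 1024


def stages_that_fit(total_gb: float) -> dict[str, bool]:
    """Report which workflow stages meet the stated VRAM minimums."""
    return {
        "base_generation": total_gb >= BASE_GENERATION_GB,
        "interpolation": total_gb >= INTERPOLATION_GB,
    }


if __name__ == "__main__":
    # Sample value instead of a live query so the sketch runs anywhere.
    print(stages_that_fit(parse_total_vram_gb("16384\n")))
```

On a real machine, replace the sample string with `query_nvidia_smi()`.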
⚠️ Key Features:
Uses offload_device for low-VRAM operation
Preserves facial features during transitions