Unlock Traditional Chinese Costume Transformation with WAN2.1 I2V Model

ComfyUI.org
2025-04-28 08:38:34

1. Workflow Overview


This workflow uses the WAN2.1 I2V (image-to-video) model to generate videos in which the subject transforms into traditional Chinese costume. Key features:

  • Image Input: Processes portrait images (e.g., 79-花瓣网.jpg) at 1280x720 resolution.

  • Text Guidance: Uses prompts like "gufengbianshen" to control costume effects.

  • Video Generation: Applies the LoRA "WAN2.1 I2V 国风戏服变身_V1" (Chinese-style opera costume transformation) to produce the dynamic transition.

  • Post-Processing: Optional frame interpolation (GIMMVFI) and upscaling (2xNomosUni).

Use Cases: Historical role animation, short video effects, cultural content creation.


2. Core Models

| Model Name | Function | Installation |
|---|---|---|
| Wan2_1-I2V-14B-720P_fp8_e5m2 | Main model for 720P video generation | Download to models/wan_video |
| umt5-xxl-enc-bf16 | Multilingual text encoder (supports Chinese) | Place in models/t5 |
| Traditional Costume LoRA | Custom style transfer (weight=1.0) | Store in models/lora |
| GIMMVFI Interpolation | Frame interpolation (2x by default) | Install via ComfyUI Manager |
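
As a quick sanity check before loading the workflow, a minimal sketch like the one below can report which of the model folders listed above are missing or empty. The folder names are copied from the table; the `missing_model_dirs` helper and its `comfy_root` argument are hypothetical names used only for illustration, and your installation may use different paths.

```python
from pathlib import Path

# Folder names copied from the table above; adjust them if your install differs.
EXPECTED_MODEL_DIRS = {
    "Wan2_1-I2V-14B-720P_fp8_e5m2": "models/wan_video",
    "umt5-xxl-enc-bf16": "models/t5",
    "Traditional Costume LoRA": "models/lora",
}


def missing_model_dirs(comfy_root):
    """Return the model names whose folder is absent or contains no files."""
    root = Path(comfy_root)
    missing = []
    for model_name, rel_dir in EXPECTED_MODEL_DIRS.items():
        folder = root / rel_dir
        has_files = folder.is_dir() and any(p.is_file() for p in folder.iterdir())
        if not has_files:
            missing.append(model_name)
    return missing


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location relative to the current directory.
    for name in missing_model_dirs("ComfyUI"):
        print(f"Missing or empty folder for: {name}")
```

The check only looks for the presence of files, not for specific filenames, because the exact file extensions are not given in the table.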


3. Key Nodes

| Node Name | Function | Installation |
|---|---|---|
| WanVideoTextEncode | Encodes multilingual prompts | Included with ComfyUI-WanVideoWrapper |
| WanVideoImageClipEncode | Encodes input image to latent features | Same as above |
| WanVideoSampler | Generates video (25 steps, UniPC sampler) | Same as above |
| GIMMVFI_interpolate | Frame interpolation (optional) | Requires ComfyUI-GIMM-VFI plugin |
| ImageSmartSharpen+ | Sharpens output video | Requires ComfyUI-Essentials |


4. Workflow Structure

  1. Input Preprocessing

    • Nodes: LoadImage + ImageScale

    • Input: Source image (auto-resized to 1280x720).

    • Output: Normalized image.

  2. Feature Encoding

    • Nodes: WanVideoTextEncode + WanVideoImageClipEncode

    • Input: Text prompts + resized image.

    • Output: Joint text-image embeddings.

  3. Video Generation

    • Nodes: WanVideoSampler + WanVideoDecode

    • Parameters: Seed 797198568875963, CFG=5, 16 FPS.

    • Output: Raw video (costume transformation).

  4. Post-Processing (Optional)

    • Interpolation: GIMMVFI boosts to 30FPS.

    • Upscaling: 2xNomosUni enhances resolution.
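
To make the frame-rate figures in the steps above concrete, here is a rough sketch of the arithmetic behind interpolation. It assumes the interpolator inserts new frames between each adjacent pair of source frames, which is a common scheme but is not confirmed for GIMMVFI here; the 81-frame clip length and both function names are hypothetical.

```python
from fractions import Fraction


def interpolated_frame_count(frames, factor=2):
    """Frames after inserting (factor - 1) new frames between each adjacent pair."""
    if frames < 1 or factor < 1:
        raise ValueError("frames and factor must be positive")
    return (frames - 1) * factor + 1


def duration_seconds(frames, fps):
    """Playback length of a clip as an exact fraction of a second."""
    return Fraction(frames, fps)


if __name__ == "__main__":
    raw_frames = 81  # hypothetical clip length, not taken from the workflow
    smooth_frames = interpolated_frame_count(raw_frames)
    print("frames after 2x interpolation:", smooth_frames)
    print("raw duration at 16 FPS:", float(duration_seconds(raw_frames, 16)))
    print("interpolated duration at 30 FPS:", float(duration_seconds(smooth_frames, 30)))
```

Under that assumption, a 2x-interpolated 16 FPS clip keeps its original timing when played at 32 FPS, so saving it at 30 FPS would make the motion run slightly slower than the raw output.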


5. Inputs & Outputs

  • Required Inputs:

    • Image: Full-body portrait (plain background recommended).

    • Prompts: Must include keywords like "traditional costume".

  • Output:

    • Video: MP4 (H.264, 16FPS default, 30FPS interpolated).

    • Save Path: ComfyUI/output/WanVideoWrapper_I2V_xxxxx.mp4.
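
If you want to pick up the result from a script, something like the following sketch could locate the newest file matching the save pattern above. The `latest_output_video` helper is a hypothetical name, and the default directory assumes the script runs from the folder that contains ComfyUI.

```python
from pathlib import Path


def latest_output_video(output_dir="ComfyUI/output", prefix="WanVideoWrapper_I2V_"):
    """Return the most recently modified MP4 with the given prefix, or None."""
    candidates = list(Path(output_dir).glob(f"{prefix}*.mp4"))
    if not candidates:
        return None
    # Pick the file with the newest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    video = latest_output_video()
    print(video if video is not None else "No matching output found yet.")
```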


6. Notes

⚠️ Common Issues:

  • VRAM Overflow: The workflow needs a GPU with at least 10GB of VRAM. Enable WanVideoVRAMManagement to reduce memory use.

  • LoRA Failure: Verify filename includes _V1.safetensors.

  • Interpolation Errors: Ensure gimmvfi_r_arb_lpips_fp32.safetensors is in models/gimmvfi.
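
Two of the checks above can be scripted ahead of a run. The sketch below is only illustrative: it treats "includes _V1.safetensors" as "ends with _V1.safetensors", reads the 10GB requirement as 10 GiB, and expects the text printed by `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits` (one value in MiB per GPU). All function names are hypothetical.

```python
def lora_filename_ok(filename, required_suffix="_V1.safetensors"):
    """True if the LoRA filename ends with the expected suffix (an assumed reading)."""
    return filename.endswith(required_suffix)


def vram_gib_from_nvidia_smi(output):
    """Convert nvidia-smi memory.total output (MiB, one line per GPU) to GiB."""
    return [int(line.strip()) / 1024 for line in output.splitlines() if line.strip()]


def meets_vram_requirement(output, minimum_gib=10):
    """True if at least one listed GPU has the minimum amount of VRAM."""
    return any(gib >= minimum_gib for gib in vram_gib_from_nvidia_smi(output))


if __name__ == "__main__":
    sample = "24576\n"  # made-up sample output for a single 24 GiB GPU
    print(meets_vram_requirement(sample))
    print(lora_filename_ok("WAN2.1 I2V 国风戏服变身_V1.safetensors"))
```

Keeping the parser separate from the command call means it can be tried on pasted text, without a GPU attached.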

🔧 Optimization Tips:

  • Disable redundant options in ExperimentalArgs for faster generation.

  • Use ImageUpscaleWithModel for HD output (requires extra VRAM).