Creating Blazing Videos with WAN2.1: A Workflow for Image-to-Video Transformation

ComfyUI.org
2025-05-02 05:19:18

1. Workflow Overview

This is a WAN2.1-based video generation workflow specializing in:

  • Image-to-Video: Transform static images into dynamic fire/flame effects.

  • Advanced Lighting: Simulate fluid fire motion with the WAN2.1-I2V model.

  • Post-Processing: Includes frame interpolation (RIFE), upscaling (CR Upscale), and video synthesis.

2. Core Models

| Model Name | Function |
|---|---|
| WAN2.1-I2V-14B-480P | Main model for image-to-video conversion (requires .safetensors file). |
| Fire Flame LoRA | Fine-tunes flame fluid effects (download to models/loras). |
| 4xRealWebPhoto_v4 | Upscales video resolution (install via ComfyUI Manager). |

3. Key Nodes

| Node Name | Function | Installation |
|---|---|---|
| WanVideoModelLoader | Loads the WAN2.1 video generation model. | Part of ComfyUI-WanVideoWrapper; manual model download required. |
| WanVideoLoraSelect | Applies the flame effect LoRA. | Part of ComfyUI-WanVideoWrapper (requires the LoRA file). |
| WanVideoSampler | Controls video sampling (frames/seed). | Part of ComfyUI-WanVideoWrapper. |
| RIFE VFI | Frame interpolation (smoother motion). | Install ComfyUI-Frame-Interpolation. |
| CR Upscale Image | 4x video upscaling. | Install ComfyUI_Comfyroll_CustomNodes. |

Dependencies:

  • Models:

    • Main: Wan2_1-I2V-14B-480P_fp8_e4m3fn.safetensors → models/wan_video.

    • LoRA: fire_fluid_lora_480V1.0.safetensors → models/loras.

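  • Plugins: Install ComfyUI-WanVideoWrapper, ComfyUI-VideoHelperSuite, ComfyUI-Frame-Interpolation (provides RIFE VFI), and ComfyUI_Comfyroll_CustomNodes (provides CR Upscale Image).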
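
Since a missing or misnamed model file is a likely first stumbling block, a quick pre-flight check can help. The sketch below is only an illustration: the file names and folders come from the dependency list above, while the `ComfyUI` root path and the helper name `missing_model_files` are assumptions made for this example.

```python
from pathlib import Path

# File names and folders are taken from the dependency list above;
# the ComfyUI root path and this helper's name are hypothetical.
REQUIRED_FILES = {
    "models/wan_video": "Wan2_1-I2V-14B-480P_fp8_e4m3fn.safetensors",
    "models/loras": "fire_fluid_lora_480V1.0.safetensors",
}


def missing_model_files(comfy_root, required=REQUIRED_FILES):
    """Return the expected file paths that do not exist under comfy_root."""
    root = Path(comfy_root)
    missing = []
    for folder, filename in required.items():
        path = root / folder / filename
        if not path.is_file():
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Adjust "ComfyUI" to wherever your installation lives.
    for path in missing_model_files("ComfyUI"):
        print(f"missing: {path}")
```

If the script prints nothing, both files are where the workflow expects them.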

4. Workflow Structure

  • Group 1: Load Image & Models

    • Input: Static image (e.g., portrait).

    • Output: Image embeddings + model loaded.

    • Key Nodes: LoadImage, WanVideoModelLoader.

  • Group 2: Prompt & Effect Control

    • Input: Positive prompt (e.g., "woman engulfed in flames") + negative prompt.

    • Output: Conditioned text/image embeddings.

    • Key Nodes: WanVideoTextEncode, WanVideoImageClipEncode.

  • Group 3: Video Generation & Post-Processing

    • Input: Conditioned embeddings + model params.

    • Output: Raw video + interpolated/upscaled result.

    • Key Nodes: WanVideoSampler, RIFE VFI, CR Upscale Image.
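
To make the data flow between the three groups easier to see, here is a rough sketch of the structure as plain Python data. The node names come from the groups above; the intermediate names (`image`, `model`, `text_embeds`, `image_embeds`) are simplified assumptions, and this is not ComfyUI's actual workflow JSON format.

```python
# Illustrative only: intermediate names are simplified assumptions,
# and this is not the format ComfyUI uses to store workflows.
GROUPS = [
    {
        "name": "Load Image & Models",
        "nodes": ["LoadImage", "WanVideoModelLoader"],
        "needs": [],
        "produces": ["image", "model"],
    },
    {
        "name": "Prompt & Effect Control",
        "nodes": ["WanVideoTextEncode", "WanVideoImageClipEncode"],
        "needs": ["image"],
        "produces": ["text_embeds", "image_embeds"],
    },
    {
        "name": "Video Generation & Post-Processing",
        "nodes": ["WanVideoSampler", "RIFE VFI", "CR Upscale Image"],
        "needs": ["model", "text_embeds", "image_embeds"],
        "produces": ["video"],
    },
]


def unmet_inputs(groups):
    """Map each group name to the inputs that no earlier group produces."""
    available = set()
    problems = {}
    for group in groups:
        missing = [item for item in group["needs"] if item not in available]
        if missing:
            problems[group["name"]] = missing
        available.update(group["produces"])
    return problems


print(unmet_inputs(GROUPS))  # {} when every group's inputs are satisfied
```

Dropping or bypassing the first group, for example, would leave the later groups without an image or a model, which is roughly what a broken link in the graph looks like.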

5. Input & Output

  • Input Parameters:

    • Image: Recommended resolution 480x704.

    • Prompts: Describe flame effects (e.g., "dynamic orange fire").

    • Frame Rate: Default 16fps, up to 32fps after interpolation.

  • Output:

    • Video file (MP4), optionally upscaled to 960p.
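
The jump from 16fps to 32fps follows from simple arithmetic, sketched below. It assumes the interpolator inserts (multiplier - 1) new frames between each adjacent pair of frames and that playback speed is raised by the same factor; the 81-frame clip length is a hypothetical example value, not a setting from this workflow.

```python
def interpolate_timing(num_frames, fps, multiplier=2):
    """Return (frame_count, fps) after frame interpolation.

    Assumes (multiplier - 1) frames are inserted between each adjacent
    pair and the frame rate is scaled by the same factor, so the clip
    duration stays roughly the same.
    """
    if num_frames < 1 or multiplier < 1:
        raise ValueError("num_frames and multiplier must be at least 1")
    new_frames = (num_frames - 1) * multiplier + 1
    return new_frames, fps * multiplier


# 81 frames is a hypothetical clip length used only for illustration.
frames, fps = interpolate_timing(81, 16, multiplier=2)
print(frames, fps)  # 161 32
```

Under these assumptions the clip keeps its length of about five seconds (81 / 16 ≈ 161 / 32) while the motion looks smoother.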

6. Notes

  • VRAM: A GPU with ≥16GB of VRAM is recommended (video generation and interpolation are both memory-intensive).

  • Common Errors:

    • Missing model files trigger node errors (verify paths/filenames).

    • Incorrect image aspect ratio may cause distortion (use ImageScaleByAspectRatio).

  • Optimization:

    • Set cache_device to offload_device in teacache_args (the WanVideoTeaCache node) to reduce VRAM usage.

    • Lower WanVideoSampler steps (e.g., 30 → 20) for faster generation.
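
On the aspect-ratio note above: one common way to avoid distortion is to center-crop the source image to the target ratio before resizing it to 480x704. The helper below is a generic sketch of that idea; its name is hypothetical, and it is not necessarily what ImageScaleByAspectRatio does internally.

```python
def crop_box_for_aspect(width, height, target_w=480, target_h=704):
    """Return (left, top, right, bottom) of the largest centered crop
    of a width x height image that matches the target aspect ratio."""
    # Compare ratios by cross-multiplying to stay in integer arithmetic.
    if width * target_h > height * target_w:
        # Source is too wide: keep the full height and trim the sides.
        new_w = height * target_w // target_h
        left = (width - new_w) // 2
        return left, 0, left + new_w, height
    # Source is too tall (or already matches): keep the full width.
    new_h = width * target_h // target_w
    top = (height - new_h) // 2
    return 0, top, width, top + new_h


print(crop_box_for_aspect(1024, 1024))  # (163, 0, 861, 1024)
```

The returned box uses the (left, upper, right, lower) convention, so it can be passed to an image library's crop function, after which the result can be resized to 480x704 without stretching.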