Creating Blazing Videos with WAN2.1: A Workflow for Image-to-Video Transformation
1. Workflow Overview

This is a WAN2.1-based video generation workflow specializing in:
Image-to-Video: Transform static images into dynamic fire/flame effects.
Advanced Lighting: Simulate fluid fire motion via the WAN2.1-I2V model.
Post-Processing: Includes frame interpolation (RIFE), upscaling (CR Upscale), and video synthesis.
2. Core Models
| Model Name | Function |
|---|---|
| WAN2.1-I2V-14B-480P | Main model for image-to-video conversion (requires |
| Fire Flame LoRA | Fine-tunes flame fluid effects (download to |
| 4xRealWebPhoto_v4 | Upscales video resolution (install via ComfyUI Manager). |
3. Key Nodes
| Node Name | Function | Installation |
|---|---|---|
| | Loads WAN2.1 video generation model. | Manual model download required. |
| | Applies flame effect LoRA. | Built-in (requires LoRA file). |
| | Controls video sampling (frames/seed). | Built-in. |
| | Frame interpolation (smoother motion). | Install |
| | 4x video upscaling. | Install |
Dependencies:
Models:
Main: Wan2_1-I2V-14B-480P_fp8_e4m3fn.safetensors → models/wan_video.
LoRA: fire_fluid_lora_480V1.0.safetensors → models/loras.
Plugins: Install ComfyUI-WanVideoWrapper and ComfyUI-VideoHelperSuite.
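Because a missing model file is the most likely cause of a node error, it can help to check the two files before opening the workflow. The following is a minimal sketch, not part of the workflow itself: the filenames and folders come from the dependency list above, while the function name find_missing_models and the install root "ComfyUI" are assumptions chosen for illustration.

```python
from pathlib import Path

# Folder -> filename, as listed in the dependency list above.
REQUIRED_MODELS = {
    "models/wan_video": "Wan2_1-I2V-14B-480P_fp8_e4m3fn.safetensors",
    "models/loras": "fire_fluid_lora_480V1.0.safetensors",
}


def find_missing_models(comfy_root):
    """Return the relative paths of required model files that are not on disk."""
    root = Path(comfy_root)
    missing = []
    for folder, filename in REQUIRED_MODELS.items():
        if not (root / folder / filename).is_file():
            missing.append(f"{folder}/{filename}")
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a hypothetical install location; adjust it to your setup.
    for path in find_missing_models("ComfyUI"):
        print("missing:", path)
```

An empty result means both files are where the loaders expect them; it does not verify that the files are complete or uncorrupted.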
4. Workflow Structure
Group 1: Load Image & Models
Input: Static image (e.g., portrait).
Output: Image embeddings + model loaded.
Key Nodes: LoadImage, WanVideoModelLoader.
Group 2: Prompt & Effect Control
Input: Positive prompt (e.g., "woman engulfed in flames") + negative prompt.
Output: Conditioned text/image embeddings.
Key Nodes: WanVideoTextEncode, WanVideoImageClipEncode.
Group 3: Video Generation & Post-Processing
Input: Conditioned embeddings + model params.
Output: Raw video + interpolated/upscaled result.
Key Nodes: WanVideoSampler, RIFE VFI, CR Upscale Image.
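The three groups can be treated as a checklist against an exported workflow. The sketch below assumes the workflow was saved in ComfyUI's API format (a JSON object mapping node ids to entries with a "class_type" field) and that each class_type string matches the node names listed above; both are assumptions, and the helper name missing_nodes_by_group is hypothetical.

```python
# Key nodes per group, using the names given in the workflow structure above.
GROUP_NODES = {
    "Load Image & Models": ["LoadImage", "WanVideoModelLoader"],
    "Prompt & Effect Control": ["WanVideoTextEncode", "WanVideoImageClipEncode"],
    "Video Generation & Post-Processing": [
        "WanVideoSampler",
        "RIFE VFI",
        "CR Upscale Image",
    ],
}


def missing_nodes_by_group(workflow):
    """Map each group name to the key nodes that the workflow does not contain.

    `workflow` is assumed to be an API-format dict: node id -> {"class_type": ...}.
    Groups with nothing missing are left out of the result.
    """
    present = {node.get("class_type") for node in workflow.values()}
    report = {}
    for group, nodes in GROUP_NODES.items():
        absent = [name for name in nodes if name not in present]
        if absent:
            report[group] = absent
    return report


# Hypothetical, incomplete workflow used only to exercise the check.
example = {
    "1": {"class_type": "LoadImage", "inputs": {}},
    "2": {"class_type": "WanVideoModelLoader", "inputs": {}},
    "3": {"class_type": "WanVideoTextEncode", "inputs": {}},
}
print(missing_nodes_by_group(example))
```

A non-empty report usually points to a plugin that is not installed or a node that was deleted from the graph.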
5. Input & Output
Input Parameters:
Image: Recommended resolution 480x704.
Prompts: Describe flame effects (e.g., "dynamic orange fire").
Frame Rate: Default 16fps, up to 32fps after interpolation.
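The jump from 16fps to 32fps corresponds to a 2x interpolation factor. As a rough sketch of the arithmetic, the function below assumes the interpolator inserts (multiplier - 1) new frames between each pair of source frames; some implementations count frames differently, and the 81-frame clip length in the example is a hypothetical value, not one taken from this workflow.

```python
def interpolated_clip(frame_count, fps=16, multiplier=2):
    """Estimate frame rate, frame count, and length after frame interpolation.

    Assumes (multiplier - 1) frames are inserted between each adjacent pair,
    so the first and last source frames are kept unchanged.
    """
    if frame_count < 1 or fps <= 0 or multiplier < 1:
        raise ValueError("frame_count, fps, and multiplier must be positive")
    out_fps = fps * multiplier
    out_frames = (frame_count - 1) * multiplier + 1
    return {"fps": out_fps, "frames": out_frames, "seconds": out_frames / out_fps}


# Hypothetical 81-frame clip at the default 16fps, interpolated 2x.
print(interpolated_clip(81))
```

Playback length stays almost the same; interpolation adds in-between frames rather than extending the clip.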
Output:
Video file (MP4), optionally upscaled to 960p.
6. Notes
VRAM: ≥16GB GPU recommended (video generation + interpolation are intensive).
Common Errors:
Missing model files trigger node errors (verify paths/filenames).
Incorrect image aspect ratio may cause distortion (use ImageScaleByAspectRatio).
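To see why the aspect ratio matters, it helps to work out what a distortion-free resize looks like. The sketch below is not the implementation of ImageScaleByAspectRatio; it only illustrates one common approach (scale until the image covers the target, then center-crop), and it assumes 480x704 means width x height. The function name cover_and_crop is hypothetical.

```python
def cover_and_crop(src_w, src_h, dst_w=480, dst_h=704):
    """Compute a uniform resize and a centered crop box for the target size.

    Returns ((scaled_w, scaled_h), (left, top, right, bottom)). The image is
    scaled by one factor on both axes, so nothing is stretched; the excess on
    the longer axis is cropped away instead.
    """
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("all dimensions must be positive")
    scale = max(dst_w / src_w, dst_h / src_h)
    # Guard against rounding leaving the scaled image one pixel too small.
    scaled_w = max(dst_w, round(src_w * scale))
    scaled_h = max(dst_h, round(src_h * scale))
    left = (scaled_w - dst_w) // 2
    top = (scaled_h - dst_h) // 2
    return (scaled_w, scaled_h), (left, top, left + dst_w, top + dst_h)


# A square 1024x1024 source loses its left and right edges, not its proportions.
print(cover_and_crop(1024, 1024))
```

Padding instead of cropping is the other common choice when no part of the source image may be cut off.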
Optimization:
Enable offload_device in teacache_args to reduce VRAM usage.
Lower WanVideoSampler steps (e.g., 30 → 20) for faster generation.
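If the workflow is driven from a script rather than the UI, the step count can be lowered by editing the exported JSON. This is a sketch under two assumptions: the workflow is in ComfyUI's API format (node id mapped to "class_type" and "inputs"), and the sampler exposes its step count under an input key named "steps". Verify both against your own export; set_sampler_steps is a hypothetical helper.

```python
import copy


def set_sampler_steps(workflow, steps, class_type="WanVideoSampler"):
    """Return a patched copy of the workflow and the number of nodes changed.

    The input key "steps" is an assumption about the sampler node; the
    original workflow dict is left untouched.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    patched = copy.deepcopy(workflow)
    changed = 0
    for node in patched.values():
        if node.get("class_type") == class_type:
            node.setdefault("inputs", {})["steps"] = steps
            changed += 1
    return patched, changed


# Hypothetical two-node fragment of an API-format workflow.
workflow = {
    "7": {"class_type": "WanVideoSampler", "inputs": {"steps": 30, "seed": 1}},
    "8": {"class_type": "LoadImage", "inputs": {}},
}
patched, changed = set_sampler_steps(workflow, 20)
print(changed, patched["7"]["inputs"]["steps"])
```

Fewer steps trade some detail for speed, so compare a short test render before committing to the lower value.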