Unlock Time-Lapse Aging Videos with Wan2.1 I2V Model: A Step-by-Step Guide

ComfyUI.org
2025-05-27 07:38:50

1. Workflow Overview

  • Purpose: Generate time-lapse aging videos from a single portrait using the Wan2.1 I2V model.

  • Key Tech:

    • Wan2.1-I2V-14B: Image-to-video diffusion model (480P output).

    • Aging LoRA: 老化延时摄影 (Wan2.1 I2V LoRA)_v1.0 (the name means "aging time-lapse photography") enhances facial aging effects.

    • Multimodal Control: CLIP vision encoder + T5 text encoder for precise conditioning.

2. Core Models

Model Name                     | Function
-------------------------------|------------------------------------------------------------------
Wan2_1-I2V-14B-480P_fp8_e4m3fn | Base video generation model (input: image → output: frames).
Aging LoRA                     | Adds wrinkles, gray hair, and skin texture changes (strength=1.0).
umt5-xxl-enc-bf16              | Text encoder for prompts (e.g., "a monk gradually ages").

3. Key Nodes

3.1 Model Loading

  • WanVideoModelLoader:

    • Loads base model (place .safetensors in models/wan_video).

    • Settings: bf16 base precision, fp8_e4m3fn quantization, and offload_device as the load device to reduce VRAM use.

  • WanVideoLoraSelect:

    • Applies aging LoRA (requires 老化延时摄影.safetensors).
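Before loading the workflow, it can help to confirm that the model files are where the loaders expect them. The following is a minimal sketch, not part of the workflow itself. The models/wan_video folder comes from this guide, while the models/loras and models/vae folders are assumed ComfyUI defaults and may differ in your install. The helper name find_missing_models is hypothetical.

```python
from pathlib import Path

# The base-model folder is the one named in this guide. The LoRA and VAE
# folders are ASSUMED ComfyUI defaults; adjust them to match your setup.
EXPECTED_FILES = {
    "models/wan_video": "Wan2_1-I2V-14B-480P_fp8_e4m3fn.safetensors",
    "models/loras": "老化延时摄影.safetensors",
    "models/vae": "Wan2_1_VAE_bf16.safetensors",
}


def find_missing_models(comfy_root):
    """Return relative paths of expected model files that are not on disk."""
    root = Path(comfy_root)
    missing = []
    for folder, filename in EXPECTED_FILES.items():
        if not (root / folder / filename).is_file():
            missing.append(f"{folder}/{filename}")
    return missing
```

Calling find_missing_models("ComfyUI") from the directory that contains your ComfyUI checkout returns an empty list when all three files are in place.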

3.2 Input Processing

  • LoadImage:

    • Input portrait (e.g., "Asian child monk in gray robe.jpeg").

  • WanVideoImageClipEncode:

    • Extracts image features with the CLIP vision encoder and outputs them as image_embeds.

3.3 Text Control

  • WanVideoTextEncode:

    • Positive prompt (describes aging process) + Negative prompt (avoids artifacts).

    • Example:

      Positive: "A child monk in gray robe slowly ages from youth to old age..."  
      Negative: "Overexposed, static frames, blurry details..."  

3.4 Video Generation

  • WanVideoSampler:

    • Key params: 25 steps, dpm++_sde sampler, fixed seed (1057359483639287).

  • VHS_VideoCombine:

    • Renders frames to MP4 (16FPS, H.264, CRF=19, output: output.mp4).
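For readers who want to reproduce the export settings outside ComfyUI, the sketch below builds a roughly equivalent ffmpeg command (16 FPS, H.264, CRF 19). It is an approximation under stated assumptions, not what VHS_VideoCombine runs internally. The function names, the frame filename pattern, and the yuv420p pixel format are illustrative choices that do not come from the workflow.

```python
def build_ffmpeg_args(frame_pattern, output_path="output.mp4", fps=16, crf=19):
    """Build an ffmpeg command that encodes numbered frames to an H.264 MP4."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-framerate", str(fps),  # input frame rate (16 FPS in this workflow)
        "-i", frame_pattern,     # hypothetical pattern, e.g. frames/frame_%05d.png
        "-c:v", "libx264",       # H.264 encoder
        "-crf", str(crf),        # quality target; lower values mean higher quality
        "-pix_fmt", "yuv420p",   # assumed here for broad player compatibility
        output_path,
    ]


def clip_seconds(frame_count, fps=16):
    """Duration in seconds of a clip with the given number of frames."""
    return frame_count / fps
```

The argument list can be passed to subprocess.run(..., check=True) once ffmpeg is installed. As a worked example with a hypothetical frame count, clip_seconds(81) gives 5.0625, so 81 frames at 16 FPS play for about five seconds.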

4. Workflow Structure

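Stage            | Key Nodes                                             | Function
-----------------|-------------------------------------------------------|--------------------------------------
Model Load       | WanVideoModelLoader + WanVideoLoraSelect              | Loads base model and aging LoRA.
Input Encoding   | LoadImage → WanVideoImageClipEncode, WanVideoTextEncode | Encodes image and text conditions.
Video Generation | WanVideoSampler → WanVideoDecode                      | Generates and decodes latent frames.
Video Export     | VHS_VideoCombine                                      | Outputs final video file.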
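The stage order can also be read as a dependency graph, which is how ComfyUI decides what to run first. The sketch below models only the nodes named in this guide and derives a valid execution order with Python's standard graphlib module. The edges are inferred from the stage descriptions and are an assumption. The real workflow has additional loader nodes (for example for the text encoder and the VAE) that are omitted here.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes whose outputs it consumes.
# ASSUMPTION: edges are inferred from the stage descriptions in this guide.
WORKFLOW_DEPS = {
    "WanVideoLoraSelect": set(),
    "WanVideoModelLoader": {"WanVideoLoraSelect"},
    "LoadImage": set(),
    "WanVideoImageClipEncode": {"LoadImage"},
    "WanVideoTextEncode": set(),
    "WanVideoSampler": {
        "WanVideoModelLoader",
        "WanVideoImageClipEncode",
        "WanVideoTextEncode",
    },
    "WanVideoDecode": {"WanVideoSampler"},
    "VHS_VideoCombine": {"WanVideoDecode"},
}


def execution_order(deps):
    """Return one valid order in which every node runs after its inputs."""
    return list(TopologicalSorter(deps).static_order())
```

Whatever order the sorter picks for the independent loaders and encoders, the sampler always comes after all three of its inputs, and VHS_VideoCombine is always last.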

5. Inputs & Outputs

  • Inputs:

    • Single portrait image (e.g., 768x1024 JPEG).

    • Text prompts (describe aging process).

  • Output:

    • MP4 video (default 480x272, 16FPS).

6. Notes

  1. Hardware:

    • Recommended: 12GB+ VRAM (RTX 3060 12GB or higher; the RTX 3060 Ti has only 8GB).

  2. Dependencies:

    • Manual downloads required:

      • Base model: Wan2_1-I2V-14B-480P_fp8_e4m3fn.safetensors

      • LoRA: 老化延时摄影.safetensors

      • VAE: Wan2_1_VAE_bf16.safetensors

  3. Troubleshooting:

    • Reduce steps (25→20) or resolution if generation is too slow or runs out of VRAM.

    • Adjust LoRA strength (default 1.0) for natural aging.
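To see why the strength value changes how pronounced the aging looks, it helps to recall the general LoRA idea: a low-rank update is added to each affected weight matrix, scaled by the strength. The sketch below illustrates that generic formulation in pure Python on tiny matrices. It is not the code the Wan2.1 wrapper uses, it leaves out the alpha/rank scaling that real LoRA files carry, and the function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_down, lora_up, strength=1.0):
    """Return weight + strength * (lora_up @ lora_down)."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example: a 2x2 base weight with a rank-1 update.
base = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # shape 2x1
down = [[0.5, 0.0]]    # shape 1x2

full = apply_lora(base, down, up, strength=1.0)  # [[1.5, 0.0], [1.0, 1.0]]
half = apply_lora(base, down, up, strength=0.5)  # [[1.25, 0.0], [0.5, 1.0]]
```

Halving the strength halves the update, and a strength of 0 leaves the base weights unchanged, so values below 1.0 move the output back toward the base model's behavior.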