ComfyUI E-commerce Product Animation: Boost Sales with I2V Technology

ComfyUI.org
2025-05-20 08:28:15

1. Workflow Overview


This workflow generates e-commerce product videos from static images (e.g., jewelry, apparel), highlighting product details and how the item looks when worn. Key features:

  • Image-to-Video (I2V): Converts product images into dynamic clips (e.g., subtle bracelet movement on a model's wrist).

  • Smart Resizing: Automatically fits the input to the video aspect ratio (e.g., 350x350 → 832x480), using letterbox padding rather than cropping content away.

  • Commercial-Grade Output: 30fps H.264 encoding (CRF19) balances quality and file size.
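
The 350x350 → 832x480 example can be worked through by hand. The sketch below is not the resize node's actual code; it assumes a plain letterbox fit (scale to fit, center, pad the rest), and the function name letterbox_fit is made up for illustration.

```python
def letterbox_fit(src_w, src_h, dst_w=832, dst_h=480):
    """Scale the source to fit inside the target without cropping, then center it.

    Returns (new_w, new_h, pad_left, pad_top). Hypothetical helper, not the
    node's implementation.
    """
    scale = min(dst_w / src_w, dst_h / src_h)  # the tighter dimension limits the scale
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    pad_left = (dst_w - new_w) // 2  # horizontal bar width on each side
    pad_top = (dst_h - new_h) // 2   # vertical bar height on each side
    return new_w, new_h, pad_left, pad_top


print(letterbox_fit(350, 350))    # (480, 480, 176, 0)
print(letterbox_fit(1440, 1920))  # (360, 480, 236, 0)
```

Under this assumption a square product shot becomes a 480x480 image with a 176-pixel bar on each side, so roughly 42% of the frame width is padding. A vertical 1440x1920 shot leaves even wider bars.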

Core Models:

  • Wan2.1-I2V-14B: Main video model (480P, bf16/fp8 mixed precision).

  • UMT5-XXL Text Encoder: Processes Chinese product descriptions.

  • CLIP Vision Encoder: Analyzes image composition/color.


2. Key Components

Critical Nodes:

  1. WanVideoModelLoader:

    • Loads the Wan2.1 model (Wan2_1-I2V-14B-480P_fp8_e4m3fn.safetensors) with the attention mode set to sdpa (scaled dot-product attention).

  2. WanVideoTextEncode:

    • Input product prompts, e.g.:

      • Positive: "Bracelet showcase video, highlighting craftsmanship."

      • Negative: "low quality, blurry, distortion"

  3. ImageScaleByAspectRatio V2:

    • Resizes input (e.g., 847b03d5...jpg) with letterbox padding.

  4. WanVideoSampler:

    • Settings:

      • Steps: 25

      • Sampler: unipc (fast convergence)

      • Seed: Random by default; fix it (e.g., 773185989414318) to reproduce a result.

  5. VHS_VideoCombine:

    • Exports a 30fps MP4 (default filename prefix: AnimateDiff).
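
For batch runs it can help to pin the sampler settings in code instead of editing the node by hand. This is a minimal sketch, assuming the workflow has been exported in ComfyUI's API format (a JSON object mapping node ids to class_type and inputs). The input names seed and steps are assumptions about the WanVideoSampler node, and the node id "27" in the demo is invented.

```python
import copy


def pin_sampler_settings(workflow, seed=773185989414318, steps=25):
    """Return a copy of an API-format workflow with the sampler seed and steps fixed.

    Assumes the sampler inputs are called "seed" and "steps"; check the
    exported JSON before relying on this.
    """
    patched = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    for node in patched.values():
        if node.get("class_type") == "WanVideoSampler":
            node["inputs"]["seed"] = seed
            node["inputs"]["steps"] = steps
    return patched


# Toy stand-in for an exported workflow; node id and starting values are made up.
demo = {"27": {"class_type": "WanVideoSampler", "inputs": {"seed": 0, "steps": 30}}}
print(pin_sampler_settings(demo)["27"]["inputs"])
```

Working on a deep copy keeps the original export reusable, so the same base workflow can be patched with several seeds in a loop.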

Dependencies:

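  • Install ComfyUI-VideoHelperSuite (provides VHS_VideoCombine). The WanVideo nodes come from ComfyUI-WanVideoWrapper and ImageScaleByAspectRatio V2 from ComfyUI_LayerStyle; install those too if they are missing.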

  • Download models:

    • Main model: save to ComfyUI/models/wan_video

    • VAE: Wan2_1_VAE_bf16.safetensors
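
Before loading the workflow, a quick check that both files are in place can save a failed run. The sketch below uses the two file names mentioned in this article and searches the whole models folder, since only the main model's folder is given here. The function name and the "ComfyUI" root path are illustrative assumptions.

```python
from pathlib import Path

# File names as given in this article.
MAIN_MODEL = "Wan2_1-I2V-14B-480P_fp8_e4m3fn.safetensors"
VAE_MODEL = "Wan2_1_VAE_bf16.safetensors"


def find_missing_models(comfy_root, names=(MAIN_MODEL, VAE_MODEL)):
    """Return the model file names not found anywhere under <comfy_root>/models."""
    models_dir = Path(comfy_root) / "models"
    if not models_dir.is_dir():
        return list(names)
    # Search recursively because subfolder layouts differ between setups.
    found = {p.name for p in models_dir.rglob("*.safetensors")}
    return [name for name in names if name not in found]


if __name__ == "__main__":
    missing = find_missing_models("ComfyUI")  # assumed install location
    print("All model files found." if not missing else f"Missing: {missing}")
```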


3. Workflow Structure

  1. Model Loading:

    • Load video model, text encoder, VAE.

  2. Input Processing (Operation Group):

    • Upload product image → Resize → CLIP encode.

  3. Video Generation:

    • Fuse image features + text → Sampler → Latent frames.

  4. Output:

    • Decode latent → MP4 synthesis (with metadata).
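
The four stages above run as a single job once the workflow is queued. For unattended runs, a workflow exported in API format can be submitted to a running ComfyUI server over HTTP. This sketch assumes the server listens on the default local address 127.0.0.1:8188; the helper names and the workflow file path passed to queue_workflow are hypothetical.

```python
import json
import urllib.request
import uuid


def build_prompt_request(workflow, server="http://127.0.0.1:8188"):
    """Build (without sending) the POST request that queues an API-format workflow."""
    payload = {"prompt": workflow, "client_id": uuid.uuid4().hex}
    return urllib.request.Request(
        f"{server}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow_path, server="http://127.0.0.1:8188"):
    """Load an API-format workflow file and submit it to a running server."""
    with open(workflow_path, encoding="utf-8") as fh:
        workflow = json.load(fh)
    with urllib.request.urlopen(build_prompt_request(workflow, server)) as resp:
        return json.load(resp)  # the server's JSON reply
```

Splitting request construction from sending makes it possible to inspect the payload without a server running. The reply from the server normally includes a prompt_id that identifies the queued job.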

Key Parameters:

  • Resolution: Fixed 480P (832x480 landscape).

  • Frame Rate: 30fps (set in VHS_VideoCombine).
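
To reproduce similar encoding settings outside ComfyUI, for example when re-encoding exported frames, the parameters map onto standard ffmpeg options. This is a rough equivalent rather than the exact command VHS_VideoCombine runs; the frame pattern and output name are placeholders.

```python
import shlex


def ffmpeg_encode_command(frame_pattern, output, fps=30, crf=19, size=(832, 480)):
    """Build an ffmpeg command for H.264 at this article's frame rate, CRF and size."""
    width, height = size
    return [
        "ffmpeg", "-y",
        "-framerate", str(fps),            # input frame rate
        "-i", frame_pattern,               # numbered frames, e.g. frame_00001.png
        "-vf", f"scale={width}:{height}",  # force the 480P landscape size
        "-c:v", "libx264",
        "-crf", str(crf),                  # lower CRF means higher quality, larger file
        "-pix_fmt", "yuv420p",             # widely compatible pixel format
        output,
    ]


# Placeholder paths for illustration only.
cmd = ffmpeg_encode_command("frames/frame_%05d.png", "bracelet.mp4")
print(shlex.join(cmd))
```

The command is returned as a list so it can be passed directly to subprocess.run without shell quoting issues.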


4. Input & Output

Input Parameters:

  • Image: Square/vertical (e.g., 1440x1920), clear product focus.

  • Text Prompt: Concise product highlights (Chinese preferred).

Output:

  • 480P MP4 video (e.g., bracelet animation), saved to ComfyUI/output.


5. Notes

  • VRAM: ≥12GB required (16GB recommended).

  • Image Tips:

    • Clear subject, simple background (pre-cropped).

    • Avoid text or watermarks, which can interfere with how CLIP reads the image.

  • Troubleshooting:

    • Choppy video? Try reducing the WanVideoSampler steps (e.g., from 25 to 20).

    • Distortion? Check that the ImageScaleByAspectRatio V2 mode is set to letterbox.
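
The VRAM guidance in these notes (at least 12GB, 16GB recommended) can be checked before a first run. This is a minimal sketch assuming PyTorch with CUDA is installed, as it normally is in a ComfyUI environment. The helper names and the three verdict labels are made up for illustration.

```python
def vram_verdict(total_gb, minimum=12, recommended=16):
    """Classify a VRAM size against the thresholds given in the notes."""
    if total_gb < minimum:
        return "insufficient"
    if total_gb < recommended:
        return "minimum"
    return "recommended"


def detect_vram_gb():
    """Return GPU 0's total memory in whole GiB, or None without PyTorch/CUDA."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    total_bytes = torch.cuda.get_device_properties(0).total_memory
    # Cards usually report slightly less than their nominal size, so round.
    return round(total_bytes / 1024**3)


gb = detect_vram_gb()
print("No CUDA GPU detected" if gb is None else f"{gb} GiB: {vram_verdict(gb)}")
```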