Unlock Stunning Video Generation with Style Control: A Comprehensive Workflow Guide
1. Workflow Overview

This workflow generates video with style control, using Alibaba's Wan2.1-Fun-InP model (InP refers to inpainting, here meaning the frames between given images are filled in) to enhance details. Key features:
Start/End Frame Driven: Generates intermediate frames between two input images.
FunControl: Dynamic style interpolation via WanVideoBlockSwap.
Multimodal Support: Integrates CLIP vision encoding, T5 text encoding, and VAE decoding.
Core Models:
Wan2.1-Fun-InP-14B: 14B-parameter video model with FP8 quantization (VRAM-optimized).
umt5-xxl-enc: Multilingual T5 text encoder for complex prompts.
OpenCLIP-ViT-H: Vision encoder for image feature extraction.
2. Key Components
Critical Nodes:
WanVideoModelLoader:
Function: Loads the main model (Wan2.1-Fun-InP-14B_fp8_e4m3fn.safetensors).
Installation: Manually download and place in ComfyUI/models/wan_video/.
Dependency: FP8 requires NVIDIA Ampere/Ada GPUs.
WanVideoClipVisionEncode:
Function: Encodes start/end frames using OpenCLIP.
Model: open-clip-xlm-roberta-large-vit-huge-14_fp16.safetensors (from HuggingFace).
WanVideoSampler:
Function: Controls sampling steps (30), CFG scale (6), and motion intensity (slg_args).
VHS_VideoCombine:
Function: Renders frame sequences to MP4/GIF (H.264 supported).
Installation: Requires the ComfyUI-VideoHelperSuite plugin.
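Before opening the workflow, it can help to confirm that the main model file sits where the loader expects it. The following is a minimal sketch and not part of the workflow itself: the helper name find_missing_models is hypothetical, and it only checks the ComfyUI/models/wan_video/ folder named above, since the locations of the other models are not stated in this guide.

```python
from pathlib import Path

# File name and folder are taken from the WanVideoModelLoader notes above.
MAIN_MODEL = "Wan2.1-Fun-InP-14B_fp8_e4m3fn.safetensors"
MODEL_SUBDIR = Path("models") / "wan_video"


def find_missing_models(comfy_root, filenames=(MAIN_MODEL,)):
    """Return the names in `filenames` not found in <comfy_root>/models/wan_video/.

    `find_missing_models` is a hypothetical helper for illustration only.
    """
    model_dir = Path(comfy_root) / MODEL_SUBDIR
    return [name for name in filenames if not (model_dir / name).is_file()]
```

Calling find_missing_models("ComfyUI") from the folder that contains the ComfyUI install returns an empty list when the file is in place, and a list with the file name otherwise.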
3. Workflow Groups
Group Logic:
Group 1: Frame Processing
Input: Two images (loaded via LoadImage).
Process: Resize to 640x480 (ImageResizeKJ), add labels (AddLabel).
Group 2: Video Generation
Input: Encoded image features + text prompts.
Output: Latent sequence (decoded by WanVideoDecode).
Group 3: Post-Processing
Output: Final video (MP4/GIF) + previews (with metadata).
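Group 1 exists so that both frames reach the later groups at the same size. As an illustration outside ComfyUI, that check can be sketched in plain Python. The helper frames_needing_resize is hypothetical, and the 640x480 target is simply the value used in this workflow.

```python
TARGET_SIZE = (640, 480)  # (width, height) used by Group 1 in this workflow


def frames_needing_resize(frame_sizes, target=TARGET_SIZE):
    """Return the indices of frames whose (width, height) differs from `target`.

    Hypothetical helper: it only compares sizes and does not touch image data.
    """
    return [i for i, size in enumerate(frame_sizes) if tuple(size) != tuple(target)]


# Example: a 1280x720 start frame and an end frame that is already 640x480.
sizes = [(1280, 720), (640, 480)]
print(frames_needing_resize(sizes))  # prints [0]
```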
4. Inputs & Outputs
Input Parameters:
Images: Start/end frames (recommended ≥480p).
Text Prompt: English descriptions (e.g., "digital wireframe video with lightsaber").
Performance Args:
teacache_args: VRAM optimization (threshold 0.3).
experimental_args: Interpolation modes (2,3).
Output:
Video Files: MP4 (H.264) and GIF formats.
Previews: Labeled frame comparisons.
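The values quoted in this guide can be collected in one place for note-taking or scripting. This is only a sketch: the WorkflowSettings class and its field names are hypothetical and do not correspond to actual ComfyUI node inputs. The defaults are the numbers given in this guide (30 steps, CFG 6, TeaCache threshold 0.3, 640x480 frames).

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class WorkflowSettings:
    """Hypothetical container for the settings quoted in this guide."""
    prompt: str
    steps: int = 30
    cfg_scale: float = 6.0
    teacache_threshold: float = 0.3
    width: int = 640
    height: int = 480

    def __post_init__(self):
        # Basic sanity checks; these limits are assumptions, not node constraints.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.steps <= 0:
            raise ValueError("steps must be positive")


settings = WorkflowSettings(prompt="digital wireframe video with lightsaber")
print(asdict(settings))
```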
5. Notes
VRAM: Minimum 16GB (24GB for FP8).
Compatibility:
Switch to the 1.3B model if VRAM is insufficient (edit WanVideoModelLoader).
Missing models trigger console download prompts.
Debugging:
Frame size mismatch crashes ImageConcatMulti.
Avoid non-English prompts (T5 trained on English).
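The VRAM guidance in these notes amounts to a simple rule. A minimal sketch, assuming the 16GB minimum quoted above is the cut-off for the 14B model: choose_model_size is a hypothetical helper, and measuring the available VRAM is left to the reader.

```python
MIN_VRAM_GB_14B = 16  # minimum quoted in the notes above


def choose_model_size(vram_gb):
    """Return "14B" when VRAM meets the quoted minimum, otherwise "1.3B".

    Hypothetical helper; the threshold is an assumption based on this guide.
    """
    if vram_gb < 0:
        raise ValueError("vram_gb must be non-negative")
    return "14B" if vram_gb >= MIN_VRAM_GB_14B else "1.3B"


print(choose_model_size(12))  # prints 1.3B
```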