1. Workflow

Purpose: Retarget motion from a source video onto a target character using the Wan2.1-Fun-Control model.
Key Tech:
- Pose Extraction: DWPreprocessor detects body keypoints in each frame of the input video.
- Multimodal Control: Combines CLIP vision features, T5 text conditioning, and depth maps (DepthAnythingPreprocessor).
- Temporal Coherence: WanFunControlToVideo generates frame-consistent video.

2. Core Models

Model Name	Function
Wan2.1-Fun-Control-14B	Base motion control model (14B params, FP8 optimized).
umt5-xxl_fp8_e4m3fn_scaled	Text encoder for prompts (e.g., negative prompts to filter bad frames).
depth_anything_vitl14	Depth preprocessor for spatial consistency.
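
As a rough check on why FP8 matters for the 14B model, the sketch below estimates the memory taken by the weights alone at two precisions. The function name is hypothetical, and the figures ignore activations, the text encoder, and the VAE, so actual usage will be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for model weights only, in GiB."""
    return num_params * bytes_per_param / 1024**3


PARAMS_14B = 14e9  # parameter count taken from the model name

fp8_gib = weight_memory_gib(PARAMS_14B, 1)   # FP8: 1 byte per parameter
fp16_gib = weight_memory_gib(PARAMS_14B, 2)  # FP16: 2 bytes per parameter

print(f"FP8 weights:  {fp8_gib:.1f} GiB")
print(f"FP16 weights: {fp16_gib:.1f} GiB")
```

Under these assumptions, FP8 storage halves the weight footprint compared with FP16, which is what makes the 14B model fit on a single consumer GPU.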

3. Key Nodes

3.1 Input Processing

VHS_LoadVideo:
- Loads the input video (e.g., 5月12日 0.8.mp4) and extracts frames (25 FPS by default).
LoadImage:
- Loads target character image (e.g., 00088-3677135724.png).

3.2 Motion Analysis

DWPreprocessor:
- Extracts pose keypoints (using yolox_l.onnx and dw-ll_ucoco_384).
DepthAnythingPreprocessor:
- Generates depth maps for background alignment.

3.3 Video Generation

WanFunControlToVideo:
- Key params: 832x480 output, 81 frames (~3.24s), CFG=1.0.
- Inputs: Pose keypoints + CLIP features + text conditioning.
KSampler:
- Settings: 20 steps, Euler sampler, fixed seed (198).
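
The relationship between frame count, frame rate, and clip length quoted above (81 frames at 25 FPS is about 3.24 s) can be sketched as follows. The helper names are hypothetical. The rule that the frame count should have the form 4n + 1 is an assumption about Wan-style video models, not something stated in this workflow, so verify it against the node's allowed values before relying on it.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Length of a clip in seconds for a given frame count and frame rate."""
    return num_frames / fps


def frames_for_duration(seconds: float, fps: float) -> int:
    """Pick a frame count close to the requested duration.

    Assumption: valid lengths have the form 4n + 1 (e.g. 81 = 4 * 20 + 1).
    """
    target = round(seconds * fps)
    n = max(0, round((target - 1) / 4))
    return 4 * n + 1


print(clip_duration_seconds(81, 25))   # duration of the default 81-frame clip
print(frames_for_duration(3.24, 25))   # frame count for a 3.24 s clip at 25 FPS
```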

3.4 Post-Processing

SkipLayerGuidanceWanVideo:
- Skips layers 9 and 10 of the diffusion transformer (Wan2.1 is not UNet-based) at 0.2 strength to balance detail and fluency.
WanVideoEnhanceAVideoKJ:
- Reduces flickering (strength=0.2).

4. Workflow Structure

Stage	Key Nodes	Function
Input Prep	VHS_LoadVideo + LoadImage	Loads video and target image.
Motion Extract	DWPreprocessor → DepthAnything	Extracts poses and depth maps.
Conditioning	CLIPTextEncode + CLIPVisionEncode	Encodes text/visual conditions.
Video Gen	WanFunControlToVideo → KSampler	Renders motion-retargeted frames.
Output Export	VHS_VideoCombine	Final video (H.264, CRF=15).
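
For readers who want to reproduce the export settings outside ComfyUI, the sketch below builds an ffmpeg command with the same codec and quality values (H.264, CRF 15, 25 FPS). This is a standalone equivalent under stated assumptions, not a description of what VHS_VideoCombine does internally. The function name, the frame pattern, and the output path are hypothetical.

```python
def build_ffmpeg_args(frame_pattern: str, output_path: str,
                      fps: int = 25, crf: int = 15) -> list[str]:
    """Build an ffmpeg argument list that encodes numbered frames to H.264."""
    return [
        "ffmpeg", "-y",
        "-framerate", str(fps),   # input frame rate of the image sequence
        "-i", frame_pattern,      # e.g. frames/00001.png, frames/00002.png, ...
        "-c:v", "libx264",        # H.264 encoder
        "-crf", str(crf),         # lower CRF means higher quality, larger file
        "-pix_fmt", "yuv420p",    # widely compatible pixel format
        output_path,
    ]


args = build_ffmpeg_args("frames/%05d.png", "retargeted.mp4")
print(" ".join(args))
```

The list can be passed to `subprocess.run(args, check=True)` once ffmpeg is installed and the frames exist on disk.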

5. Inputs & Outputs

Inputs:
- Source video (MP4, 25FPS recommended).
- Target character image (PNG/JPG, transparent background preferred).
- Optional text prompts (style control).
Output:
- Motion-retargeted video (default 832x480, 25FPS).

6. Notes

Hardware:
- 16GB+ VRAM (RTX 4080+ recommended for 14B model).
- Enable FP8 optimization (fp8_e4m3fn) for lower VRAM usage.
Dependencies:
- Download Wan2.1-Fun-Control-14B and depth_anything_vitl14.pth manually.
Troubleshooting:
- Reduce flickering: Increase KSampler steps (20→30) or lower SkipLayerGuidance strength (0.2→0.1).
- Resolution errors: Make sure the source video and the target image share the same aspect ratio (e.g., both 512x512).
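
One way to avoid aspect-ratio mismatches is to center-crop the source to the output aspect ratio before resizing. The sketch below only computes the crop box. The function name is hypothetical, and the 832x480 default is taken from the output size listed earlier.

```python
def center_crop_box(src_w: int, src_h: int,
                    target_w: int = 832, target_h: int = 480) -> tuple[int, int, int, int]:
    """Return (left, top, width, height) of the largest centered crop
    whose aspect ratio matches target_w:target_h."""
    # Compare aspect ratios with integer cross-multiplication to avoid float error.
    if src_w * target_h > src_h * target_w:
        # Source is wider than the target: trim the sides.
        new_w = round(src_h * target_w / target_h)
        return ((src_w - new_w) // 2, 0, new_w, src_h)
    # Source is taller than (or equal to) the target: trim top and bottom.
    new_h = round(src_w * target_h / target_w)
    return (0, (src_h - new_h) // 2, src_w, new_h)


print(center_crop_box(1920, 1080))  # crop for a 1080p source
print(center_crop_box(512, 512))    # crop for a square image
```

After cropping, both the video frames and the character image can be resized to the same output resolution without distortion.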


Summary

Discover how to retarget motion from a source video to a target character using the Wan2.1-Fun-Control model, a powerful tool for creating realistic character animations. Learn the workflow, key technologies, and core models involved in this innovative process.
