Create Breathtaking Architectural Videos with Our Advanced Low-Memory Solution
1. Workflow Overview

A low-VRAM animation generator for architectural scenes featuring:
Long Sequences: 60s generation with 6GB VRAM via FramePack
Memory Optimization: Tiled decoding + temporal slicing
Multimodal Control: CLIP vision + text prompts for motion
Prompt Assistant: Built-in prompt template in the Note node
2. Core Models
Model | Function | Source |
---|---|---|
FramePackI2V_HY | Video diffusion (BF16) | |
hunyuan_video_vae_bf16 | Lightweight VAE | Manual download |
| Visual encoder | Auto-installed |
3. Key Nodes
Node | Purpose | Installation |
---|---|---|
| Frame-wise sampling | |
VAEDecodeTiled | Memory-efficient decoding | Built-in (enable |
| Video rendering | |
| Smart resizing | ComfyUI Manager |
4. Pipeline Stages
Stage 1: Input Processing
Image Input: Load via LoadImage (e.g., work-04.jpg)
Resolution Matching: Auto-optimize with FramePackFindNearestBucket
Feature Extraction: CLIPVisionEncode encodes visual cues
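
Resolution matching can be pictured as choosing, from a fixed list of (height, width) buckets, the one whose aspect ratio is closest to the input image. The sketch below is only an illustration under assumptions: the bucket list is a made-up example, and `find_nearest_bucket` is a hypothetical helper, not the actual code of the FramePackFindNearestBucket node.

```python
# Hypothetical (height, width) buckets; the real node ships its own list.
BUCKETS = [
    (416, 960), (512, 768), (576, 704), (640, 640),
    (704, 576), (768, 512), (960, 416),
]


def find_nearest_bucket(height, width, buckets=BUCKETS):
    """Return the bucket with the smallest cross-multiplied aspect mismatch."""
    # |h * bw - w * bh| is zero exactly when h / w == bh / bw, so it scores
    # aspect-ratio mismatch without any floating-point division.
    return min(buckets, key=lambda b: abs(height * b[1] - width * b[0]))


if __name__ == "__main__":
    # A 1920x1080 landscape photo (height=1080, width=1920).
    print(find_nearest_bucket(1080, 1920))
```

The input image would then be resized to the returned bucket before encoding, so the sampler always sees one of a small set of known shapes.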
Stage 2: Animation Core
Sampling:
30 steps, CFG=10, UniPC-BH1 sampler
VRAM optimizations: teacache (0.15) + temporal_size=64
Motion Control: Text prompts (e.g., "slow zoom-in")
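
The teacache value (0.15) is most naturally read as a threshold on how much the model input may drift between steps before a full forward pass is run again. The following is a simplified sketch of that caching idea, assuming a plain accumulated relative-L1 test; real TeaCache implementations also rescale the distance before accumulating it, and the names `rel_l1` and `plan_steps` are hypothetical.

```python
def rel_l1(current, previous):
    """Relative L1 change between two equally sized feature vectors."""
    num = sum(abs(c - p) for c, p in zip(current, previous))
    den = sum(abs(p) for p in previous)
    return num / den if den else float("inf")


def plan_steps(features, threshold=0.15):
    """Return one bool per step: True = run the model, False = reuse the cache."""
    decisions = []
    accumulated = 0.0
    previous = None
    last = len(features) - 1
    for i, current in enumerate(features):
        if previous is None or i == last:
            run = True  # first and last steps are always computed
        else:
            accumulated += rel_l1(current, previous)
            run = accumulated >= threshold
        if run:
            accumulated = 0.0  # a fresh forward pass resets the drift budget
        decisions.append(run)
        previous = current
    return decisions


if __name__ == "__main__":
    # Toy one-dimensional "features" for five sampling steps.
    print(plan_steps([[1.0], [1.05], [1.10], [1.30], [1.31]]))
```

A higher threshold skips more steps and runs faster but risks visible artifacts; a lower one stays closer to full sampling.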
Stage 3: Output
Tiled Decoding: 128x128 blocks via VAEDecodeTiled
Video Export: MP4 output (30 FPS, H.264)
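
Tiled decoding keeps peak memory low by decoding one block at a time instead of the whole frame. The sketch below only shows how a frame could be partitioned into 128x128 tiles; it does not reproduce the internals of VAEDecodeTiled, and `tile_boxes` and its `overlap` parameter are hypothetical names introduced for illustration.

```python
def tile_boxes(height, width, tile=128, overlap=0):
    """Return (top, left, bottom, right) boxes that cover a height x width grid."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    stride = tile - overlap
    boxes = []
    for top in range(0, max(height - overlap, 1), stride):
        for left in range(0, max(width - overlap, 1), stride):
            # Clamp the far edges so border tiles never leave the frame.
            boxes.append((top, left, min(top + tile, height), min(left + tile, width)))
    return boxes


if __name__ == "__main__":
    boxes = tile_boxes(512, 512)
    print(len(boxes), boxes[0], boxes[-1])
```

Each box would be decoded independently and pasted into the output frame; a non-zero overlap gives neighbouring tiles shared pixels that can be blended to hide seams.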
5. Inputs & Outputs
Required Inputs:
Static architecture image
Motion prompts (Chinese preferred)
Outputs:
MP4 video (FramePack_00001.mp4)
Resolution: 512x512 to 1024x1024
6. Critical Notes
VRAM Management:
Reduce total_second_length (about 1GB of VRAM per 10s of video)
Enable gpu_memory_preservation (default=6)
Dependencies:
Download FramePackI2V_HY and hunyuan_video_vae_bf16
CLIP models auto-download
Troubleshooting:
CUDA OOM → Lower latent_window_size (default=9)
Choppy video → Ensure temporal_overlap ≥ 8
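
To see how total_second_length and latent_window_size shape the workload, it can help to count sampling sections. The sketch below assumes the arithmetic used in the FramePack reference demo (30 FPS output, with each latent frame expanding to roughly 4 video frames); treat the formula and the helper name `section_plan` as assumptions rather than a description of this workflow's nodes.

```python
def section_plan(total_second_length, latent_window_size=9, fps=30):
    """Estimate (number of sampling sections, approx. video frames per section)."""
    # Assumption: 4x temporal compression between latent and video frames.
    frames_per_section = latent_window_size * 4
    sections = max(round(total_second_length * fps / frames_per_section), 1)
    return sections, frames_per_section


if __name__ == "__main__":
    print(section_plan(60))     # default window
    print(section_plan(60, 6))  # smaller window: more, shorter sections
```

Under these assumptions, a smaller latent_window_size means fewer frames are held in memory per sampling pass, at the cost of more passes for the same clip length.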