Master Depth Control and Style Transfer with this Cutting-Edge Workflow
1. Workflow Overview

This advanced workflow performs style transfer with depth control: a depth map of the source image guides AI image generation, while a reference image supplies the style. It combines:
Depth-guided image generation
AI style transfer
Automatic prompt generation
High-quality output rendering
Key applications include portrait enhancement, artistic style transfer, and controlled image generation.
2. Core Models
Model | Function | Source |
---|---|---|
F.1_Depth-fp16_1.0 | Base UNet for depth-aware generation | Custom |
flux1-redux-dev | Style transfer model | Requires installation |
HyperL-F.1-加速器-PAseer_加速FLUX_AcceleratorV3.1 (LoRA) | Generation accelerator | Requires installation |
depth_anything_v2_vitl.pth | Depth estimation | ControlNet Aux |
unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit | Prompt generation | Joy_caption_two |
3. Key Components
Node | Function | Installation |
---|---|---|
DepthAnythingV2Preprocessor | Depth map generation | ComfyUI-ControlNet-Aux |
StyleModelLoader | Loads style transfer model | Built-in |
Joy_caption_two | AI-powered prompt generation | Custom/GitHub |
InstructPixToPixConditioning | Image-to-image conditioning | Built-in |
FluxGuidance | Enhanced guidance scaling | Built-in |
DF_Image_scale_to_side | Smart image scaling | Derfuu's Modded Nodes |
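
For readers who drive ComfyUI through its API-format JSON, some of the nodes in the table above might be wired roughly as in the sketch below. The node ids, the `LoadImage` entry, the file names `source.png` and `flux1-redux-dev.safetensors`, and the input names (`ckpt_name`, `resolution`, `style_model_name`) are assumptions to verify against the installed node versions. `dangling_links` is a hypothetical helper, not part of ComfyUI.

```python
# Sketch of a ComfyUI API-format graph fragment.
# Node ids, input names, and file names are assumptions, not the author's graph.
workflow = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "source.png"}},
    "2": {
        "class_type": "DepthAnythingV2Preprocessor",
        "inputs": {
            "image": ["1", 0],  # link: output 0 of node "1"
            "ckpt_name": "depth_anything_v2_vitl.pth",
            "resolution": 1216,
        },
    },
    "3": {
        "class_type": "StyleModelLoader",
        "inputs": {"style_model_name": "flux1-redux-dev.safetensors"},
    },
}


def dangling_links(graph):
    """Return (node_id, input_name) pairs whose link points at a missing node."""
    missing = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            # In API format a link is a two-item list: [source_node_id, output_index].
            is_link = (
                isinstance(value, list)
                and len(value) == 2
                and isinstance(value[0], str)
            )
            if is_link and value[0] not in graph:
                missing.append((node_id, name))
    return missing


print(dangling_links(workflow))  # an empty list means every link resolves
```

A check like this can catch a mistyped node id before the graph is queued. It does not validate input names or model files.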
4. Workflow Structure
Input Section:
"照片——放这里" ("Photo: place here"): Source image input (658×1170)
"风格参考图——放这里" ("Style reference image: place here"): Style reference image (452×600)
Prompt Generation:
Uses Llama 3.1 8B Instruct (4-bit, loaded through Joy_caption_two) to analyze images and generate descriptive prompts
Depth Control:
Creates depth maps for controlled generation
Processes at 1216px resolution
Style Transfer:
Applies style model with CLIP vision encoding
Uses flux1-redux-dev style model
Generation:
8 sampling steps with the Euler sampler
1216px output resolution
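
The resolution figures above imply a resize step before depth processing. Here is a minimal sketch of that arithmetic, assuming the scaling node fits the longest side to 1216 px and keeps the aspect ratio. The function name `scale_to_side` and the rounding rule are illustrative, not the DF_Image_scale_to_side implementation.

```python
def scale_to_side(width, height, target, side="longest"):
    """Scale (width, height) so the chosen side equals target, keeping aspect ratio."""
    if side == "longest":
        reference = max(width, height)
    elif side == "shortest":
        reference = min(width, height)
    else:
        raise ValueError("side must be 'longest' or 'shortest'")
    factor = target / reference
    # Round to whole pixels; a real node may snap to multiples of 8 or 16 instead.
    return round(width * factor), round(height * factor)


print(scale_to_side(658, 1170, 1216))  # source image      -> (684, 1216)
print(scale_to_side(452, 600, 1216))   # style reference   -> (916, 1216)
```

Under this assumption, the 658×1170 source is only upscaled by about 4%, so the depth map stays close to the original framing.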
5. Input/Output
Inputs:
Source image (PNG/JPG)
Style reference image (optional)
Prompt (can be AI-generated or manual)
Outputs:
Stylized image with depth control
Intermediate depth maps
Generated prompts
6. Technical Notes
Requires significant VRAM (12 GB or more recommended)
Uses fp8 precision for some models
Depth processing at 1216px may be memory-intensive
The style model is applied in multiply mode (its conditioning is multiplied by the strength value)
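
To see why precision matters for the VRAM recommendation, the weight footprint can be estimated as parameter count times bytes per parameter. The sketch below is only that arithmetic. The parameter count `N_PARAMS` is a hypothetical figure chosen for illustration, and the estimate ignores activations, the text encoders, the VAE, and the depth model.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}


def weight_gib(n_params, precision):
    """Approximate memory for model weights only, in GiB."""
    return n_params * BYTES_PER_PARAM[precision] / 2**30


# Hypothetical parameter count, used only to illustrate the arithmetic.
N_PARAMS = 12_000_000_000

for precision in ("fp16", "fp8"):
    print(f"{precision}: {weight_gib(N_PARAMS, precision):.1f} GiB")
# fp16: 22.4 GiB
# fp8: 11.2 GiB
```

Halving the bytes per parameter halves the weight footprint, which is why loading some models in fp8 can bring a large model within reach of a 12 GB card when combined with offloading.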
7. Installation Requirements
Required custom nodes:
ComfyUI-ControlNet-Aux (for depth processor)
Derfuu's Modded Nodes (for smart scaling)
Joy_caption_two (for prompt generation)
Model downloads needed:
depth_anything_v2_vitl.pth
flux1-redux-dev style model
HyperL-F.1 LoRA
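
Before loading the workflow, it can help to confirm that the downloads listed above are present. The sketch below searches a ComfyUI folder recursively by file name, so it does not assume any particular subfolder layout. The trailing `*` in two patterns is an assumption that covers extensions and version suffixes the list does not spell out, and `missing_models` is a hypothetical helper.

```python
from pathlib import Path

# Patterns derived from the download list above.
REQUIRED_PATTERNS = [
    "depth_anything_v2_vitl.pth",
    "flux1-redux-dev*",
    "HyperL-F.1*",
]


def missing_models(comfy_root, patterns=REQUIRED_PATTERNS):
    """Return the patterns that match no file anywhere under comfy_root."""
    root = Path(comfy_root)
    return [p for p in patterns if not any(root.rglob(p))]
```

Calling `missing_models("path/to/ComfyUI")` with your own installation path returns an empty list when all three downloads are found. A match only shows that a file exists, not that it sits in the folder a given loader node reads from.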