Face Swapping Evolved: Mastering the Art of 3D Avatar Generation with PulID Flux
1. Workflow Overview

This is a PulID Flux-based face swapping workflow with two modes:
Text-to-Image: Generate 3D avatars from text descriptions
Image-to-Image: Convert real faces to 3D cartoon style
Core Pipeline: Face detection → Style transfer → Image composition
2. Core Models
Model Name | Function | Installation |
---|---|---|
PulID Flux v0.9.0 | High-precision face fusion | Manual .safetensors download |
Cutie 3D Model_v1 | 3D cartoon style generator | Via LoraLoader |
Meta-Llama-3.1-8B | Image captioning (JoyCaption) | Requires the 4-bit quantized version |
3. Key Components
3.1 Custom Nodes
- `PulID Flux Nodes`: Install from GitHub repo
- `JoyCaption`: Requires an LLM (~4GB VRAM)
- `FluxGuidance`: Dynamic CFG control module
3.2 Critical Nodes
ApplyPulidFlux (ID:1): Core face-swapping node
SamplerCustomAdvanced (ID:35/16): Enhanced sampler with noise control
ConcatTextOfUtils (ID:50): Dynamic prompt concatenation
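To illustrate what dynamic prompt concatenation amounts to, here is a minimal Python sketch. It is not the ConcatTextOfUtils node's actual implementation; the function name `concat_prompt` and the comma delimiter are assumptions made for illustration only.

```python
def concat_prompt(*parts: str, delimiter: str = ", ") -> str:
    """Join prompt fragments, skipping empty ones and stray commas.

    Hypothetical helper: it only mimics the idea of merging a caption
    with fixed style tags into a single prompt string.
    """
    cleaned = []
    for part in parts:
        if not part:
            continue
        # Trim whitespace and leading/trailing commas from each fragment.
        fragment = part.strip().strip(",").strip()
        if fragment:
            cleaned.append(fragment)
    return delimiter.join(cleaned)


if __name__ == "__main__":
    style_tags = "3D, cartoon"
    caption = "school uniform,"
    print(concat_prompt(style_tags, "", caption))
```

Filtering empty fragments keeps the final prompt free of doubled delimiters when an optional input (such as a caption) is missing.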
4. Workflow Structure
Group 1: Text-to-Image
Input: Text prompt + random seed
Process:
CLIP encoding → 3D generation → Face fusion
Output: 1024x768 3D avatar
Group 2: Image-to-Image
Input: Photo + optional mask
Process:
Face detection → Style transfer
Output: 3D-style converted image
5. I/O Specifications
Input Parameters:
{
  "text_prompt": "3D,cartoon,school uniform",
  "seed": -1,
  "reference_image": "optional.jpg"
}
text_prompt is required; seed is the random seed; reference_image is optional and only used in image-to-image mode.
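The input parameters above could be checked before a run with something like the following sketch. It rests on assumptions: that a negative seed means "pick a random seed", that the presence of `reference_image` selects image-to-image mode, and that a 32-bit seed range is acceptable. The function name `resolve_request` is hypothetical and not part of the workflow.

```python
import random
from typing import Optional


def resolve_request(params: dict, rng: Optional[random.Random] = None) -> dict:
    """Validate the input parameters and pick a workflow mode (illustrative)."""
    prompt = str(params.get("text_prompt", "")).strip()
    if not prompt:
        raise ValueError("text_prompt is required")

    seed = int(params.get("seed", -1))
    if seed < 0:
        # Assumption: a negative seed asks for a freshly drawn random seed.
        rng = rng or random.Random()
        seed = rng.randrange(2**32)

    reference = params.get("reference_image")
    # Assumption: supplying a reference image selects the image-to-image group.
    mode = "image-to-image" if reference else "text-to-image"
    return {
        "text_prompt": prompt,
        "seed": seed,
        "reference_image": reference,
        "mode": mode,
    }


if __name__ == "__main__":
    print(resolve_request({"text_prompt": "3D,cartoon,school uniform", "seed": -1}))
```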
Output: PNG with metadata
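PNG metadata of this kind is usually stored in text chunks inside the file. The sketch below, using only the Python standard library, builds a tiny 1x1 PNG carrying a `tEXt` chunk and reads it back. The key name `prompt` and the JSON payload are illustrative assumptions; the exact keys this workflow writes are not stated here.

```python
import json
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, payload: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC32 of type+data."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def make_png_with_text(metadata: dict) -> bytes:
    """Build a 1x1 grayscale PNG with one tEXt chunk per metadata entry."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    parts = [PNG_SIGNATURE, _chunk(b"IHDR", ihdr)]
    for key, value in metadata.items():
        body = key.encode("latin-1") + b"\x00" + value.encode("latin-1")
        parts.append(_chunk(b"tEXt", body))
    # One scanline: filter byte 0 followed by a single black pixel.
    parts.append(_chunk(b"IDAT", zlib.compress(b"\x00\x00")))
    parts.append(_chunk(b"IEND", b""))
    return b"".join(parts)


def read_png_text(data: bytes) -> dict:
    """Return all tEXt key/value pairs found in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, kind = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        if kind == b"tEXt":
            key, _, text = payload.partition(b"\x00")
            found[key.decode("latin-1")] = text.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + payload + 4 CRC
    return found


if __name__ == "__main__":
    png = make_png_with_text({"prompt": json.dumps({"seed": 42})})
    print(read_png_text(png))
```

Reading the chunk back from a saved image is a quick way to recover the settings that produced it.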
6. Important Notes
Hardware Requirements:
Minimum 8GB VRAM (12GB when the Llama-based JoyCaption captioner is used)
CUDA acceleration required
Troubleshooting:
Face detection failure → Check InsightFace model
Style mismatch → Adjust LoRA strength (0.7-1.2 recommended)
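When tuning the LoRA strength, it can help to try evenly spaced values across the recommended 0.7-1.2 range and compare results. The following is a small sketch of generating such a sweep; the function name `lora_strength_sweep` and the default of six steps are assumptions, and the values would still need to be entered into the LoRA loader by hand or by your own tooling.

```python
def lora_strength_sweep(low: float = 0.7, high: float = 1.2, steps: int = 6) -> list:
    """Return evenly spaced LoRA strengths from low to high, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if low > high:
        raise ValueError("low must not exceed high")
    step = (high - low) / (steps - 1)
    # Round to three decimals to avoid floating-point noise in the values.
    return [round(low + i * step, 3) for i in range(steps)]


if __name__ == "__main__":
    for strength in lora_strength_sweep():
        print(f"LoRA strength: {strength}")
```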
Optimization Tips:
Use `attn_mask` for precise face replacement
Disable live preview for batch processing
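For the `attn_mask` tip above, a mask is conceptually a grid that is 1.0 over the face region and 0.0 elsewhere. The sketch below builds such a grid from a face bounding box using plain Python lists. This is an assumption-laden illustration: in practice the mask would be an image or tensor produced by a mask node, and the helper `box_mask` and its box format (left, top, right, bottom, with right and bottom exclusive) are hypothetical.

```python
def box_mask(width: int, height: int, box: tuple) -> list:
    """Return a height x width grid: 1.0 inside the box, 0.0 outside."""
    left, top, right, bottom = box
    return [
        [1.0 if left <= x < right and top <= y < bottom else 0.0
         for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    # A 6x4 canvas with a face box covering columns 2-3 and rows 1-2.
    for row in box_mask(6, 4, (2, 1, 4, 3)):
        print(row)
```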