Face Swapping Evolved: Mastering the Art of 3D Avatar Generation with PulID Flux

ComfyUI.org
2025-04-07 15:23:46

1. Workflow Overview

This is a PulID Flux-based face swapping workflow with two modes:

  1. Text-to-Image: Generate 3D avatars from text descriptions

  2. Image-to-Image: Convert real faces to 3D cartoon style

Core Pipeline: Face detection → Style transfer → Image composition

2. Core Models

| Model Name | Function | Installation |
| --- | --- | --- |
| PulID Flux v0.9.0 | High-precision face fusion | Manual .safetensors download |
| Cutie 3D Model_v1 | 3D cartoon style generator | Via LoraLoader |
| Meta-Llama-3.1-8B | Image captioning (JoyCaption) | Requires the 4-bit quantized version |

3. Key Components

3.1 Custom Nodes

- `PulID Flux Nodes`: Install from GitHub repo
- `JoyCaption`: Requires an LLM (~4GB VRAM)
- `FluxGuidance`: Dynamic CFG control module

3.2 Critical Nodes

  • ApplyPulidFlux (ID:1): Core face-swapping node

  • SamplerCustomAdvanced (ID:35/16): Enhanced sampler with noise control

  • ConcatTextOfUtils (ID:50): Dynamic prompt concatenation
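
The prompt concatenation step can be illustrated outside ComfyUI. The following is a minimal sketch of the idea only, not the ConcatTextOfUtils node's actual implementation: `concat_prompt` and the sample caption are hypothetical names and values, assuming the node joins fixed style tags with a caption such as the one JoyCaption produces.

```python
def concat_prompt(*parts: str, sep: str = ", ") -> str:
    """Join non-empty prompt fragments into one prompt string.

    Hypothetical helper for illustration; not part of any ComfyUI node pack.
    """
    cleaned = []
    for part in parts:
        if not part:
            continue
        # Trim whitespace and stray leading/trailing commas from each fragment.
        fragment = part.strip().strip(",").strip()
        if fragment:
            cleaned.append(fragment)
    return sep.join(cleaned)


# Fixed style tags (from the workflow's example prompt) plus an assumed caption.
style_tags = "3D,cartoon,school uniform"
caption = "a smiling girl with short black hair "
print(concat_prompt(style_tags, caption))
```

Skipping empty fragments means the same helper works in Text-to-Image mode, where no caption is available.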

4. Workflow Structure

Group 1: Text-to-Image

  • Input: Text prompt + random seed

  • Process: CLIP encoding → 3D generation → Face fusion

  • Output: 1024x768 3D avatar

Group 2: Image-to-Image

  • Input: Photo + optional mask

  • Process: Face detection → Style transfer

  • Output: 3D-style converted image

5. I/O Specifications

Input Parameters:

{
  "text_prompt": "3D,cartoon,school uniform",
  "seed": -1,
  "reference_image": "optional.jpg"
}

  • text_prompt: required

  • seed: random seed (shown here as -1)

  • reference_image: optional, used only in Image-to-Image mode

Output: PNG with metadata
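
As a rough illustration of how these parameters might be prepared before they are fed into the workflow, here is a small sketch. It rests on assumptions not stated in this guide: that a negative seed means "draw a random one", that the seed fits in 32 bits, and that the presence of `reference_image` selects the mode. `normalize_params` and `MAX_SEED` are hypothetical names.

```python
import random
from typing import Any

MAX_SEED = 2**32 - 1  # assumed upper bound, for illustration only


def normalize_params(params: dict[str, Any]) -> dict[str, Any]:
    """Validate the input parameters and pick a workflow mode (hypothetical helper)."""
    prompt = str(params.get("text_prompt", "")).strip()
    if not prompt:
        raise ValueError("text_prompt is required")

    # Assumption: a negative seed (such as -1) requests a fresh random seed.
    seed = int(params.get("seed", -1))
    if seed < 0:
        seed = random.randint(0, MAX_SEED)

    # Assumption: supplying a reference image switches to Image-to-Image mode.
    reference = params.get("reference_image") or None
    mode = "image-to-image" if reference else "text-to-image"

    return {
        "text_prompt": prompt,
        "seed": seed,
        "reference_image": reference,
        "mode": mode,
    }


print(normalize_params({"text_prompt": "3D,cartoon,school uniform", "seed": -1}))
```

Resolving the seed up front makes a run reproducible: the concrete value can be logged and reused later.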

6. Important Notes

  1. Hardware Requirements:

    • Minimum 8GB VRAM (12GB when the Llama-based JoyCaption captioner is used)

    • CUDA acceleration required

  2. Troubleshooting:

    • Face detection failure → Check InsightFace model

    • Style mismatch → Adjust LoRA strength (0.7-1.2 recommended)

  3. Optimization Tips:

    • Use attn_mask for precise face replacement

    • Disable live preview for batch processing
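
The hardware and LoRA strength guidance above can be turned into a simple pre-flight check. This is a sketch only: `preflight` is a hypothetical helper, the thresholds are the figures quoted in this section, and the available VRAM is passed in by hand rather than detected.

```python
def preflight(vram_gb: float, use_joycaption: bool, lora_strength: float) -> list[str]:
    """Return warnings for settings outside the ranges recommended above."""
    warnings = []

    # 8 GB minimum, 12 GB when the Llama-based captioner is loaded as well.
    required_gb = 12 if use_joycaption else 8
    if vram_gb < required_gb:
        warnings.append(
            f"VRAM {vram_gb:g} GB is below the {required_gb} GB minimum"
        )

    # Recommended LoRA strength range from the troubleshooting notes.
    if not 0.7 <= lora_strength <= 1.2:
        warnings.append(
            f"LoRA strength {lora_strength:g} is outside the recommended 0.7-1.2 range"
        )

    return warnings


for message in preflight(vram_gb=8, use_joycaption=True, lora_strength=1.5):
    print("warning:", message)
```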