Unlock High-Speed Image Generation: Nunchaku Flux Workflow Revealed

ComfyUI.org
2025-04-16 14:58:35

1. Workflow Overview

  • Purpose:
    A high-speed workflow, built on Nunchaku Flux, for re-creating and refining an existing image. It features:

    • Image-to-prompt (via JoyCaption).

    • Fast image generation (using quantized FluxDiT).

    • Image scaling & post-processing (e.g., super-resolution).

    • GPU cleanup (easy cleanGpuUsed).

  • Core Models:

    • Nunchaku FluxDiT: Quantized model (svdq-int4-flux) for speed.

    • CLIP Encoder: t5xxl_fp8_e4m3fn.safetensors + clip_l.safetensors.

    • Upscaler: 4x-UltraSharp.pth.

    • VAE: ae.safetensors.

2. Key Nodes

  • JoyCaption: Uses Meta-Llama-3.1-8B to generate prompts from images.

  • FluxGuidance: Controls conditioning strength (default 3.5).

  • ModelSamplingFlux: Adjusts the model's sampling shift for the target resolution (1024x1024).

  • KSampler: Generates the image in 20 steps with the euler sampler.

  • ImageScaleByAspectRatio: Resizes images with lanczos interpolation.
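
As an illustration of the resizing step, here is a minimal sketch of how target dimensions can be computed while preserving aspect ratio. It is not the ImageScaleByAspectRatio node's actual code: the function name, the "longest side" rule, and the rounding to a multiple of 8 are assumptions made for this example, and the lanczos resampling of the pixels themselves is not shown.

```python
def scale_to_longest_side(width: int, height: int,
                          longest: int = 1024, multiple: int = 8) -> tuple:
    """Return (new_width, new_height) with the aspect ratio preserved.

    Hypothetical helper: the longest side is scaled to `longest`, then
    both sides are snapped to the nearest multiple of `multiple`.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = longest / max(width, height)

    def snap(value: float) -> int:
        # Round to the nearest multiple, but never below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width * scale), snap(height * scale)


if __name__ == "__main__":
    print(scale_to_longest_side(2000, 1000))
```

Snapping to a multiple keeps the dimensions friendly to latent-space models, at the cost of a very small change in aspect ratio.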

3. Workflow Structure

  1. Group 1: Nunchaku Text-to-Image

    • Input: Prompt (auto-generated or manual).

    • Process: CLIP encode → FluxDiT → VAE decode → Save.

    • Output: Final image (e.g., cyberpunk cat).

  2. Group 2: Image Loading & Reverse Prompting

    • Input: Uploaded image (e.g., 2794c263...jpg).

    • Process: Upscale → Caption → Pass to Group 1.

    • Output: Text description (e.g., "a neon cat").
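
The Group 1 chain can be sketched as a ComfyUI API-format graph, where each node has a `class_type` and its `inputs`, and a link is written as `[source_node_id, output_index]`. This is a rough sketch, not the exported workflow: the Nunchaku loader's class name (`NunchakuFluxDiTLoader`) and its `model_path` input are assumptions, as are `cfg`, `scheduler`, the `ModelSamplingFlux` shift values, and the `filename_prefix`. Steps, sampler, guidance, resolution, and file names come from the article.

```python
def build_graph(prompt_text: str, steps: int = 20, guidance: float = 3.5,
                width: int = 1024, height: int = 1024, seed: int = 0) -> dict:
    """Build a text-to-image graph in ComfyUI API format (sketch)."""
    return {
        # Assumed class and input names for the Nunchaku loader.
        "1": {"class_type": "NunchakuFluxDiTLoader",
              "inputs": {"model_path": "svdq-int4-flux"}},
        "2": {"class_type": "DualCLIPLoader",
              "inputs": {"clip_name1": "t5xxl_fp8_e4m3fn.safetensors",
                         "clip_name2": "clip_l.safetensors", "type": "flux"}},
        "3": {"class_type": "VAELoader", "inputs": {"vae_name": "ae.safetensors"}},
        # Shift values are assumed, not taken from the article.
        "4": {"class_type": "ModelSamplingFlux",
              "inputs": {"model": ["1", 0], "max_shift": 1.15, "base_shift": 0.5,
                         "width": width, "height": height}},
        "5": {"class_type": "CLIPTextEncode",
              "inputs": {"text": prompt_text, "clip": ["2", 0]}},
        "6": {"class_type": "FluxGuidance",
              "inputs": {"conditioning": ["5", 0], "guidance": guidance}},
        # Empty negative prompt.
        "7": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "", "clip": ["2", 0]}},
        "8": {"class_type": "EmptyLatentImage",
              "inputs": {"width": width, "height": height, "batch_size": 1}},
        # cfg and scheduler are assumed values.
        "9": {"class_type": "KSampler",
              "inputs": {"model": ["4", 0], "seed": seed, "steps": steps,
                         "cfg": 1.0, "sampler_name": "euler", "scheduler": "simple",
                         "positive": ["6", 0], "negative": ["7", 0],
                         "latent_image": ["8", 0], "denoise": 1.0}},
        "10": {"class_type": "VAEDecode",
               "inputs": {"samples": ["9", 0], "vae": ["3", 0]}},
        "11": {"class_type": "SaveImage",
               "inputs": {"images": ["10", 0], "filename_prefix": "nunchaku_flux"}},
    }


def dangling_links(graph: dict) -> list:
    """Return (node_id, input_name) pairs whose link targets a missing node."""
    missing = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            is_link = (isinstance(value, list) and len(value) == 2
                       and isinstance(value[0], str))
            if is_link and value[0] not in graph:
                missing.append((node_id, name))
    return missing


if __name__ == "__main__":
    graph = build_graph("a neon cat")
    print(len(graph), "nodes;", "dangling links:", dangling_links(graph))
```

In the full workflow, the `prompt_text` argument would be the caption produced by Group 2 rather than a hand-typed string.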

4. Inputs & Outputs

  • Inputs: Image (optional), resolution (1024x1024), steps (20).

  • Outputs: Image saved to ComfyUI/output.
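
To pick up the result after a run, a small helper can look for the most recently written file in the output folder. This is only a sketch: the function name and the `*.png` pattern are assumptions for this example, and the path to `ComfyUI/output` depends on where ComfyUI is installed.

```python
from pathlib import Path
from typing import Optional


def latest_output(output_dir, pattern: str = "*.png") -> Optional[Path]:
    """Return the newest file matching `pattern`, or None if there is none."""
    files = [p for p in Path(output_dir).glob(pattern) if p.is_file()]
    # Compare by modification time; default=None covers an empty folder.
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    # Path is an assumption; point it at your own install.
    print(latest_output("ComfyUI/output"))
```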

5. Notes

  • Dependencies:

  • Errors:

    • Missing model files: Check paths for t5xxl_fp8_e4m3fn.safetensors.

  • Optimization:

    • Use svdq-int4 for lower VRAM usage.
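
For the missing-file error, a quick pre-flight check can confirm that the model files named in this article are where the loaders expect them. This is a minimal sketch: the subfolder layout (`text_encoders`, `vae`, `upscale_models`) is an assumption about a typical ComfyUI install and may differ on yours (older installs keep text encoders under `clip`, for example). The Nunchaku model is left out because its on-disk location is not stated here.

```python
from pathlib import Path

# Assumed layout under ComfyUI/models; adjust the folder names to your install.
REQUIRED_FILES = {
    "text_encoders": ["t5xxl_fp8_e4m3fn.safetensors", "clip_l.safetensors"],
    "vae": ["ae.safetensors"],
    "upscale_models": ["4x-UltraSharp.pth"],
}


def find_missing_models(models_dir) -> list:
    """Return relative paths of required files not found under models_dir."""
    root = Path(models_dir)
    missing = []
    for folder, names in REQUIRED_FILES.items():
        for name in names:
            if not (root / folder / name).is_file():
                missing.append(f"{folder}/{name}")
    return missing


if __name__ == "__main__":
    for path in find_missing_models("ComfyUI/models"):
        print("missing:", path)
```

An empty result means every listed file was found. Anything printed is a path to check before loading the workflow.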

