Unlock Seamless Image Expansion with Flux Diffusion and Janus AI

ComfyUI.org
2025-04-09 04:54:51

1. Workflow Overview


This workflow outpaints images with Flux Diffusion and uses Janus image understanding to caption the input, so the generated prompt matches the original content. It supports expansion in any direction (top, bottom, left, right) while keeping the new regions consistent with the original style.

2. Core Models

Model Name | Description
F.1-Fill-fp16_Inpaint&Outpaint | Diffusion model (fp16) for inpainting/outpainting at high resolution.
deepseek-ai/Janus-Pro-1B | Multimodal model for image captioning (auto-prompt generation).
ae.sft | VAE (the Flux autoencoder) that encodes the input image to latents and decodes the result.
Flux Guidance | Guidance setting applied by the FluxGuidance node (not a separate model file); steers diffusion toward natural, coherent outpainting.

3. Key Nodes & Installation

  • JanusModelLoader

    • Function: Loads Janus-Pro model for image analysis.

    • Install: Add Janus-Nodes through ComfyUI Manager, or clone its GitHub repository into custom_nodes.

  • ImagePadForOutpaint

    • Function: Defines expansion area (in pixels) and generates mask.

    • Install: Built-in node (no installation needed).

  • FluxGuidance

    • Function: Sets the guidance strength (30 in this workflow) to keep the fill coherent and prevent artifacts.

    • Install: Built-in node in current ComfyUI releases; update ComfyUI if it is missing.

  • DifferentialDiffusion

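    • Function: Lets the mask act as a per-pixel change strength, so the generated region blends gradually into the original instead of ending at a hard seam.

    • Dependency: Built-in node. The workflow's model, F.1-Fill-fp16, must be downloaded and placed in models/unet.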
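
To make the ImagePadForOutpaint step concrete, here is a minimal sketch of what "pad and generate a mask" means. It is an illustration of the idea, not the node's actual source: the function name `pad_for_outpaint`, the `PaddedCanvas` type, and the 816x1024 input size are assumptions made for the example, and feathering is left out.

```python
from dataclasses import dataclass


@dataclass
class PaddedCanvas:
    width: int
    height: int
    mask: list  # rows of floats; 1.0 = generate this pixel, 0.0 = keep original


def pad_for_outpaint(width, height, left=0, top=0, right=0, bottom=0):
    """Return the enlarged canvas size and a mask covering the new area.

    Hypothetical helper for illustration only (no feathering).
    """
    if min(width, height) <= 0 or min(left, top, right, bottom) < 0:
        raise ValueError("sizes must be positive and padding non-negative")
    new_w = width + left + right
    new_h = height + top + bottom
    mask = []
    for y in range(new_h):
        inside_y = top <= y < top + height
        # A pixel is kept (0.0) only if it lies inside the original image.
        row = [
            0.0 if inside_y and left <= x < left + width else 1.0
            for x in range(new_w)
        ]
        mask.append(row)
    return PaddedCanvas(new_w, new_h, mask)


# Assumed 816x1024 input, expanded by 104 px on the left and right.
canvas = pad_for_outpaint(816, 1024, left=104, right=104)
print(canvas.width, canvas.height)  # 1024 1024
```

The mask marks only the added border, which is the area the sampler is asked to fill; the original pixels stay untouched.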

4. Workflow Structure

Group Name | Description
Upload Image | Loads the input image (PNG/JPG).
Max Resolution | Constrains the output size (default: 1024x1024) to avoid VRAM issues.
Outpaint Area | Sets the expansion in pixels per side (e.g., left=104, right=104) and generates the mask.
Prompt Generation | Janus auto-generates a caption; alternatively, enter an English prompt manually.
Batch Control | Repeats the latent (default=3) so each run produces several candidates to choose from.
Flux Workspace | Core nodes (KSampler, VAE Decode) with default optimized parameters.
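
The size limit in the Max Resolution group can be pictured as an aspect-preserving downscale. The sketch below is an assumption about how such a constraint typically behaves, not the code of the workflow's node: `constrain_size` is a hypothetical name, and rounding each side down to a multiple of 8 is a choice made here for illustration.

```python
def constrain_size(width, height, max_width=1024, max_height=1024, multiple=8):
    """Shrink (width, height) to fit the limits while keeping aspect ratio.

    Never upscales. Hypothetical helper; the real node may round differently.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    # Compare aspect ratios with integers to avoid float rounding errors.
    if width * max_height >= height * max_width:
        # Width is the limiting side.
        new_w = min(width, max_width)
        new_h = height * new_w // width
    else:
        # Height is the limiting side.
        new_h = min(height, max_height)
        new_w = width * new_h // height

    def snap(value):
        # Round down to the nearest multiple, but never below one multiple.
        return max(multiple, value // multiple * multiple)

    return snap(new_w), snap(new_h)


print(constrain_size(2048, 1536))  # (1024, 768)
print(constrain_size(800, 600))    # (800, 600)
```

Images already inside the limit pass through unchanged apart from the rounding, so only oversized inputs pay a quality cost.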

5. Inputs & Outputs

  • Inputs:

    • Image file (e.g., output (2).png).

    • Pixel values for expansion (e.g., left=104).

    • Optional text prompts (auto-generated if empty).

  • Output:

    • Expanded (outpainted) image (PNG) containing the original plus the newly generated regions.
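
The optional-prompt behaviour (use the manual prompt if given, otherwise fall back to an auto-generated caption) can be sketched as follows. This is a minimal illustration: `resolve_prompt` and the `caption_fn` callback are hypothetical names standing in for whatever the workflow uses to call Janus, not part of any real node API.

```python
def resolve_prompt(manual_prompt, caption_fn):
    """Pick the prompt text for sampling.

    manual_prompt: text typed by the user (may be empty or None).
    caption_fn: zero-argument callable returning an auto-generated caption;
                a hypothetical stand-in for the Janus captioning step.
    """
    text = (manual_prompt or "").strip()
    if text:
        return text  # A manual prompt always takes priority.
    try:
        caption = (caption_fn() or "").strip()
    except Exception as exc:
        raise RuntimeError("captioning failed; enter a prompt manually") from exc
    if not caption:
        raise RuntimeError("captioner returned no text; enter a prompt manually")
    return caption


# Stub captioner used in place of a real model call.
print(resolve_prompt("", lambda: "a beach at sunset"))  # a beach at sunset
```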

6. Notes

  • VRAM: ≥12 GB GPU recommended (e.g., the 12 GB RTX 3060; note that the RTX 3060 Ti has only 8 GB).

  • Tips:

    • Limit expansion to ≤300 pixels per step; split large expansions into multiple steps.

    • Avoid single-direction expansion (e.g., only downward) to balance composition.

  • Troubleshooting:

    • If a CUDA out-of-memory (OOM) error occurs, lower the resolution in ConstrainImage or reduce the batch size.

    • Enter a prompt manually if Janus fails to generate a caption.
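
The tip about keeping each expansion at or under 300 pixels can be planned ahead of time. Below is a small sketch, assuming a simple greedy split; `split_expansion` is a hypothetical helper, and each returned value would be used as the pixel setting for one pass through the workflow.

```python
def split_expansion(total_px, max_step=300):
    """Break a large expansion into passes of at most max_step pixels."""
    if total_px < 0 or max_step <= 0:
        raise ValueError("total_px must be >= 0 and max_step must be > 0")
    steps = []
    remaining = total_px
    while remaining > 0:
        step = min(max_step, remaining)
        steps.append(step)
        remaining -= step
    return steps


print(split_expansion(700))  # [300, 300, 100]
```

Feeding the output of each pass back in as the next input keeps every step within the recommended limit.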