workflow

This workflow generates high-quality oil painting-style images with Flux.1 (a diffusion-based text-to-image model from Black Forest Labs) and uses Florence-2 image-to-text captioning to produce the descriptive prompt. It operates in two main phases:

Image Captioning Phase: Generates a detailed text description from an input image (A2.png).
Image Generation Phase: Uses the generated prompt with Flux.1 and multiple LoRA models to create an oil painting-style image.

Core Models

Flux.1:
- Function: Advanced diffusion-based model for high-resolution image generation.
- Source: Download the weights (e.g., flux1-dev.sft) from an official source such as Hugging Face.
VAE (Variational Autoencoder):
- Function: Decodes latent representations into the final pixel image.
- Source: Uses ae.sft, manually downloaded and placed in models/vae.
Florence-2:
- Function: Image-to-text model for generating detailed captions.
- Source: Downloaded via DownloadAndLoadFlorence2Model from microsoft/Florence-2-large.

Component Explanation

UNETLoader (ID: 10):
- Purpose: Loads the Flux.1 diffusion model weights (file 基础算法_F.1, roughly "base algorithm F.1").
- Installation: Default ComfyUI node; the model file must be placed manually.
DualCLIPLoader (ID: 11):
- Purpose: Loads Flux-specific CLIP models (t5xxl_fp8_e4m3fn and clip_l).
- Installation: Built into current ComfyUI releases; update ComfyUI if the node is missing or lacks a Flux option.
CLIPTextEncodeFlux (ID: 74):
- Purpose: Encodes text prompts into conditioning vectors for Flux.
- Installation: Flux-specific node shipped with current ComfyUI releases; update ComfyUI if it is missing.
KSampler (ID: 22):
- Purpose: Runs the sampling (denoising) loop that produces the latent image.
- Installation: Core ComfyUI component, default.
VAEDecode (ID: 20):
- Purpose: Decodes the latent into an image.
- Installation: Default node.
LoraLoaderModelOnly (ID: 31, 76, 77, 78):
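- Purpose: Loads several LoRA models in sequence, each applied on top of the previous one, to give Flux an oil painting style.
- Installation: Default node; LoRA files (e.g., 油画厚涂风格 Oil Painting_FLUX_FLUX_F.1, where 油画厚涂风格 means "thick-impasto oil painting style") from Civitai or Hugging Face.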
DownloadAndLoadFlorence2Model (ID: 81):
- Purpose: Downloads and loads the Florence-2 model.
- Installation: Requires Florence-2 plugin (e.g., ComfyUI-Florence2).
Florence2Run (ID: 80):
- Purpose: Runs Florence-2 to generate image captions.
- Installation: Same plugin as DownloadAndLoadFlorence2Model.
StringFunction|pysssss (ID: 73):
- Purpose: Post-processes the Florence-2 caption by appending additional style descriptions.
- Installation: Requires the pysssss custom node pack (ComfyUI-Custom-Scripts, from GitHub).
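
The caption post-processing step amounts to string concatenation. The following is a minimal sketch of that idea in plain Python, not the node's actual implementation; the function name build_prompt and the example suffix are hypothetical and chosen only for illustration.

```python
def build_prompt(caption: str, style_suffix: str) -> str:
    """Append a style description to a generated caption.

    Hypothetical helper: mirrors the "append" behaviour described for
    StringFunction|pysssss, but is not taken from that node's source.
    """
    # Collapse runs of whitespace and newlines that captioners often emit.
    caption = " ".join(caption.split())
    if not caption:
        return style_suffix
    # Drop trailing punctuation so the joined prompt reads as one phrase.
    return f"{caption.rstrip('.,; ')}, {style_suffix}"


if __name__ == "__main__":
    # The suffix below is an assumed example, not the workflow's real text.
    print(build_prompt("A woman standing in a garden. ", "oil painting, thick brushstrokes"))
```

Keeping the suffix separate from the caption makes it easy to swap styles without re-running the captioning phase.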

Workflow Structure

Captioning Group (Group: 反推, "reverse prompting"):
- Role: Generates descriptive prompts from an input image.
- Nodes: LoadImage and DownloadAndLoadFlorence2Model → Florence2Run → StringFunction|pysssss.
- Inputs: Image A2.png.
- Outputs: Text description (e.g., “A romantic oil painting...”).
Model Loading Group:
- Role: Loads Flux.1 and enhancing LoRA models.
- Nodes: UNETLoader → multiple LoraLoaderModelOnly.
- Inputs: Model and LoRA file paths.
- Outputs: Enhanced model.
Text Encoding Group:
- Role: Encodes the text prompts into conditioning for the sampler.
- Nodes: DualCLIPLoader → CLIPTextEncodeFlux (positive prompt), CLIPTextEncode (negative prompt).
- Inputs: Positive prompt (from the captioning group), negative prompt (the term "NSFW").
- Outputs: Conditioning vectors.
Image Generation Group:
- Role: Generates the final image.
- Nodes: EmptyLatentImage → KSampler → VAEDecode → PreviewImage.
- Inputs: Resolution (904x1600), sampling parameters.
- Outputs: Oil painting-style image.
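
The four groups above form a dependency graph, and a node can only run after everything feeding it has finished. The sketch below models that graph with the standard-library graphlib module and derives one valid execution order. The edges are a simplified reading of the groups described above, not the exported workflow JSON, and VAELoader is an assumed node for loading ae.sft.

```python
from graphlib import TopologicalSorter

# Each key is a node; its value is the set of nodes it depends on.
# Illustrative simplification: the four LoRA loaders are collapsed into one
# entry, and VAELoader is an assumption not listed in the groups above.
DEPENDENCIES = {
    "Florence2Run": {"LoadImage", "DownloadAndLoadFlorence2Model"},
    "StringFunction|pysssss": {"Florence2Run"},
    "LoraLoaderModelOnly": {"UNETLoader"},
    "CLIPTextEncodeFlux": {"DualCLIPLoader", "StringFunction|pysssss"},
    "CLIPTextEncode": {"DualCLIPLoader"},
    "KSampler": {
        "LoraLoaderModelOnly",
        "CLIPTextEncodeFlux",
        "CLIPTextEncode",
        "EmptyLatentImage",
    },
    "VAEDecode": {"KSampler", "VAELoader"},
    "PreviewImage": {"VAEDecode"},
}


def execution_order(dependencies):
    """Return one valid order in which the nodes could be evaluated."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, node in enumerate(execution_order(DEPENDENCIES), start=1):
        print(step, node)
```

The order makes the coupling between the two phases visible: the positive prompt encoder cannot run until captioning has finished, while model and LoRA loading are independent of it.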

Inputs and Outputs

Input Parameters:
- Image: A2.png (placed in input folder).
- Resolution: 904x1600 (set by EmptyLatentImage).
- Seed: Randomized (set in KSampler).
- Prompt: Auto-generated by Florence-2, then extended by StringFunction|pysssss.
Output Results:
- PNG image in oil painting style, sized 904x1600.
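
As a quick sanity check on the resolution, the sketch below computes the latent grid that corresponds to a given pixel size. It assumes the usual 8x spatial downscale between pixels and latents; the helper name latent_size is hypothetical.

```python
def latent_size(width: int, height: int, downscale: int = 8) -> tuple[int, int]:
    """Return the (width, height) of the latent grid for a pixel resolution.

    Assumes an 8x spatial downscale between image and latent space.
    """
    if width % downscale or height % downscale:
        raise ValueError(
            f"{width}x{height} is not divisible by {downscale}; "
            "pick a resolution that is a multiple of the downscale factor"
        )
    return width // downscale, height // downscale


if __name__ == "__main__":
    print(latent_size(904, 1600))  # (113, 200)
```

Both 904 and 1600 divide evenly by 8, so the workflow's resolution maps to a 113x200 latent grid without rounding.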

Notes and Considerations

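Model Files: Ensure the Flux.1 model (基础算法_F.1), the VAE (ae.sft), and the LoRA files are in the matching subfolders of the models folder (e.g., ae.sft in models/vae).
Plugin Installation: Keep ComfyUI up to date for Flux support and install the Florence-2 and pysssss custom nodes; missing plugins cause "missing node" errors when the workflow loads.
Performance: Flux.1 and Florence-2 are GPU-intensive; 12 GB or more of VRAM is recommended.
Troubleshooting: If a “model not found” error occurs, check the file paths; if a node fails to load, check that the plugin version is compatible with your ComfyUI version.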
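
Checking file paths by hand is error-prone, so a small script can do it before the workflow is loaded. The sketch below is an assumption-laden example: only models/vae and the file names ae.sft and flux1-dev.sft come from the text above, while the models/unet subfolder and the helper name missing_model_files are assumptions to adapt to your installation.

```python
from pathlib import Path

# Subfolder -> file names expected inside <ComfyUI root>/models/<subfolder>.
# "vae" and the file names come from this guide; "unet" is an assumed
# location for the Flux.1 weights and may differ on your install.
EXPECTED_FILES = {
    "vae": ["ae.sft"],
    "unet": ["flux1-dev.sft"],
}


def missing_model_files(comfy_root, expected=EXPECTED_FILES):
    """Return the paths of expected model files that do not exist."""
    models_dir = Path(comfy_root) / "models"
    missing = []
    for subfolder, names in expected.items():
        for name in names:
            path = models_dir / subfolder / name
            if not path.is_file():
                missing.append(path)
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a placeholder for the real installation directory.
    for path in missing_model_files("ComfyUI"):
        print("missing:", path)
```

Add the LoRA file names to the dictionary under whichever subfolder your installation uses for LoRAs.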
