Unlock Stunning Images: A Step-by-Step Guide to Flux.1-Based Text-to-Image Generation
Workflow Overview

This workflow generates high-quality, high-resolution images from text prompts with the Flux.1 model. It combines Flux.1-dev, a style LoRA, and a prompt translation step for multilingual use, and produces images in a specific style (e.g., Japanese temple architecture). The final output is a 1024x1280 image, suitable for artistic or professional photography-style applications.
Core Models
Flux.1-dev (flux1-dev.sft)
Function: An efficient T2I model excelling in detailed, realistic image generation.
Source: Download from Flux official channels (e.g., Hugging Face), place in ComfyUI/models/unet/.
LoRA: flux-lora-建筑3D立体剪纸-03.safetensors
Function: A LoRA adapter that adds a 3D papercut architectural style to Flux.1 (the Chinese part of the filename means "architecture 3D papercut").
Source: Obtain from communities (e.g., Civitai) or custom training, place in ComfyUI/models/loras/.
T5-XXL (t5xxl_fp16.safetensors)
Function: A robust text encoder converting complex prompts into embeddings.
Source: Download from ComfyUI official or Hugging Face, place in ComfyUI/models/text_encoders/.
CLIP-L (clip_l.safetensors)
Function: A lightweight CLIP model, paired with T5-XXL for prompt encoding.
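Source: Download from ComfyUI official or Hugging Face, place in ComfyUI/models/clip/ (recent ComfyUI versions also read text encoders from ComfyUI/models/text_encoders/, the folder used for T5-XXL above).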
VAE (ae.safetensors)
Function: Variational Autoencoder decoding latents into images.
Source: Download from Flux official channels, place in ComfyUI/models/vae/.
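A quick way to confirm the files above are in place is a small path check. This is a minimal sketch: the relative paths are the ones listed in this section, while the function name find_missing_models and the "ComfyUI" root directory are placeholders chosen for illustration.

```python
from pathlib import Path

# Relative locations as listed in this guide; adjust if your install differs.
EXPECTED_MODELS = [
    "models/unet/flux1-dev.sft",
    "models/loras/flux-lora-建筑3D立体剪纸-03.safetensors",
    "models/text_encoders/t5xxl_fp16.safetensors",
    "models/clip/clip_l.safetensors",
    "models/vae/ae.safetensors",
]


def find_missing_models(comfy_root):
    """Return the expected model paths that do not exist under comfy_root."""
    root = Path(comfy_root)
    return [rel for rel in EXPECTED_MODELS if not (root / rel).is_file()]


if __name__ == "__main__":
    # "ComfyUI" is a placeholder for the actual install directory.
    for rel in find_missing_models("ComfyUI"):
        print(f"missing: {rel}")
```

Running it before opening the workflow makes a misplaced file show up as a one-line message instead of a loader error inside the graph.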
Component Explanation
DualCLIPLoader
Purpose: Loads T5-XXL and CLIP-L text encoders.
Function: Prepares dual CLIP encoding for Flux, enhancing prompt comprehension.
Installation: Built into ComfyUI.
Dependencies: Requires t5xxl_fp16.safetensors and clip_l.safetensors.
CLIPTextEncode
Purpose: Encodes text prompts into conditioning inputs.
Function: Takes the loaded text encoders and the prompt text, and outputs the conditioning used for generation.
Installation: Built into ComfyUI.
EmptyLatentImage
Purpose: Creates an empty latent as the starting point for image generation.
Function: Sets output resolution to 1024x1280.
Installation: Built into ComfyUI.
KSamplerAdvanced
Purpose: Performs sampling for the Flux model.
Function: Denoises the latent with the DPM++ 2M sampler over 30 steps.
Installation: Built into ComfyUI.
LoraLoaderModelOnly
Purpose: Loads and applies the LoRA model to Flux.1.
Function: Integrates LoRA at 0.8 strength for style enhancement.
Installation: Built into ComfyUI.
Dependencies: Requires flux-lora-建筑3D立体剪纸-03.safetensors.
VAEDecode
Purpose: Decodes latents into the final image.
Function: Outputs image data using VAE.
Installation: Built into ComfyUI.
FluxGuidance
Purpose: Adjusts positive conditioning strength.
Function: Sets guidance to 3.5, controlling prompt adherence.
Installation: Built into ComfyUI (ships with its Flux support; update ComfyUI if the node is missing).
ConditioningZeroOut
Purpose: Creates an empty negative condition.
Function: Ensures generation relies solely on positive prompts.
Installation: Built into ComfyUI.
UNETLoader
Purpose: Loads the Flux.1 diffusion model (the node keeps the "UNET" name, although Flux.1 is a transformer rather than a UNet).
Function: Provides the core generation network.
Installation: Built into ComfyUI.
Dependencies: Requires flux1-dev.sft.
VAELoader
Purpose: Loads the VAE model.
Function: Supports image decoding.
Installation: Built into ComfyUI.
Dependencies: Requires ae.safetensors.
SaveImage
Purpose: Saves the generated image.
Function: Saves as PNG with “ComfyUI” prefix.
Installation: Built into ComfyUI.
DeepTranslatorTextNode
Purpose: Translates input prompts.
Function: Translates the English prompt (e.g., “Product on a white sink…”) to Chinese; other language pairs are also supported.
Installation: Requires ComfyUI_Custom_Nodes_AlekPet, install via ComfyUI Manager (search “AlekPet”) or GitHub (https://github.com/AlekPet/ComfyUI_Custom_Nodes_AlekPet).
ShowText|pysssss
Purpose: Displays translated text.
Function: Useful for debugging or verifying translations.
Installation: Requires ComfyUI-Custom-Scripts, install via ComfyUI Manager (search “Custom-Scripts”) or GitHub (https://github.com/pysssss/ComfyUI-Custom-Scripts).
CR Prompt Text
Purpose: Provides initial text prompt input.
Function: Outputs user-defined prompts, supports multiline text.
Installation: Requires ComfyUI_Comfyroll_CustomNodes, install via ComfyUI Manager (search “Comfyroll”) or GitHub (https://github.com/RockOfFire/ComfyUI_Comfyroll_CustomNodes).
Workflow Structure
Prompt Input and Translation Group
Nodes: CR Prompt Text → DeepTranslatorTextNode → ShowText|pysssss
Role: Inputs English prompts and translates them to Chinese for debugging or reference.
Input Parameters: English prompt (e.g., “A highly detailed, red-toned digital illustration…”).
Output: Translated Chinese prompt (e.g., “一个高度详细的红色数字插图…”).
Model Loading and Encoding Group
Nodes: UNETLoader → LoraLoaderModelOnly (model branch); DualCLIPLoader → CLIPTextEncode (text branch)
Role: Loads Flux.1 and applies the LoRA to it; separately loads the T5-XXL and CLIP-L encoders and encodes the prompt into conditioning.
Input Parameters: Prompt, model paths, LoRA strength (0.8).
Output: Encoded positive conditioning.
Conditioning Adjustment Group
Nodes: FluxGuidance → ConditioningZeroOut
Role: Adjusts positive conditioning strength (3.5) and creates an empty negative condition.
Input Parameters: Encoded conditioning.
Output: Adjusted positive and negative conditioning.
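The two operations in this group can be illustrated with plain Python lists. This is a simplified sketch, not ComfyUI's code: real conditioning holds tensors rather than lists of floats, the helper names set_guidance and zero_out are hypothetical, and taking the negative from the unmodified encoded prompt is an assumption about the wiring.

```python
def set_guidance(conditioning, guidance):
    """Return a copy of the conditioning with a 'guidance' value on each entry."""
    return [[embedding, {**options, "guidance": guidance}]
            for embedding, options in conditioning]


def zero_out(conditioning):
    """Return a copy whose embeddings are all zeros (an 'empty' condition)."""
    return [[[0.0] * len(embedding), dict(options)]
            for embedding, options in conditioning]


# Toy stand-in for an encoded prompt: one entry with a 4-value embedding.
encoded = [[[0.12, -0.40, 0.88, 0.05], {}]]

positive = set_guidance(encoded, 3.5)
negative = zero_out(encoded)

print(positive[0][1])  # {'guidance': 3.5}
print(negative[0][0])  # [0.0, 0.0, 0.0, 0.0]
```

The positive branch keeps the prompt embedding and only gains a guidance value, while the negative branch carries no prompt information at all, which is why generation depends on the positive prompt alone.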
Image Generation Group
Nodes: EmptyLatentImage → KSamplerAdvanced → VAEDecode
Role: Generates a 1024x1280 latent, samples it, and decodes it into an image.
Input Parameters: Resolution (1024x1280), steps (30), guidance (3.5).
Output: High-quality image.
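The sampler works on a latent that is smaller than the final image. The sketch below estimates its spatial size for the 1024x1280 output, assuming the VAE reduces each dimension by a factor of 8 and that 1024 is the width; both are assumptions not stated in this guide, and latent_size is a hypothetical helper.

```python
VAE_DOWNSCALE = 8  # assumption: each spatial dimension shrinks by 8x in latent space


def latent_size(width, height, factor=VAE_DOWNSCALE):
    """Return the (width, height) of the latent for a given image size."""
    if width % factor or height % factor:
        raise ValueError(f"width and height must be multiples of {factor}")
    return width // factor, height // factor


print(latent_size(1024, 1280))  # (128, 160)
```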
Output Group
Nodes: SaveImage
Role: Saves the generated image.
Input Parameters: Generated image data.
Output: PNG image file.
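The groups above can also be written out in the style of ComfyUI's API (JSON) workflow format, shown here as a Python dict. Treat this as a sketch to compare against your own API-format export: the input names, the node IDs, cfg of 1.0, the "simple" scheduler, the "default" weight dtype, and feeding ConditioningZeroOut from CLIPTextEncode are assumptions not stated in this guide. The translation nodes are left out, and the prompt is truncated exactly as it appears in the guide.

```python
# Truncated in the guide; replace with the full prompt text.
PROMPT = "A highly detailed, red-toned digital illustration…"

# Links are written as [source_node_id, output_index].
workflow = {
    "1": {"class_type": "UNETLoader",
          "inputs": {"unet_name": "flux1-dev.sft", "weight_dtype": "default"}},
    "2": {"class_type": "LoraLoaderModelOnly",
          "inputs": {"model": ["1", 0],
                     "lora_name": "flux-lora-建筑3D立体剪纸-03.safetensors",
                     "strength_model": 0.8}},
    "3": {"class_type": "DualCLIPLoader",
          "inputs": {"clip_name1": "t5xxl_fp16.safetensors",
                     "clip_name2": "clip_l.safetensors",
                     "type": "flux"}},
    "4": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["3", 0], "text": PROMPT}},
    "5": {"class_type": "FluxGuidance",
          "inputs": {"conditioning": ["4", 0], "guidance": 3.5}},
    "6": {"class_type": "ConditioningZeroOut",
          "inputs": {"conditioning": ["4", 0]}},
    "7": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1280, "batch_size": 1}},
    "8": {"class_type": "KSamplerAdvanced",
          "inputs": {"model": ["2", 0], "positive": ["5", 0],
                     "negative": ["6", 0], "latent_image": ["7", 0],
                     "add_noise": "enable", "noise_seed": 349017919967907,
                     "steps": 30, "cfg": 1.0,  # cfg value is an assumption
                     "sampler_name": "dpmpp_2m",
                     "scheduler": "simple",  # scheduler is an assumption
                     "start_at_step": 0, "end_at_step": 10000,
                     "return_with_leftover_noise": "disable"}},
    "9": {"class_type": "VAELoader", "inputs": {"vae_name": "ae.safetensors"}},
    "10": {"class_type": "VAEDecode",
           "inputs": {"samples": ["8", 0], "vae": ["9", 0]}},
    "11": {"class_type": "SaveImage",
           "inputs": {"images": ["10", 0], "filename_prefix": "ComfyUI"}},
}


def dangling_links(graph):
    """List (node_id, input_name) pairs whose link targets a node not in graph."""
    problems = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            is_link = isinstance(value, list) and len(value) == 2
            if is_link and value[0] not in graph:
                problems.append((node_id, name))
    return problems


print(dangling_links(workflow))  # []
```

Writing the graph this way makes the two branches explicit: the model path (nodes 1 and 2) and the text path (nodes 3 to 6) only meet at the sampler.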
Inputs and Outputs
Expected Inputs:
Text prompt: Multiline English description (e.g., “A highly detailed, red-toned digital illustration…”).
Resolution: 1024x1280.
Seed: 349017919967907 in the saved workflow; randomize it to get a different image from the same prompt.
Sampling steps: 30.
Guidance: 3.5.
LoRA strength: 0.8.
Final Output:
1024x1280 high-quality image, saved as PNG (prefix “ComfyUI”).
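For scripted runs it can help to keep the inputs listed above in one place. The sketch below is one possible way to do that; the GenerationSettings class and with_random_seed helper are hypothetical names, and the 64-bit seed range is an assumption.

```python
import random
from dataclasses import dataclass, replace

MAX_SEED = 0xFFFFFFFFFFFFFFFF  # assumption: seeds are unsigned 64-bit integers


@dataclass(frozen=True)
class GenerationSettings:
    """Default values are the ones listed in this guide."""
    width: int = 1024
    height: int = 1280
    seed: int = 349017919967907
    steps: int = 30
    guidance: float = 3.5
    lora_strength: float = 0.8


def with_random_seed(settings, rng=random):
    """Return a copy of settings with a fresh seed; other fields are unchanged."""
    return replace(settings, seed=rng.randint(0, MAX_SEED))


base = GenerationSettings()
variation = with_random_seed(base)

print(base.seed)  # 349017919967907
print(variation.steps, variation.guidance)  # 30 3.5
```

Keeping the seed fixed reproduces the same image for the same prompt and settings; swapping only the seed gives a variation without touching anything else.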
Notes and Tips
Resource Requirements: Flux.1 requires significant VRAM (12GB+ recommended); use fp8 versions if VRAM is limited.
Model Files: Ensure all files (flux1-dev.sft, t5xxl_fp16.safetensors, etc.) are in the folders listed above; otherwise the loader nodes cannot find them and the workflow will fail.
Performance Optimization: Reduce steps (e.g., from 30 to 20) if generation is slow.
Plugin Installation: Install AlekPet, Custom-Scripts, and Comfyroll plugins, or translation/prompt input will fail.
Translation: DeepTranslatorTextNode calls an online translation service (Google Translate in this workflow); ensure network access or configure a proxy.
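When translating prompts in a script outside ComfyUI, a network failure does not have to stop the run. The sketch below wraps any translation callable and falls back to the original text; translate_or_passthrough and the offline stub are hypothetical names used only for illustration.

```python
def translate_or_passthrough(text, translate):
    """Return translate(text), or the original text if translation fails or is empty."""
    try:
        result = translate(text)
    except Exception:
        # Network errors, blocked hosts, and rate limits all end up here.
        return text
    return result if result else text


def offline(_text):
    """Stand-in for a translator that cannot reach the network."""
    raise ConnectionError("no network")


print(translate_or_passthrough("red temple", offline))    # red temple
print(translate_or_passthrough("red temple", str.upper))  # RED TEMPLE
```

With the deep-translator package installed, the callable could be GoogleTranslator(source="en", target="zh-CN").translate, which is an assumption about how you would wire it up rather than something this workflow specifies.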