From Abstract to Stunning: Mastering AI-Driven Image Generation with LoRA Style Control and Captioning

ComfyUI.org
2025-05-20 07:05:36

1. Workflow Overview

This workflow is designed for image padding and style enhancement, integrating image captioning, LoRA style control, and text-to-image generation. Key uses:

  • Style Transfer: Generate stylized images based on reference input (e.g., abstract art).

  • Detail Enhancement: Apply LoRAs (e.g., Anime-Chinese Beauty FLUX_1.0) for specific styles.

  • Multilingual: Supports mixed Chinese/English prompts.

Core Models:

  • F.1-fp8 11G: Base model (VRAM-optimized).

  • Meta-Llama-3.1-8B: Image captioning.

  • CatPaw_Anime-ChineseBeauty_FLUX_1.0: Style LoRA.


2. Key Components

Critical Nodes:

  1. Joy_caption_two:

    • Uses Meta-Llama-3.1-8B to generate image descriptions (e.g., of abstract line art).

    • Install the node via ComfyUI Manager; it relies on the unsloth/Meta-Llama-3.1-8B-Instruct model.

  2. LoraLoader:

    • Loads a style LoRA (e.g., Anime-Chinese Beauty) with adjustable strength (default: 0.8).

  3. CLIPTextEncodeFlux:

    • Merges user prompts (e.g., miluo_cjsj, cloth) with captions for conditioning.

  4. KSampler:

    • Settings:

      • Steps: 20

      • Sampler: euler

      • Seed: Random (fix it, e.g., to 6368394736575, for reproducible results).
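
The settings listed above can be written down in ComfyUI's API-style node format. The following is a minimal sketch, not the workflow's actual export: the node IDs, the merge_prompt helper, the LoRA file name, and the strength_clip value are assumptions made for illustration, and inputs that link to other nodes (model, clip, conditioning, latent) are left out.

```python
# Sketch of the key node settings as {node_id: {"class_type", "inputs"}}.
# Only literal values are shown; links between nodes are omitted.

def merge_prompt(user_keywords, caption):
    """Hypothetical helper: join user keywords and a caption into one prompt."""
    parts = [kw.strip() for kw in user_keywords if kw.strip()]
    if caption.strip():
        parts.append(caption.strip())
    return ", ".join(parts)


caption = "Digital artwork with abstract colorful lines, deep blue background"
prompt_text = merge_prompt(["miluo_cjsj", "cloth"], caption)

nodes = {
    "lora": {
        "class_type": "LoraLoader",
        "inputs": {
            # Assumed file name; use whatever name the LoRA has on disk.
            "lora_name": "CatPaw_Anime-ChineseBeauty_FLUX_1.0.safetensors",
            "strength_model": 0.8,  # default strength from the article
            "strength_clip": 0.8,   # assumption: same strength for CLIP
        },
    },
    "encode": {
        "class_type": "CLIPTextEncodeFlux",
        "inputs": {
            "clip_l": prompt_text,
            "t5xxl": prompt_text,
            "guidance": 3.5,  # the 3.5 default mentioned in the Notes section
        },
    },
    "sampler": {
        "class_type": "KSampler",
        "inputs": {
            "seed": 6368394736575,  # fixed seed from the article
            "steps": 20,
            "sampler_name": "euler",
        },
    },
}

print(prompt_text)
```

Writing the values this way makes it easy to compare them against the node widgets in the ComfyUI canvas.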

Dependencies:

  • Download the F.1-fp8 model and the ae.sft VAE into the matching subfolders of ComfyUI/models (the VAE goes in ComfyUI/models/vae).


3. Workflow Structure

  1. Input Group (Group 2):

    • Load image (e.g., @rawandrendered.jpg) → Caption → Translate.

  2. Generation Group (Group 1):

    • Fuse prompts + captions → Apply LoRA → Generate image (600x800).

  3. Output:

    • Decode latent → Preview/save image.
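
The three stages above imply an execution order. As a rough sketch, the dependencies can be modeled as a small graph and sorted with Python's standard graphlib module. The exact wiring here is an assumption inferred from the stage descriptions, not read from the workflow file; the translation step and the base model loader are left out for brevity.

```python
from graphlib import TopologicalSorter

# Each key maps a node to the nodes it depends on (its upstream inputs).
# Assumed wiring, inferred from the Input, Generation, and Output stages.
depends_on = {
    "Joy_caption_two": {"LoadImage"},
    "CLIPTextEncodeFlux": {"Joy_caption_two", "LoraLoader"},
    "KSampler": {"CLIPTextEncodeFlux", "LoraLoader", "EmptyLatentImage"},
    "VAEDecode": {"KSampler"},
    "PreviewImage": {"VAEDecode"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())

for step, name in enumerate(order, start=1):
    print(step, name)
```

Several valid orderings exist for the independent inputs (LoadImage, LoraLoader, EmptyLatentImage), but PreviewImage always comes last because everything else feeds into it.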

Key Parameters:

  • Resolution: Set via EmptyLatentImage (default: 600x800).

  • LoRA Strength: Adjust via ReroutePrimitive (default: 0.8).
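
EmptyLatentImage allocates a latent tensor that is smaller than the final image. A small sketch, assuming the usual downscale factor of 8 per side (the latent_size helper is hypothetical, not a ComfyUI function), shows how the default 600x800 resolution maps to latent dimensions and why width and height should be multiples of 8:

```python
def latent_size(width, height, factor=8):
    """Return the (width, height) of the latent for a given pixel resolution.

    Assumes the VAE downscales each side by `factor` (8 is the common value).
    """
    if width % factor or height % factor:
        raise ValueError(
            f"{width}x{height} is not a multiple of {factor} on both sides"
        )
    return width // factor, height // factor


print(latent_size(600, 800))  # (75, 100)
```

A resolution such as 601x800 would raise an error here, which mirrors the reason to pick sizes in steps of 8.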


4. Input & Output

Input Parameters:

  • Image: JPG/PNG (e.g., 1440x1440 abstract art).

  • Text Prompt: Optional keywords (e.g., miluo_cjsj, cloth).

  • LoRA: Select from preset styles.

Output:

  • Stylized image (e.g., Chinese anime style) in PreviewImage.

  • Example caption:

    "Digital artwork with abstract colorful lines, deep blue background, reflective effects..."


5. Notes

  • VRAM: At least 8GB required (with the FP8-optimized model).

  • Troubleshooting:

    • Missing Joy_caption_two? Install comfyui_slk_joy_caption_two.

    • Match image size to EmptyLatentImage (e.g., 600x800).

  • Style Control:

    • Adjust LoRA strength (0-1) for intensity.

    • Modify the guidance value (default: 3.5) in CLIPTextEncodeFlux.
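
For the size-matching tip in Troubleshooting, one option is to crop the input to the target aspect ratio before resizing it. The sketch below only computes the crop rectangle; the center_crop_box helper is hypothetical and centering the crop is an assumption, since the article does not say how the image should be matched.

```python
def center_crop_box(src_w, src_h, dst_w, dst_h):
    """Return (left, top, right, bottom) of the largest centered crop
    of a src_w x src_h image that has the dst_w:dst_h aspect ratio."""
    # Compare aspect ratios by cross-multiplying to stay in integers.
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_w, crop_h = src_h * dst_w // dst_h, src_h
    else:
        # Source is taller (or equal): keep full width, trim top and bottom.
        crop_w, crop_h = src_w, src_w * dst_h // dst_w
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


# 1440x1440 input matched to the 600x800 (3:4) default resolution.
print(center_crop_box(1440, 1440, 600, 800))  # (180, 0, 1260, 1440)
```

The resulting 1080x1440 region has the same 3:4 ratio as 600x800, so it can be scaled down without distortion.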