Create Breathtaking Mini Worlds with Capsule Micro LoRA and Stable Diffusion

ComfyUI.org
2025-05-07 01:14:58

1. Workflow Overview

This workflow generates miniature cityscapes (e.g., the Qingdao skyline inside a pill capsule) by combining Stable Diffusion with a "Capsule Micro World" LoRA. It covers text encoding, latent image generation, and 2x upscaling of the decoded image (768x1024 becomes 1536x2048) for HD output.


2. Core Models

  • Stable Diffusion (UNETLoader): Base model 基础算法_F.1 ("Base Algorithm F.1"), loaded in FP8 precision.

  • Capsule Micro World LoRA: Enhances miniature details (weight=0.8).

  • T5-XXL & CLIP-L: Dual text encoders for prompt understanding.

  • 2xNomosUni Upscaler: Post-processing 2x super-resolution.
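
To make the weight=0.8 setting concrete: a LoRA stores a low-rank update for selected weight matrices, and the loader scales that update by the chosen strength before adding it to the base weights. The sketch below illustrates the idea with tiny pure-Python matrices. The matrix values and the helper names (matmul, apply_lora) are made up for illustration, and real loaders also apply a per-LoRA alpha/rank scale that is left out here.

```python
# Illustrative only: how a LoRA strength scales a low-rank update.
# Values and helper names are hypothetical; real loaders work on tensors.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_lora(base, up, down, strength):
    """Return base + strength * (up @ down)."""
    delta = matmul(up, down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]

base = [[1.0, 0.0], [0.0, 1.0]]  # 2x2 base weight matrix
up = [[1.0], [2.0]]              # 2x1 factor (rank 1)
down = [[0.5, 0.5]]              # 1x2 factor (rank 1)

merged = apply_lora(base, up, down, strength=0.8)
print(merged)
```

At strength 0 the base weights are returned unchanged, and at strength 1 the full update is applied, so 0.8 keeps most of the miniature style while leaving some room for the base model.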


3. Key Nodes

Node Name | Function | Installation | Dependencies
--------- | -------- | ------------ | ------------
DualCLIPLoader | Loads T5+CLIP text encoders | Built-in | Requires t5xxl_fp8_e4m3fn
Lora Loader Stack | Dynamic LoRA loading | Install rgthree-comfy | Needs LoRA files
SamplerCustomAdvanced | Advanced noise-controlled sampler | Built-in | None
ImageUpscaleWithModel | 2x image upscaling | Built-in | Requires 2xNomosUni model
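
Before loading the workflow, it can help to confirm that the nodes listed above are actually registered. A running ComfyUI server reports its node classes at the /object_info endpoint; the sketch below compares that list against the table. The helper names are hypothetical, the server address is the ComfyUI default, and the prefix match is an assumption to cope with custom node packs that register a class under a suffixed name.

```python
import json
from urllib.request import urlopen

# Node names as written in the table above. Assumption: a custom node pack
# may register a class under a longer name than its display title.
REQUIRED_NODES = [
    "DualCLIPLoader",
    "Lora Loader Stack",
    "SamplerCustomAdvanced",
    "ImageUpscaleWithModel",
]

def missing_nodes(available, required=REQUIRED_NODES):
    """Return required names that no available class name starts with."""
    return [
        name for name in required
        if not any(cls.startswith(name) for cls in available)
    ]

def fetch_available_nodes(base_url="http://127.0.0.1:8188"):
    """Ask a running ComfyUI server for its registered node class names."""
    with urlopen(f"{base_url}/object_info") as resp:
        return list(json.load(resp).keys())

# Offline demonstration with a hand-written list (no server needed).
sample = ["DualCLIPLoader", "SamplerCustomAdvanced", "ImageUpscaleWithModel"]
print(missing_nodes(sample))
```

With a server running, missing_nodes(fetch_available_nodes()) performs the same check against the live installation.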


4. Workflow Structure

  • Group 1: Text Input

    • CLIPTextEncode: Processes prompts (e.g., "Shanghai micro-city in a capsule").

  • Group 2: Image Generation

    • EmptyLatentImage: Sets resolution (768x1024).

    • SamplerCustomAdvanced: Samples the latent image using the LoRA-patched model.

  • Group 3: Post-Processing

    • VAEDecode: Decodes latent to image.

    • ImageUpscaleWithModel: 2x upscaling.
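
The three groups above can be sketched in ComfyUI's API (JSON) workflow format, where each node has a class_type and inputs, and a link is written as [source_node_id, output_index]. This is a rough sketch, not the original workflow: all file names, the sampler settings, and the seed are placeholders. The helper nodes that SamplerCustomAdvanced needs (RandomNoise, BasicGuider, KSamplerSelect, BasicScheduler) and the VAELoader are assumed. The built-in LoraLoader stands in for the rgthree Lora Loader Stack. The "flux" encoder type is a guess based on the T5-XXL plus CLIP-L pairing.

```python
# Sketch of the three groups in ComfyUI's API workflow format.
# File names, sampler settings, and the seed are placeholders, not values
# taken from the original workflow.

def node(class_type, **inputs):
    """Build one API-format node entry."""
    return {"class_type": class_type, "inputs": inputs}

workflow = {
    # Model loading (LoraLoader stands in for rgthree's Lora Loader Stack)
    "1": node("UNETLoader", unet_name="base_model.safetensors",
              weight_dtype="fp8_e4m3fn"),
    "2": node("DualCLIPLoader", clip_name1="t5xxl_fp8_e4m3fn.safetensors",
              clip_name2="clip_l.safetensors", type="flux"),
    "3": node("LoraLoader", model=["1", 0], clip=["2", 0],
              lora_name="capsule_micro_world.safetensors",
              strength_model=0.8, strength_clip=0.8),
    # Group 1: text input
    "4": node("CLIPTextEncode", clip=["3", 1],
              text="Shanghai micro-city in a capsule"),
    # Group 2: image generation
    "5": node("EmptyLatentImage", width=768, height=1024, batch_size=1),
    "6": node("RandomNoise", noise_seed=0),
    "7": node("BasicGuider", model=["3", 0], conditioning=["4", 0]),
    "8": node("KSamplerSelect", sampler_name="euler"),
    "9": node("BasicScheduler", model=["3", 0], scheduler="simple",
              steps=20, denoise=1.0),
    "10": node("SamplerCustomAdvanced", noise=["6", 0], guider=["7", 0],
               sampler=["8", 0], sigmas=["9", 0], latent_image=["5", 0]),
    # Group 3: post-processing
    "11": node("VAELoader", vae_name="vae.safetensors"),
    "12": node("VAEDecode", samples=["10", 0], vae=["11", 0]),
    "13": node("UpscaleModelLoader", model_name="2xNomosUni.safetensors"),
    "14": node("ImageUpscaleWithModel", upscale_model=["13", 0],
               image=["12", 0]),
    "15": node("SaveImage", images=["14", 0],
               filename_prefix="capsule_micro_world"),
}

def dangling_links(graph):
    """Return (node_id, input_name) pairs that point at a missing node."""
    bad = []
    for node_id, spec in graph.items():
        for name, value in spec["inputs"].items():
            is_link = (isinstance(value, list) and len(value) == 2
                       and isinstance(value[0], str))
            if is_link and value[0] not in graph:
                bad.append((node_id, name))
    return bad

print(dangling_links(workflow))
```

A graph in this shape can be queued on a running ComfyUI server by sending {"prompt": workflow} as JSON in a POST request to its /prompt endpoint.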


5. Inputs & Outputs

  • Input Parameters:

    • Required: Prompts, resolution (768x1024).

    • Optional: Seed (random by default), LoRA weight (default=0.8).

  • Output: Upscaled PNG (saved to wangyi AI Studio folder).
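
The inputs and the resulting output size can be summarized in a small parameter object. This is only an illustrative sketch: the class name, the field names, and the 64-bit seed range are assumptions, while the 768x1024 resolution, the 0.8 LoRA weight, and the 2x factor come from the text above.

```python
import random
from dataclasses import dataclass, field

@dataclass
class CapsuleWorkflowParams:
    """Hypothetical container for the workflow inputs described above."""
    prompt: str                  # required
    width: int = 768             # resolution used in this workflow
    height: int = 1024
    lora_weight: float = 0.8     # optional, default from the article
    # Optional; random by default. The 64-bit range is an assumption.
    seed: int = field(default_factory=lambda: random.randrange(2**64))
    upscale_factor: int = 2      # 2x upscaler

    def output_size(self):
        """Pixel size of the saved image after upscaling."""
        return (self.width * self.upscale_factor,
                self.height * self.upscale_factor)

params = CapsuleWorkflowParams(prompt="Shanghai micro-city in a capsule")
print(params.output_size())
```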


6. Notes

  1. Plugin: Install rgthree-comfy via ComfyUI Manager for LoRA stacking.

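  2. Model Paths: Place the LoRA (胶囊微缩世界, "Capsule Micro World") and the upscaler (2xNomosUni) in the correct folders, typically models/loras and models/upscale_models under the ComfyUI directory.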

  3. Performance: ≥8GB VRAM recommended; use the fp8_e4m3fn weights to reduce memory usage.

  4. Debug: If prompts fail to encode, check that both text encoders (T5-XXL and CLIP-L) are selected in DualCLIPLoader.
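
The model path note can be checked with a few lines of Python before launching ComfyUI. This sketch assumes the standard models/loras and models/upscale_models subfolders. The file names and the function name are hypothetical, so substitute the names of the files you actually downloaded.

```python
from pathlib import Path

# Hypothetical file names; replace them with your downloaded files.
EXPECTED_FILES = {
    "loras": ["capsule_micro_world.safetensors"],
    "upscale_models": ["2xNomosUni.safetensors"],
}

def find_missing(models_dir, expected=EXPECTED_FILES):
    """Return relative paths of expected model files that are not present."""
    root = Path(models_dir)
    return [
        str(Path(subfolder) / name)
        for subfolder, names in expected.items()
        for name in names
        if not (root / subfolder / name).is_file()
    ]

# Example: point this at the models folder of your ComfyUI installation.
print(find_missing("ComfyUI/models"))
```

An empty list means every expected file was found; anything printed is a file to move or download.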