Unlock Stunning Architectural Visuals with Stable Diffusion XL Workflow

ComfyUI.org
2025-03-19 10:30:36

🟢 1️⃣ Workflow Overview

Purpose and Function:
This workflow is designed for architectural design and rendering with Stable Diffusion XL (SDXL) as the base model. It combines LoRA models, ControlNet, IPAdapter, and image-processing nodes to generate and stylize architectural renderings. The workflow enables:

  • Generating high-quality architectural renderings from sketches or reference images.

  • Fine-tuning with Lora models to apply specific architectural styles.

  • Enhancing image consistency and stylization with IPAdapter and CLIP Vision.

  • Controlling depth and outline accuracy using ControlNet.

  • Applying upscaling, tone imitation, and image comparison to refine and verify the output.

Use Cases:

  • Architectural visualization

  • Exterior and interior rendering

  • Converting sketches into architectural renderings

  • Style transfer and detail enhancement


🔥 2️⃣ Core Models

✅ Stable Diffusion XL (SDXL)

  • Model Name: juggernautXL_juggXI_JuggernautXL_XI

  • Function: The main generative model, responsible for creating the final image from the prompts and reference inputs. SDXL provides higher detail and visual fidelity, making it ideal for high-resolution architectural renders.

✅ LoRA Models (Style Control)
The workflow uses several LoRA models to steer the architectural style, including:

  • Bilus Commercial Architecture V0.2: Optimized for modern commercial architecture.

  • MIR Architectural Rendering SDXL1.0: Enhances fine details in architectural renders.

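  • Flux Turbo Lora: Accelerates generation while maintaining quality. Because its name points to a Flux-family LoRA, confirm that it actually loads against the SDXL checkpoint.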

  • Cityscape SDXL v1.0 and Kiwi SDXL Nonlinear Architecture v1.0: Models for cityscape and abstract architectural styles.

✅ IPAdapter + CLIP Vision

  • IPAdapter: Integrates reference image features into the generation process, ensuring style consistency.

  • CLIP Vision Model: CLIP-ViT-H-14-laion2B-s32B-b79K extracts image features and encodes them into the workflow.

✅ ControlNet Model

  • Model Name: controlnet-union-sdxl-1.0-promax.safetensors

  • Function: Utilizes depth maps to precisely control the structure and perspective of the architectural render.


⚙️ 3️⃣ Key Components

1️⃣ CheckpointLoaderSimple

  • Function: Loads the primary SDXL model for image generation.

  • Model Name: juggernautXL_juggXI_JuggernautXL_XI

  • Installation:

    • Place .safetensors or .ckpt files in:

    • ComfyUI/models/checkpoints

2️⃣ LoraLoader

  • Function: Loads LoRA models and patches them into the base model and CLIP encoder to apply a specific style.

  • Parameters:

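    • Weight: Controls the influence of the LoRA style. Values between 0 and 1 are typical; on the LoraLoader node this is set through strength_model and strength_clip.

    • Model: The LoRA file to load (lora_name), here one of the architectural style models.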

  • Installation:

    • Place Lora models in:

    • ComfyUI/models/loras

    • Supported formats: .safetensors or .pt
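
To make the Weight parameter more concrete, here is a minimal sketch of how a LoRA is generally merged into one weight matrix: the low-rank product of two small matrices is scaled by the strength and added to the base weights. This illustrates the general LoRA idea only, not ComfyUI's internal code; the function names, the optional alpha scaling, and the toy matrices are assumptions made for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, up, down, strength=1.0, alpha=None):
    """Return weight + strength * (alpha / rank) * (up @ down).

    weight is out x in, up is out x rank, down is rank x in.
    When alpha is None, no extra alpha / rank scaling is applied.
    """
    rank = len(down)
    scale = strength * ((alpha / rank) if alpha is not None else 1.0)
    delta = matmul(up, down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy rank-1 example (hypothetical values).
base = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # 2 x 1
down = [[0.5, 0.5]]    # 1 x 2

print(apply_lora(base, up, down, strength=0.0))  # strength 0 leaves the base weights unchanged
print(apply_lora(base, up, down, strength=0.8))  # higher strength moves further toward the LoRA style
```

A strength of 0 returns the base weights untouched, which is why lowering the weight is the usual first step when a style LoRA overpowers the prompt.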

3️⃣ IPAdapterModelLoader & IPAdapterAdvanced

  • Function: Loads the IPAdapter model and reference image, ensuring stylistic consistency.

  • Model Name: ip-adapter-plus_sdxl_vit-h

  • Installation:

    • Place IPAdapter models in:

    • ComfyUI/models/ipadapter

    • Supported formats: .bin or .safetensors

  • Parameters:

    • Weight: Controls the similarity between the generated image and the reference image.

    • Mode: Uses the PLUS variant of the model (ip-adapter-plus_sdxl_vit-h) for stronger reference influence.

4️⃣ ControlNet Loader

  • Function: Loads the ControlNet model to control the depth and outline accuracy.

  • Model Name: controlnet-union-sdxl-1.0-promax.safetensors

  • Installation:

    • Place ControlNet models in:

    • ComfyUI/models/controlnet

5️⃣ VAE Loader

  • Function: Loads the VAE model for encoding and decoding images, improving visual quality.

  • Model Name: sd_xl_vae_1.0

  • Installation:

    • Place the VAE model in:

    • ComfyUI/models/vae
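
Since the workflow depends on files spread across several folders, a small pre-flight check can save a failed run. The sketch below walks the folder layout listed above and reports anything missing. The .safetensors extensions on the checkpoint, IPAdapter, and VAE names are assumptions (the text gives those names without an extension), and find_missing is a hypothetical helper, not part of ComfyUI.

```python
from pathlib import Path

# Sub-folders under ComfyUI/models as listed in this section.
# File extensions marked "assumed" are not stated in the article.
EXPECTED = {
    "checkpoints": ["juggernautXL_juggXI_JuggernautXL_XI.safetensors"],  # assumed extension
    "ipadapter": ["ip-adapter-plus_sdxl_vit-h.safetensors"],             # assumed extension
    "controlnet": ["controlnet-union-sdxl-1.0-promax.safetensors"],
    "vae": ["sd_xl_vae_1.0.safetensors"],                                # assumed extension
}


def find_missing(comfy_root, expected):
    """Return 'subdir/name' entries that do not exist under <comfy_root>/models."""
    models_dir = Path(comfy_root) / "models"
    missing = []
    for subdir, names in expected.items():
        for name in names:
            if not (models_dir / subdir / name).is_file():
                missing.append(f"{subdir}/{name}")
    return missing


# Adjust "ComfyUI" to the actual installation path.
for item in find_missing("ComfyUI", EXPECTED):
    print("missing:", item)
```

LoRA files can be added under a "loras" key in the same way once their exact file names are known.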


🔨 4️⃣ Workflow Structure

The workflow consists of the following main groups:

🟡 Group 1: Model Loading and Initialization

  • Components: CheckpointLoaderSimple, VAELoader, LoraLoader

  • Function: Loads the base SDXL model, VAE, and Lora models.

  • Output: The initialized model and CLIP encoder.
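
As a rough sketch of how this group could look in ComfyUI's API (JSON) format, the function below builds a checkpoint loader followed by a chain of LoraLoader nodes, each one taking the model and CLIP outputs of the previous node. The file names and strengths in the usage lines are placeholders, build_loader_graph is a hypothetical helper, and the VAELoader is left out for brevity.

```python
import json


def build_loader_graph(ckpt_name, loras):
    """Build an API-format graph: one CheckpointLoaderSimple, then chained LoraLoader nodes.

    loras is a list of (file_name, strength) pairs, applied in order.
    Returns the graph plus the final MODEL and CLIP links for downstream nodes.
    """
    graph = {
        "1": {
            "class_type": "CheckpointLoaderSimple",
            "inputs": {"ckpt_name": ckpt_name},
        }
    }
    model_link = ["1", 0]  # output 0 of the checkpoint loader is MODEL
    clip_link = ["1", 1]   # output 1 of the checkpoint loader is CLIP
    for index, (lora_name, strength) in enumerate(loras, start=2):
        node_id = str(index)
        graph[node_id] = {
            "class_type": "LoraLoader",
            "inputs": {
                "model": model_link,
                "clip": clip_link,
                "lora_name": lora_name,
                "strength_model": strength,
                "strength_clip": strength,
            },
        }
        # The next node in the chain consumes this node's outputs.
        model_link = [node_id, 0]
        clip_link = [node_id, 1]
    return graph, model_link, clip_link


# Placeholder file names and strengths, for illustration only.
graph, model_link, clip_link = build_loader_graph(
    "base_sdxl_checkpoint.safetensors",
    [("style_a.safetensors", 0.8), ("style_b.safetensors", 0.5)],
)
print(json.dumps(graph, indent=2))
```

Chaining the loaders this way means the order of the list is the order in which the LoRAs are applied, and the returned links are what the sampler and text encoders would connect to.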

🟢 Group 2: IPAdapter and ControlNet Configuration

  • Components: IPAdapterModelLoader, IPAdapterAdvanced, CLIPVisionLoader, ControlNet Loader

  • Function: Loads the IPAdapter and ControlNet models, applying reference image control and depth accuracy.

  • Output: The conditioned model with reference and depth control.

🔵 Group 3: Image Processing and Generation

  • Components: KSampler, VAEDecodeTiled

  • Function: Generates the architectural image based on the text prompt and ControlNet parameters.

  • Output: The generated architectural rendering.

🟣 Group 4: Post-processing and Preview

  • Components: ImitationHueNode, Image Comparer (rgthree)

  • Function:

    • Hue imitation for color consistency.

    • Image comparison to evaluate reference and generated image differences.

  • Output: The final architectural render preview.
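
The dependencies between the four groups can be written down as a small graph and sorted into an execution order. The group keys and the dependency edges below are one reading of the descriptions above (each group consuming the previous group's output), not something exported from the workflow file.

```python
from graphlib import TopologicalSorter

# Each group maps to the set of groups whose outputs it needs.
GROUP_DEPENDENCIES = {
    "model_loading": set(),
    "ipadapter_controlnet": {"model_loading"},
    "generation": {"model_loading", "ipadapter_controlnet"},
    "post_processing": {"generation"},
}


def execution_order(dependencies):
    """Return the groups in an order where every dependency runs first."""
    return list(TopologicalSorter(dependencies).static_order())


print(execution_order(GROUP_DEPENDENCIES))
```

ComfyUI resolves node order on its own; the sketch is only a way to check a mental model of the data flow when rearranging or bypassing groups.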


🔍 5️⃣ Inputs and Outputs

✅ Input Parameters:

  • Prompt: Text description of the architectural style, materials, and scene.

  • Reference Image: Image used as a style guide.

  • Resolution: Default 1024x768.

  • Lora Weight: Controls the intensity of the Lora style influence.
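
For the resolution setting, it can help to see what the sampler actually works on. SDXL's VAE downsamples by a factor of 8 into 4 latent channels, so the default 1024x768 image corresponds to a much smaller latent tensor. The helper name below is hypothetical.

```python
def sdxl_latent_shape(width, height, batch_size=1):
    """Return the (batch, channels, height, width) shape of the latent for an SDXL image."""
    if width % 8 or height % 8:
        raise ValueError("width and height should be multiples of 8")
    return (batch_size, 4, height // 8, width // 8)


print(sdxl_latent_shape(1024, 768))  # (1, 4, 96, 128)
```

Keeping both dimensions at multiples of 8 avoids cropping or padding when the image is encoded and decoded.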

✅ Output Results:

  • High-resolution architectural rendering.

  • Image comparison views (reference vs. generated image).


⚠️ 6️⃣ Considerations

  • Hardware Requirements:

    • The workflow uses multiple models and ControlNet, requiring a GPU with at least 12GB of VRAM for smooth execution.

  • Generation Time:

    • The use of ControlNet and IPAdapter increases processing time compared to standard SDXL workflows.

  • Model Compatibility:

    • Ensure that the LoRA models were trained for SDXL; LoRAs built for a different base model will not apply correctly.

  • Optimization Tips:

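    • Lowering the ControlNet strength loosens structural control but does not by itself shorten sampling. Applying ControlNet to only part of the schedule (for example, a lower end_percent if the workflow uses ControlNetApplyAdvanced) can reduce generation time.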

    • Use Tiled VAE decoding for efficient rendering of large images.
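
To illustrate why tiled decoding helps with large images, the sketch below splits an image into overlapping tiles, so that only one tile has to be decoded at a time. It shows the general tiling idea in pixel space, not the VAEDecodeTiled implementation; the tile size of 512 and overlap of 64 are illustrative values, and both function names are hypothetical.

```python
def tile_starts(length, tile, overlap):
    """Return start offsets of overlapping tiles that cover the range [0, length)."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # the last tile is aligned to the far edge
    return starts


def count_tiles(width, height, tile=512, overlap=64):
    """Return how many tiles are needed to cover a width x height image."""
    return len(tile_starts(width, tile, overlap)) * len(tile_starts(height, tile, overlap))


print(tile_starts(1024, 512, 64))
print(count_tiles(1024, 768))
```

Smaller tiles lower peak memory at the cost of more decode passes, and the overlap is what lets neighbouring tiles be blended without visible seams.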