Mastering Style Transfer: A Comprehensive Guide to Image Generation
✅ Workflow Overview

This workflow, titled "Generate Images in the Same Style Based on a Reference Image", aims to generate images that replicate the style, composition, and color tone of a reference image.
Its core functionalities include:
✅ Loading a base model and VAE
🎯 Incorporating LoRA models to refine style
🖼️ Using a reference image for guidance
🔥 Generating new images with the same style
⚙️ Comparing the generated image with the reference
This workflow is particularly useful for style transfer, concept art generation, and generating consistent series of images with a shared aesthetic.
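Workflows like this can also be queued programmatically. Below is a minimal Python sketch of wrapping a workflow graph in the JSON body that ComfyUI's `POST /prompt` endpoint expects; the single `UNETLoader` node shown is a placeholder for the full graph, and the helper name is ours, not part of ComfyUI.

```python
import json
import uuid

def build_prompt_payload(workflow, client_id=None):
    """Wrap a workflow graph (node-id -> node dict) in the JSON body
    accepted by ComfyUI's POST /prompt endpoint."""
    return {"prompt": workflow, "client_id": client_id or uuid.uuid4().hex}

# Placeholder graph containing only the base-model loader node.
workflow = {
    "1": {
        "class_type": "UNETLoader",
        "inputs": {"unet_name": "基础算法_F.1", "weight_dtype": "fp8_e4m3fn"},
    },
}

payload = build_prompt_payload(workflow)
body = json.dumps(payload, ensure_ascii=False)
# POST `body` to http://127.0.0.1:8188/prompt to queue the workflow.
```

The `client_id` lets you match WebSocket progress events back to your request when the server streams execution updates.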
🔥 Core Models
Flux Base Model
Name: 基础算法_F.1
Function: The base model used for generating high-quality images, supporting realistic textures and lighting.
Loading Method: Loaded via the UNETLoader node.
LoRA Model
Name: 锦绣风华F.1_影视级古风人像写实_v1.0
Function: A LoRA model fine-tuned for ancient Chinese-style portraits. It pushes generated images toward a cinematic, classical portrait style.
Weight: 1.0
Loading Method: Loaded via the LoraLoaderModelOnly node.
Florence-2 Model
Name: microsoft/Florence-2-base
Function: A vision-language model used for image understanding and feature extraction; it interprets the reference image's content and style.
Loading Method: Loaded via the Florence2ModelLoader node.
🔧 Node Explanation
Here’s a detailed explanation of the key nodes used in this workflow:
🌟 Base Model Loading
UNETLoader
Function: Loads the base model (基础算法_F.1) for image generation.
Parameters:
Model: 基础算法_F.1
Precision: fp8_e4m3fn
Output: Model data for sampling.
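The fp8_e4m3fn precision halves the weight footprint relative to fp16. As a rough back-of-envelope (the parameter count below is an assumption; Flux-class models are on the order of 12B parameters):

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory needed just to hold the weights, in GiB."""
    return num_params * bytes_per_param / 1024**3

params = 12e9  # assumed parameter count for a Flux-class model
fp16_gib = weight_memory_gib(params, 2)  # ~22.4 GiB
fp8_gib = weight_memory_gib(params, 1)   # ~11.2 GiB
```

This is why fp8 loading makes the workflow feasible on the 12 GB-class GPUs mentioned in the tips section, at a small cost in numerical precision.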
VAELoader
Function: Loads the VAE (Variational Autoencoder) for decoding the generated image.
Parameters:
Model: ae.sft
Output: VAE data used during encoding/decoding.
🎯 LoRA Model Integration
LoraLoaderModelOnly
Function: Loads the 锦绣风华F.1 LoRA model to refine the image style.
Parameters:
Model name: 锦绣风华F.1_影视级古风人像写实_v1.0
Weight: 1.0
Output: Style-enhanced model data passed to the sampler.
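Conceptually, applying a LoRA at weight 1.0 adds its full low-rank update to the base weights: W' = W + weight · (B·A). A pure-Python illustration with toy matrices (not the actual model tensors):

```python
def apply_lora(W, A, B, weight):
    """Return W + weight * (B @ A) for nested-list matrices.
    W: [out][in], B: [out][r], A: [r][in], where r is the LoRA rank."""
    rank = len(A)
    return [
        [
            W[i][j] + weight * sum(B[i][k] * A[k][j] for k in range(rank))
            for j in range(len(W[0]))
        ]
        for i in range(len(W))
    ]

base = [[1.0, 0.0], [0.0, 1.0]]
merged = apply_lora(base, A=[[1.0, 1.0]], B=[[1.0], [0.0]], weight=1.0)
# merged == [[2.0, 1.0], [0.0, 1.0]]
```

Lowering the weight scales the update down linearly, which is exactly what "adjusting the LoRA weight fine-tunes the style similarity" means in practice.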
🖼️ Reference Image Handling
LoadImage
Function: Loads the reference image.
Parameters:
Image file: 0012.jpg
Output: Image data for style extraction.
Florence2ModelLoader
Function: Loads the Florence-2 model for image analysis.
Parameters:
Model name: microsoft/Florence-2-base
Precision: fp16
Output: Florence-2 model data.
Florence2Run
Function: Extracts image features and generates captions from the reference image.
Inputs:
Image: the reference image
Model: the loaded Florence-2 model
Outputs:
Caption: an automatically generated description of the reference image.
Image features: used for downstream conditioning.
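Florence-2 selects its behavior through special task tokens passed as the text prompt; captioning uses the `<CAPTION>` family of tokens. The tokens below are real Florence-2 task prompts, but the mapping and helper function are illustrative only, not part of any library:

```python
# Real Florence-2 task tokens; the dict and helper are our own sketch.
FLORENCE2_TASKS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "more_detailed_caption": "<MORE_DETAILED_CAPTION>",
}

def build_task_prompt(task):
    """Return the task token passed as text input to the Florence-2 processor."""
    return FLORENCE2_TASKS[task]

prompt = build_task_prompt("more_detailed_caption")
```

Richer caption tasks yield longer descriptions of the reference image, which in turn produce more detailed conditioning text.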
🔥 Sampling and Generation
KSampler
Function: Generates latent images based on the model, conditioning, and noise.
Parameters:
Random seed: 333721078257758
Steps: 30
CFG Scale: 3.5
Sampler type: euler
Scheduler: beta
Denoising strength: 0.68
Output: Latent image used for final rendering.
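Two of these parameters are worth unpacking. CFG 3.5 blends the unconditional and text-conditioned noise predictions, and a denoising strength of 0.68 means sampling starts from a partially noised version of the reference latent rather than from pure noise, so only about 68% of the schedule is actually re-denoised. A simplified sketch of both ideas (toy vectors, not real latents):

```python
def cfg_combine(uncond, cond, cfg_scale):
    """Classifier-free guidance: push the prediction toward the
    text-conditioned direction by cfg_scale."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]

def effective_denoise_steps(steps, denoise):
    """Approximate number of steps actually run when denoise < 1.0."""
    return round(steps * denoise)

guided = cfg_combine([0.0, 0.5], [1.0, 0.5], cfg_scale=3.5)  # [3.5, 0.5]
steps_run = effective_denoise_steps(30, 0.68)  # ~20 of 30 steps
```

This is why lowering the denoising strength keeps the output closer to the reference image: less of the original latent is destroyed and rebuilt.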
VAEDecode
Function: Decodes the latent image into the final image.
Inputs:
Latent image
VAE model
Output: Generated image.
🛠️ Image Processing
ImageResizeKJ
Function: Resizes the image to match the reference dimensions.
Parameters:
Width: 1536
Height: 1536
Algorithm: lanczos
Output: Resized image for processing.
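If the node's keep-proportion option is enabled, the image is scaled to fit inside the 1536x1536 target while preserving aspect ratio. The dimension math is simply:

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) to fit inside (max_width, max_height),
    preserving aspect ratio."""
    scale = min(max_width / width, max_height / height)
    return round(width * scale), round(height * scale)

fit_within(1024, 768, 1536, 1536)  # -> (1536, 1152)
```

Keeping the generated image and the reference at matching dimensions also makes the slider comparison in the next step line up pixel-for-pixel.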
Image Comparer (rgthree)
Function: Compares the generated image with the reference image.
Mode: Side-by-side comparison with a slider.
Output: Comparison view.
🔥 Text Encoding and Conditioning
CLIPTextEncode
Function: Encodes the text prompt into conditioning data for the generation process.
Prompt: White architectural design, interwoven geometric shapes, minimalist style, white and gray tones, large glass windows, seamless indoor and outdoor spaces, embellished with greenery, marked with "YIYUESJ", the overall atmosphere is fashionable and upscale. Night view effect
Output: Text conditioning for the sampler.
🔍 Workflow Structure
The workflow is divided into several Groups, each serving a specific purpose:
✅ Base Model Loading
Location: Top left
Function: Loads the base model and VAE.
Input: Model and VAE paths
Output: Model data for image generation.
🎯 Reference Image Input
Location: Bottom left
Function: Loads and processes the reference image.
Input: Image file
Output: Image data and extracted features.
🔥 LoRA Model Selection
Location: Center-left
Function: Loads the LoRA model to influence the style.
Input: Base model
Output: Style-enhanced model.
🛠️ Prompt Writing
Location: Center
Function: Encodes the text prompt into conditioning data.
Input: Text prompt
Output: Conditioning data.
🖼️ Image Output
Location: Right
Function: Generates and saves the final image.
Input: Latent image, VAE
Output: Final image file.
🔥 Image Comparison
Location: Top right
Function: Compares the generated image with the reference image.
Input: Two images
Output: Side-by-side comparison.
🔑 Inputs & Outputs
Input Parameters:
Reference image (e.g., 0012.jpg)
Text prompt
Flux base model and LoRA model
Random seed: 333721078257758
Output:
Generated image
Comparison image (reference vs generated)
⚠️ Tips & Considerations
✅ Model Dependencies:
This workflow relies on Florence-2 and LoRA models. Ensure they are properly installed.
⚡ Performance Requirements:
An NVIDIA GPU with at least 12 GB of VRAM is recommended for smooth performance.
Large images consume more VRAM.
⚠️ Output Variations:
Different random seeds produce different image results.
Adjusting the LoRA weight fine-tunes the style similarity.
🔥 Optimization Tip:
For consistent results, use a fixed seed and consistent image dimensions.
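Seed-driven reproducibility is easy to demonstrate: the same seed always yields the same noise sequence, which is why a fixed seed plus fixed dimensions gives repeatable results. Python's stdlib generator stands in here for the sampler's latent-noise generator:

```python
import random

def sample_noise(seed, n=4):
    """Deterministic pseudo-noise from a seed (stand-in for latent noise)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

a = sample_noise(333721078257758)
b = sample_noise(333721078257758)
c = sample_noise(333721078257759)
# a == b, while c differs
```

Change exactly one input at a time (seed, prompt, or LoRA weight) when iterating, so you can attribute each change in the output to a single cause.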