OpenVINO Stable Diffusion (with LoRA) C++ pipeline

This pure C++ text-to-image pipeline uses the OpenVINO native API to run Stable Diffusion v1.5 with the LMS Discrete Scheduler, and supports both static and dynamic model inference. It also supports LoRA integration via safetensors and uses the OpenVINO extension for tokenizers. The demo has been tested on Windows and Linux.

Note

This tutorial assumes that the current working directory is <openvino.genai repo>/image_generation/stable_diffusion_1_5/cpp/ and all paths are relative to this folder.

Step 1: Prepare build environment

C++ Packages:

CMake: Cross-platform build tool
OpenVINO: Model inference
Eigen3: LoRA enabling

Prepare a Python environment and install dependencies:

conda create -n openvino_sd_cpp python==3.10
conda activate openvino_sd_cpp
conda install openvino eigen c-compiler cxx-compiler make

Step 2: Convert Stable Diffusion v1.5 and Tokenizer models

Stable Diffusion v1.5 model:

Install dependencies to import models from HuggingFace:

conda activate openvino_sd_cpp
python -m pip install -r scripts/requirements.txt
python -m pip install ../../../thirdparty/openvino_contrib/modules/custom_operations/[transformers]

Download a Hugging Face SD v1.5 model, for example:

runwayml/stable-diffusion-v1-5
dreamlike-anime-1.0, to run Stable Diffusion with LoRA adapters.

Example command:

huggingface-cli download --resume-download --local-dir-use-symlinks False dreamlike-art/dreamlike-anime-1.0 --local-dir models/dreamlike-anime-1.0

Refer to the official website for more details on downloading models.

Run the model conversion script to convert the PyTorch model to OpenVINO IR via optimum-intel. Use scripts/convert_model.py to convert the model to FP16_static or FP16_dyn; the converted model is saved in the models folder:

cd scripts
python convert_model.py -b 1 -t FP16 -sd ../models/dreamlike-anime-1.0 # to convert to models with static shapes
python convert_model.py -b 1 -t FP16 -sd ../models/dreamlike-anime-1.0 -dyn True # to keep models with dynamic shapes
python convert_model.py -b 1 -t INT8 -sd ../models/dreamlike-anime-1.0 -dyn True # to compress the models to INT8

Note

The pipeline currently supports batch size 1 only, i.e. a static model with shape (1, 3, 512, 512).

LoRA enabling with safetensors

Refer to the Python pipeline blog. The safetensors model is loaded via safetensors.h. The layer names and weights are modified with the Eigen library and inserted into the SD model with ov::pass::MatcherPass in the file common/diffusers/src/lora.cpp.
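As a rough illustration of the weight update behind LoRA, the sketch below merges a low-rank pair into a base weight as W += alpha * (up * down). It is not the code from lora.cpp: the `Matrix` type and `merge_lora` helper are hypothetical names, plain `std::vector` stands in for the Eigen types so the example has no dependencies, and any per-layer scaling stored in the safetensors file is assumed to be already folded into `alpha`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major dense matrix; a dependency-free stand-in for an Eigen matrix.
struct Matrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<float> data;

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}

    float& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Hypothetical helper: adds the scaled low-rank product to the base weight,
// weight += alpha * (up * down), where up is (out x rank) and down is (rank x in).
void merge_lora(Matrix& weight, const Matrix& up, const Matrix& down, float alpha) {
    assert(up.cols == down.rows);                               // shared rank dimension
    assert(weight.rows == up.rows && weight.cols == down.cols); // result fits the weight

    for (std::size_t i = 0; i < weight.rows; ++i) {
        for (std::size_t j = 0; j < weight.cols; ++j) {
            float delta = 0.0f;
            for (std::size_t k = 0; k < up.cols; ++k) {
                delta += up.at(i, k) * down.at(k, j);
            }
            weight.at(i, j) += alpha * delta;
        }
    }
}
```

With `alpha` set to 0 the base weight is left unchanged, which is a quick way to check that a merge step is wired up correctly before comparing images.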

The SD model dreamlike-anime-1.0 and the LoRA soulcard have been tested in this pipeline.

Download the safetensors file and the model IR and put them into the models folder.

Step 3: Build the SD application

conda activate openvino_sd_cpp
cmake -DCMAKE_BUILD_TYPE=Release -S . -B build
cmake --build build --parallel

Step 4: Run Pipeline

./stable_diffusion [-p <posPrompt>] [-n <negPrompt>] [-s <seed>] [--height <output image>] [--width <output image>] [-d <device>] [-r <readNPLatent>] [-l <lora.safetensors>] [-a <alpha>] [-h <help>] [-m <modelPath>] [-t <modelType>]

Usage:
  stable_diffusion [OPTION...]

-p, --posPrompt arg Initial positive prompt for SD (default: cyberpunk cityscape like Tokyo New York with tall buildings at dusk golden hour cinematic lighting)
-n, --negPrompt arg Default is empty with space (default: )
-d, --device arg AUTO, CPU, or GPU (default: CPU)
--step arg Number of diffusion steps (default: 20)
-s, --seed arg Random seed used to generate the latent (default: 42)
--num arg Number of output images (default: 1)
--height arg Height of output image (default: 512)
--width arg Width of output image (default: 512)
-c, --useCache Use model caching
-r, --readNPLatent Read numpy generated latents from file
-m, --modelPath arg Specify path of SD model IR (default: ../models/dreamlike-anime-1.0)
-t, --type arg Specify the type of SD model IR (FP16_static or FP16_dyn) (default: FP16_static)
-l, --loraPath arg Specify path of lora file. (*.safetensors). (default: )
-a, --alpha arg alpha for lora (default: 0.75)
-h, --help Print usage

Examples

Positive prompt: cyberpunk cityscape like Tokyo New York with tall buildings at dusk golden hour cinematic lighting

Negative prompt: (empty; the OV tokenizer cannot be used here, see the issues for details)

Read the NumPy-generated latent instead of generating it with the C++ standard library, for alignment with the Python pipeline:

Generate an image without LoRA: ./stable_diffusion -r
Generate an image with the soulcard LoRA: ./stable_diffusion -r -l <lora.safetensors>
Generate an image of a different size with a dynamic model (latent generated by the C++ library): ./stable_diffusion -m ../models/dreamlike-anime-1.0 -t FP16_dyn --height 448 --width 704
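When the output size changes, the latent tensor fed to the dynamic model changes with it. The sketch below shows that relationship under the usual Stable Diffusion v1.x assumptions of 4 latent channels and a VAE downscale factor of 8; `latent_shape` is a hypothetical helper, not a function from this pipeline.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: NCHW latent shape for a requested output image size.
// Assumes batch size 1, 4 latent channels and a spatial downscale factor of 8.
std::array<std::size_t, 4> latent_shape(std::size_t height, std::size_t width) {
    constexpr std::size_t kBatch = 1;
    constexpr std::size_t kChannels = 4;
    constexpr std::size_t kScale = 8;

    // Sizes that are not multiples of 8 do not map onto a whole latent grid.
    assert(height % kScale == 0 && width % kScale == 0);

    return {kBatch, kChannels, height / kScale, width / kScale};
}
```

For the 448 x 704 example above this gives a latent of shape (1, 4, 56, 88), compared with (1, 4, 64, 64) for the default 512 x 512 output.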

Notes:

For generation quality, be careful with the negative prompt and the random latent generation. C++ random generation with MT19937 produces results that differ from numpy.random.randn(). Hence, use -r, --readNPLatent for alignment with Python (this latent file is for 512x512 output images only).
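To make the point about random latents concrete, here is a minimal sketch of generating a latent with the C++ standard library. `make_latent` is a hypothetical helper, not the pipeline's own code. The output is reproducible for a given seed within one build, but the C++ standard does not fix the algorithm that `std::normal_distribution` uses to turn generator output into normal samples, so the values need not match NumPy (or another standard library) even when the seed is the same.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: returns `count` standard-normal samples drawn from an
// MT19937 generator seeded with `seed`.
std::vector<float> make_latent(std::uint32_t seed, std::size_t count) {
    assert(count > 0);  // an empty latent is almost certainly a caller bug

    std::mt19937 generator(seed);
    std::normal_distribution<float> normal(0.0f, 1.0f);

    std::vector<float> latent(count);
    for (float& value : latent) {
        value = normal(generator);
    }
    return latent;
}
```

For a 512x512 image, and assuming the usual 4 x 64 x 64 latent, the call would be `make_latent(42, 4 * 64 * 64)`.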
