This is a script for converting model differences to LoRA. It supports SD1.5, SDXL, and Z-Image. Since SD1.5 and SDXL differences can already be converted to LoRA with SuperMerger, in practice it is now a tool for Z-Image (both the Base and Turbo versions are supported, but please do not mix them). Because it was written by vibe coding, you should be able to adapt it to other models easily by feeding it to a suitable AI and asking it to "make it compatible with...".
To run it, you need torch, safetensors, and packaging installed in your Python environment (for example: pip install torch safetensors packaging). Running it with the -h option displays the help.
>ckpt_to_lora.py -h
usage: ckpt_to_lora.py [-h] --base BASE --variant VARIANT [--variant2 VARIANT2] --out OUT [--rank RANK]
[--rank-thresh RANK_THRESH] [--rank-max RANK_MAX] [--min-dim MIN_DIM] [--exclude EXCLUDE]
[--include INCLUDE] [--full-layer] [--union] [--alpha ALPHA] [--checkpoint] [--checkpoint-full]
[--verbose] [--summary] [--detail] [--detail-scale DETAIL_SCALE]
Extract the weight delta between a base model and one or two variant checkpoints (or LoRA files) and save the result as a LoRA or a merged checkpoint.
Single variant: converts (variant - base) directly to LoRA via truncated SVD.
Two variants: extracts the component common to both variants (--variant2), or a weighted union of their unique components (--union).
options:
-h, --help show this help message and exit
--base BASE Base model safetensors path
--variant VARIANT Variant model safetensors or LoRA path
--variant2 VARIANT2 Second variant model or LoRA. Enables dual-variant extraction: common (default) or union
(--union).
--out OUT Output safetensors path
--rank RANK LoRA rank (0 = auto-select, higher = more accurate but larger)
--rank-thresh RANK_THRESH
Explained-variance threshold for auto rank selection (default: 0.95)
--rank-max RANK_MAX Upper bound on auto-selected rank (default: 64)
--min-dim MIN_DIM Skip layers whose smallest dimension is below this value (default: 32)
--exclude EXCLUDE Comma-separated key segments to exclude, e.g. "refiner,embed"
--include INCLUDE Comma-separated key segments to include (all others skipped), e.g. "attn,ff"
--full-layer Also process 1D tensors (Norm weight/bias) stored as .diff; disables min-dim filter
--union With --variant2: output the weighted union of both variants instead of the common
(intersection) component
--alpha ALPHA --union balance: weights are w1=2*alpha, w2=2*(1-alpha), summing to 2. Default 0.5 gives
delta1 + delta2 equally. 0.75 → 1.5*delta1 + 0.5*delta2. Range [0, 1].
--checkpoint Save as checkpoint (base + low-rank-approx delta) instead of LoRA
--checkpoint-full Save as checkpoint (base + full-precision delta, no SVD)
--verbose Print per-layer S stats and show summary at the end
--summary Print grouped summary table at the end (no per-layer detail)
--detail Also save the residual (delta minus first LoRA) as out_detail.safetensors
--detail-scale DETAIL_SCALE
Multiply detail LoRA weights by this factor before saving. Use >1.0 to pre-amplify so the
loader weight can stay near 1.0 (default: 1.0)
While there are many options, it will work if you specify a base model, one derived model, and the output LoRA.
Example:
python ckpt_to_lora.py --base z_image_turbo_bf16.safetensors --variant LucidDreamerZiT_76A3.safetensors --out LD76A3.safetensors
This produces LD76A3.safetensors, a LoRA representation of the difference between LucidDreamerZiT_76A3.safetensors and the base model. Applying this LoRA to the base model yields images that look broadly similar to LucidDreamerZiT_76A3.safetensors (in truth the resemblance is not very close, so this model is not an ideal example).
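Conceptually, the single-variant path loads both state dicts, takes the per-layer weight delta, and factors it with a truncated SVD, as the help text describes. The following is a minimal sketch of that idea, not the script itself; the LoRA key naming, the bf16 output dtype, and the restriction to 2D tensors here are my assumptions:

import torch
from safetensors.torch import load_file, save_file

def delta_to_lora(base_path, variant_path, out_path, rank=64):
    # Minimal sketch: factor (variant - base) into low-rank pairs via truncated SVD.
    base, variant = load_file(base_path), load_file(variant_path)
    lora = {}
    for key, w_base in base.items():
        w_var = variant.get(key)
        if w_var is None or w_base.dim() != 2:  # the real script also handles other shapes
            continue
        delta = w_var.float() - w_base.float()
        u, s, vh = torch.linalg.svd(delta, full_matrices=False)
        r = min(rank, s.numel())
        # delta ~= (u[:, :r] * s[:r]) @ vh[:r]  =  lora_up @ lora_down
        lora[key + ".lora_up.weight"] = (u[:, :r] * s[:r]).contiguous().to(torch.bfloat16)
        lora[key + ".lora_down.weight"] = vh[:r].contiguous().to(torch.bfloat16)
    save_file(lora, out_path)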
Several parameters affect how many features are picked up. The first is the LoRA rank (--rank): the larger the rank, the more features are captured. By default it is automatic: a rank is chosen per layer so that the captured difference reaches the --rank-thresh value (default 95%). The auto-selected rank is also capped by --rank-max, which defaults to 64, so only LoRAs up to rank 64 are created; raising --rank-max usually increases the file size. A higher rank does capture finer features, but it also sweeps in many unimportant ones, so if you want highly pure features, avoid setting it excessively large.
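The auto-selection rule implied by the help text can be sketched as follows: pick the smallest rank whose cumulative energy (squared singular values) reaches --rank-thresh, then cap it at --rank-max. The exact criterion inside the script may differ:

import torch

def auto_rank(s: torch.Tensor, thresh: float = 0.95, rank_max: int = 64) -> int:
    # s: singular values of a layer delta, in descending order (as returned by SVD)
    energy = s.square()
    cum = torch.cumsum(energy, dim=0) / energy.sum()
    # first index where the cumulative explained variance reaches the threshold
    r = int(torch.searchsorted(cum, thresh).item()) + 1
    return min(r, rank_max, s.numel())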
For Z-Image, fewer than half of all layers are processed by default; as you can confirm by trying it, this is usually sufficient. Specifying --full-layer processes every layer (it also handles the 1D norm tensors and disables the --min-dim filter), but this does not necessarily improve the result. To process only specific layers use --include, and to skip specific layers use --exclude; use these only if you understand their meaning after reading the help (a sketch of one plausible reading follows).
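Here is one plausible reading of the filters; substring matching on the layer keys is my assumption, so check the script if the exact semantics matter:

def key_passes(key: str, include: str = "", exclude: str = "") -> bool:
    # --include: when given, keep only keys containing at least one listed segment
    if include and not any(seg in key for seg in include.split(",")):
        return False
    # --exclude: drop keys containing any listed segment
    if exclude and any(seg in key for seg in exclude.split(",")):
        return False
    return True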
There are also options to convert the information the first LoRA could not capture into a second LoRA (--detail), and to extract the intersection (--variant2 alone) or union (--variant2 with --union) of two models' differences; a sketch of these follows.
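Two of these are easy to pin down from the help text: with --union the weights are w1 = 2*alpha and w2 = 2*(1-alpha), so alpha=0.5 yields delta1 + delta2 and alpha=0.75 yields 1.5*delta1 + 0.5*delta2; with --detail the residual is what remains of the delta after subtracting the first LoRA's reconstruction. A sketch (how the script wires these together internally is an assumption):

import torch

def union_delta(delta1: torch.Tensor, delta2: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    # --union weighting from the help text: the two weights always sum to 2
    w1, w2 = 2.0 * alpha, 2.0 * (1.0 - alpha)
    return w1 * delta1 + w2 * delta2

def detail_residual(delta: torch.Tensor, lora_up: torch.Tensor, lora_down: torch.Tensor) -> torch.Tensor:
    # --detail: the part the first LoRA missed; it gets its own SVD pass
    # and is saved as out_detail.safetensors
    return delta - lora_up.float() @ lora_down.float()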

There are also functions for reporting the model's state during conversion: --summary and --verbose. --summary prints a grouped overview per module and layer level, while --verbose prints the same information plus per-layer statistics. These are meant for checking whether the chosen rank is sufficient, but since the SVD picks up features in descending order of importance, there is usually no need to raise the rank even when the numbers look overwhelming.
The numbers do not mean much in isolation; their interpretation depends on the situation. A large value in the rank column means many features are being handled, and S indicates how strongly each feature is expressed. It is tempting to read larger values as better training, but that is only partially true, and cases vary. The energy column appears to show the explained variance each group's LoRA captured, which is why the values sit near the 95% default of --rank-thresh.
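For example, adding --summary to the earlier command prints tables like the two below:

python ckpt_to_lora.py --base z_image_turbo_bf16.safetensors --variant LucidDreamerZiT_76A3.safetensors --out LD76A3.safetensors --summary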
・Summary display in Z-Image
==========================================================================
LoRA Summary
==========================================================================
Level              Type     n    rank   S_max   S_min   ratio   energy
--------------------------------------------------------------------------
L1 (blk 0-4)       Attn    12     4.2   35.41   12.14    4.1x    96.9%
                   FF      18     2.3   84.50   39.37    3.3x    96.8%
                   Mod      6    61.0    0.57    0.28    2.1x    96.2%
--------------------------------------------------------------------------
L2 (blk 5-10)      Attn    12     5.6   30.79    9.90    5.1x    96.0%
                   FF      18     3.3   51.32   16.70    4.3x    96.3%
                   Mod      6    61.0    0.58    0.27    2.1x    96.2%
--------------------------------------------------------------------------
L3 (blk 11-16)     Attn    12     9.7   22.85    3.38    6.8x    95.3%
                   FF      18     5.6   44.37    6.98    6.9x    95.8%
                   Mod      6    60.2    0.72    0.28    2.6x    95.4%
--------------------------------------------------------------------------
L4 (blk 17-22)     Attn    12    21.5   18.22    2.21    8.1x    95.3%
                   FF      18    12.8   38.97    4.88    8.2x    95.4%
                   Mod      6    59.5    0.91    0.26    3.5x    95.5%
--------------------------------------------------------------------------
L5 (blk 23-29)     Attn    12    26.2   17.81    2.11    8.7x    95.2%
                   FF      18    25.8   29.14    3.53    8.6x    95.3%
                   Mod      6    59.3    0.76    0.29    2.7x    95.6%
--------------------------------------------------------------------------
[ Auxiliary ]-------------------------------------------------------------
cap_embedder       Other    1    61.0    0.44    0.28    1.6x    96.0%
--------------------------------------------------------------------------
context_refiner    Attn     4    48.0    4.26    0.41    9.1x    95.3%
                   FF       6    34.2   10.82    0.48   22.5x    95.1%
--------------------------------------------------------------------------
final_layer        Mod      1    57.0    0.52    0.10    5.1x    95.1%
                   Proj     1    60.0    0.23    0.16    1.4x    95.7%
--------------------------------------------------------------------------
noise_refiner      Attn     4    51.5    3.36    0.33    9.2x    95.2%
                   FF       6    37.0    7.75    0.45   17.7x    95.2%
                   Mod      2    60.0    0.59    0.24    2.5x    95.2%
--------------------------------------------------------------------------
t_embedder         FF       2    56.5    0.14    0.03    6.4x    95.6%
--------------------------------------------------------------------------
x_embedder         Proj     1    60.0    0.10    0.07    1.5x    95.9%
--------------------------------------------------------------------------
Total              ALL    208    23.1                            95.8%
==========================================================================
・Summary display in SD1.5
==========================================================================
LoRA Summary
==========================================================================
Level              Type     n    rank   S_max   S_min   ratio   energy
--------------------------------------------------------------------------
IN-1               Attn    32    52.9    0.13    0.03    4.3x    95.3%
                   FF       8    56.2    0.27    0.08    3.5x    95.3%
                   Proj     8    57.9    0.07    0.03    2.6x    95.5%
                   Other   15    42.7    0.14    0.04    3.8x    95.3%
--------------------------------------------------------------------------
IN-2               Attn    16    55.6    0.24    0.07    3.5x    95.3%
                   FF       4    57.2    0.53    0.18    2.7x    95.4%
                   Proj     4    58.0    0.20    0.07    2.9x    95.4%
                   Other   15    44.7    0.35    0.11    3.7x    95.4%
--------------------------------------------------------------------------
MID                Attn     8    55.0    0.25    0.07    3.7x    95.4%
                   FF       2    56.5    0.59    0.18    3.0x    95.3%
                   Proj     2    55.5    0.28    0.08    3.6x    95.3%
                   Other    6    41.0    0.45    0.12    4.3x    95.5%
--------------------------------------------------------------------------
OUT-1              Attn    24    57.4    0.31    0.12    2.8x    95.5%
                   FF       6    59.0    0.69    0.32    2.0x    95.6%
                   Proj     6    59.3    0.24    0.12    2.0x    95.4%
                   Other   26    48.0    0.49    0.19    3.1x    95.6%
--------------------------------------------------------------------------
OUT-2              Attn    48    53.6    0.21    0.06    3.9x    95.3%
                   FF      12    57.4    0.32    0.16    2.1x    95.4%
                   Proj    12    57.4    0.12    0.05    2.4x    95.3%
                   Other   25    48.1    0.25    0.11    3.3x    95.5%
--------------------------------------------------------------------------
OTHER              Attn    48    56.0    0.05    0.01    3.5x    95.4%
                   FF      24    57.4    0.11    0.02    5.0x    95.3%
                   Other   50    59.1    3.00    1.42    2.0x    95.5%
--------------------------------------------------------------------------
[ Auxiliary ]-------------------------------------------------------------
cond_stage_model   Cond     2    59.5    0.03    0.01    1.9x    95.8%
--------------------------------------------------------------------------
first_stage_model  Attn     8    58.8    0.97    0.48    1.9x    95.4%
                   Other    9    56.0    3.75    1.26    2.4x    95.4%
--------------------------------------------------------------------------
time_embed         Cond     2    24.5    0.53    0.04   13.5x    95.2%
--------------------------------------------------------------------------
Total              ALL    422    54.1                            95.4%
==========================================================================
Description
v1.1: Fixed the union process, which was not working correctly.

