RefControl: Flux Kontext Reference+Depth Fusion LoRA - v1.0

Depth Reference Fusion LoRA

📝 Short description

A LoRA for Flux Kontext Dev that fuses a reference image (left) with a depth map (right).
It preserves identity and style from the reference while following the pose and structure from the depth map.

Trigger word: redepthkontext

Demo Video

Example 2

Example 3

📖 Extended description

This LoRA was primarily trained on humans, but it also works with objects.
Its main purpose is to preserve identity — facial features, clothing, or object characteristics — from the reference image, while adapting them to the pose and composition defined by the depth map.

⚙️ How to use

Concatenate two images side by side:
- Left: reference image (person or object)
- Right: depth map (grayscale or silhouette)
Add the trigger word redepthkontext in your prompt.
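
The side-by-side input can also be prepared outside ComfyUI. Below is a minimal sketch using Pillow. The helper name `stitch_reference_and_depth` and the choice to match heights by resizing the depth map are assumptions for illustration, not part of the author's workflow.

```python
from PIL import Image


def stitch_reference_and_depth(reference: Image.Image, depth: Image.Image) -> Image.Image:
    """Place the reference image on the left and the depth map on the right."""
    reference = reference.convert("RGB")
    depth = depth.convert("RGB")  # a grayscale depth map becomes 3-channel

    # Assumption: scale the depth map to the reference height, keeping its aspect ratio.
    if depth.height != reference.height:
        new_width = max(1, round(depth.width * reference.height / depth.height))
        depth = depth.resize((new_width, reference.height), Image.LANCZOS)

    canvas = Image.new("RGB", (reference.width + depth.width, reference.height))
    canvas.paste(reference, (0, 0))            # left half: reference
    canvas.paste(depth, (reference.width, 0))  # right half: depth map
    return canvas
```

The stitched image is what the LoRA expects as its input, together with the trigger word in the prompt.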

✅ Example prompt

redepthkontext change depth map to photo

🎯 What it does

Preserves character or object identity across generations.
Embeds the subject into the new pose/scene defined by the depth map.
Works best when the depth map has similar proportions and sizes to the reference.

⚡ Tips

Works better if the depth map is not drastically different in object scale.
Can be combined with text prompts for additional background/environment control.

📌 Use cases

Human portraits in different poses.
Consistent character design across multiple scenes.
Object transformations (cars, furniture, props) with depth-guided placement.
Storyboarding, comics, or animation frame generation.

Comments (39)

orzechowy3334318 · Aug 17, 2025

CivitAI

It works great! Thank you!

thedeoxen

Author

Aug 17, 2025

Glad to hear, thank you! 🙌

zml_w · Aug 17, 2025

CivitAI

Interesting idea!

thedeoxen

Author

Aug 17, 2025

Thank you!

tsunamix · Aug 17, 2025 · 1 reaction

CivitAI

The only catch is that when I load two 1024x1024 images, it stitches them and resizes 2048x1024 to 1456x720, so in the end we get a modified 720x720 image, not the 1024x1024 image.

thedeoxen

Author

Aug 17, 2025

Yeah, this is a limitation of Flux, and it is hard to fix :(

Cropping the reference object to a vertical aspect ratio can help a bit, if that is possible in your case.
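
The resize described in this thread can be reproduced as plain arithmetic. The sketch below assumes that ComfyUI's FluxKontextImageScale node picks, from a fixed list of preferred resolutions, the one whose aspect ratio is closest to the input. The list is reproduced here as an assumption and should be checked against your installed ComfyUI version. The function names are hypothetical.

```python
# Assumption: this list mirrors the preferred Kontext resolutions used by
# ComfyUI's FluxKontextImageScale node; verify it against your installation.
PREFERRED_KONTEXT_RESOLUTIONS = [
    (672, 1568), (688, 1504), (720, 1456), (752, 1392), (800, 1328),
    (832, 1248), (880, 1184), (944, 1104), (1024, 1024), (1104, 944),
    (1184, 880), (1248, 832), (1328, 800), (1392, 752), (1456, 720),
    (1504, 688), (1568, 672),
]


def kontext_target_size(width: int, height: int) -> tuple[int, int]:
    """Return the preferred (width, height) whose aspect ratio is closest to the input."""
    aspect = width / height
    return min(
        PREFERRED_KONTEXT_RESOLUTIONS,
        key=lambda wh: abs(aspect - wh[0] / wh[1]),
    )


def per_image_size(stitched_width: int, stitched_height: int) -> tuple[int, int]:
    """Size of each half after two equally wide images are stitched and rescaled."""
    target_w, target_h = kontext_target_size(stitched_width, stitched_height)
    return target_w // 2, target_h


print(kontext_target_size(2048, 1024))  # (1456, 720)
print(per_image_size(2048, 1024))       # (728, 720)
```

Under these assumptions, two stitched 1024x1024 images end up at 1456x720, so each half is about 728x720, which matches the roughly 720x720 result reported above.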

zml_w · Aug 17, 2025

This problem is easy to solve. Just feed the 2048*1024 image that has not passed through the FluxKontextImageScale node into the latent input! Only the positive conditioning needs to go through FluxKontextImageScale and ReferenceLatent.

kaptainkory · Aug 17, 2025

Does the LoRA not work by just chaining together two ReferenceLatent nodes, without stitching?

zml_w · Aug 17, 2025

kaptainkory I guess it won't work. It contains latent information, and after merging, that information would get mixed up.

thedeoxen

Author

Aug 17, 2025

zml_w
Thank you for your comment! Good advice. You are right, it can work, though maybe not as well as with the recommended resolution. But it is worth a try!

thedeoxen

Author

Aug 17, 2025

kaptainkory The LoRA was trained in a way that requires both images (reference and depth) to be in the latent that is fed into the sampler. So it probably won't.

orzechowy3334318 · Aug 17, 2025 · 1 reaction

Let's say I would like to create a 1280x832 image using the LoRA.
My flow for this LoRA is:

1) Manually prepare a subject image, 768x768 (for example: a standing girl and a bicycle)
2) Manually prepare an action image, 768x768 (for example: a girl riding a bike)
-- The rest is in the Comfy workflow
3) Extract a depth map from the action image
4) Stitch the subject image and the depth image
5) Run Kontext
6) Split the output into separate images and take the right half
7) Prepare a black 1280x832 image
8) Resize the split output to the shorter destination side (Resize to shortest): 768x768 -> 832x832
9) Blend the split output onto the center of the black image
10) Run Kontext once again, but without the LoRA, with the prompt: fill black matching to the image in the center (you can improve the prompt)
-- End of the Kontext flow, but...
Because Kontext is pretty shitty by itself, I copy (Clipspace) my Kontext image as input to Qwen img2img with denoise strength 0.3 and the res_multistep sampler.
-- End of flow. I get a pretty image with the subjects and action I want.
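
The geometry of steps 6 to 9 can be sketched in plain Python. The helper names are hypothetical, and "Resize to shortest" is interpreted here as scaling the image so that its shorter side matches the shorter side of the destination canvas. That is an assumption about the node's behavior, not a confirmed description of it.

```python
def right_half_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Crop box (left, upper, right, lower) for the right half of a stitched output."""
    return (width // 2, 0, width, height)


def resize_to_shortest(size: tuple[int, int], canvas: tuple[int, int]) -> tuple[int, int]:
    """Scale `size` uniformly so its shorter side equals the canvas's shorter side."""
    scale = min(canvas) / min(size)
    return (round(size[0] * scale), round(size[1] * scale))


def centered_offset(size: tuple[int, int], canvas: tuple[int, int]) -> tuple[int, int]:
    """Top-left corner that centers an image of `size` on `canvas`."""
    return ((canvas[0] - size[0]) // 2, (canvas[1] - size[1]) // 2)


canvas = (1280, 832)                                # step 7: black canvas
resized = resize_to_shortest((768, 768), canvas)    # step 8
offset = centered_offset(resized, canvas)           # step 9
print(resized, offset)                              # (832, 832) (224, 0)
```

The box and offset follow the usual (left, upper, right, lower) and (x, y) conventions, so they can be passed to an image library's crop and paste calls.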

tsunamix · Aug 18, 2025

zml_w I'm not sure I got the last sentence right, but I tried inputting a stitched image 1.5x larger than recommended. The result wasn't good.

thedeoxen

Author

Aug 18, 2025· 1 reaction

kaptainkory I checked further: right now it doesn't work this way, but I definitely should retrain it to support this usage. Thank you for the idea!

civitaimaster · Aug 18, 2025

CivitAI

Hello, it is really nice! May I ask how you train such a model? What does one entry in the dataset look like? And how big was the dataset? Thank you!

thedeoxen

Author

Aug 19, 2025· 2 reactions

hey.
Sorry, I am planning to add some more LoRAs with a similar pipeline, so I’m not sharing the full dataset preparation details right now. Once I finish this series of LoRAs, I may release the pipeline publicly so that others can reproduce and adapt it

LastDelivery4801226 · Aug 19, 2025

thedeoxen are you thinking of making a similar one for lineart ControlNet too? That would be great. Thank you ❤️❤️❤️

thedeoxen

Author

Aug 19, 2025· 2 reactions

LastDelivery4801226 yep, line art, poses, canny edges planned :)

TurboCoomer · Aug 21, 2025

@thedeoxen my hero

thedeoxen

Author

Aug 27, 2025

@LastDelivery4801226 Hey, just released lineart 🙌 Hope it will be helpful for you
https://civitai.com/models/1902256?modelVersionId=2153190

SpicyBarnes · Aug 19, 2025 · 1 reaction

CivitAI

I seem to be the only person in the world who can't get this to work. Where is this depth extract node? I can't find it online or in the ComfyUI Manager. I'm also getting an error on the workflow. If anybody can help, that would be great.

thedeoxen

Author

Aug 19, 2025

Hi, I am sorry that the workflow didn't work for you.
I grouped the MiDaS Model Loader and MiDaS Depth Approximation nodes from this node pack:
https://github.com/WASasquatch/was-node-suite-comfyui

But actually any depth extraction node can be used.

TurboCoomer · Aug 21, 2025

Use DepthCrafter or Depth Anything.

spiritform · Aug 19, 2025 · 3 reactions

CivitAI

a workflow would be great, since the aspect ratio issues are not too clear ... would like a single image 1:1 with the depth map

benlaud · Aug 21, 2025 · 1 reaction

I just shared an example workflow. You may try it. P.S. The ratio doesn't need to be 1:1.

https://civitai.com/models/1887029/depth-reference-fusion-lora-example-workflow?modelVersionId=2135933

Kenchaila · Aug 22, 2025 · 1 reaction

CivitAI

It seems to work decently well with anime/digital art style with the few tests I've done - not nearly as good as with realistic images though. Could something like this be doable with lineart/canny too?

thedeoxen

Author

Aug 22, 2025· 1 reaction

Hey, thank you. Next week I am going to publish lineart and canny, maybe also poses.
I'll ping you when they are available :)

thedeoxen

Author

Aug 27, 2025

@Kenchaila
Hey, just released lineart 🙌 Hope it will be helpful for you
https://civitai.com/models/1902256?modelVersionId=2153190

Kenchaila · Sep 7, 2025

@thedeoxen Thanks for the heads up! Gonna test it out soon :)

omidconnect205 · Aug 27, 2025

CivitAI

Error:

GroundingDinoSAM2Segment (segment anything2)

Failed to get source for <function Enum._generate_next_value_ at 0x0000021869CE0AE0> using inspect.getsource

melvinmicky · Aug 27, 2025

CivitAI

Really like it so far, but I can't get the same quality as the reference images. Are you using an absurdly big resolution or something? I'm currently trying with around 1568 width, which makes the final output 784x672. That is not a great resolution, and the model seems to do badly on smaller details.

linnkoln · Sep 2, 2025

CivitAI

Was anybody able to make it work on ForgeUI?

cesa210 · Oct 7, 2025 · 3 reactions

CivitAI

ATTENTION Comfy UI users

There is an easier, faster and better looking way to use this LoRA.

Instead of using the Stitch node, which is terribly wasteful of resources and most of the time gives a less than ideal result, you should use the ReferenceLatent nodes: just daisy-chain two of these bad boys, then connect the depth image to the first one and the ref image to the second one.

Finally, for the latent input of the KSampler (or similar), use an Empty Latent node to define the dimensions.

Maybe I will post a workflow that uses this technique later.

artsawfreelance502 · Nov 7, 2025

CivitAI

Hey, that works great for me. Surprised how accurate the results are. Any tips for training? I want to train something similar myself.

thedeoxen

Author

Nov 13, 2025· 1 reaction

Hey, sorry for the delay. I open-sourced the datasets that I prepared for training.
https://huggingface.co/datasets/thedeoxen/refcontrol-flux-kontext-dataset
Some details about the training can be found here:
https://huggingface.co/thedeoxen/refcontrol-flux-kontext-reference-pose-lora/discussions/1#68addfb59ae343ad624e1ca7

artsawfreelance502 · Nov 14, 2025

@thedeoxen Oh, thanks man

altoiddealer · Jan 29, 2026 · 1 reaction

CivitAI

It's crazy to me that with the advent of Qwen Edit and Flux Klein, this is still the most powerful method to edit an image via depth map. Those models already have some fundamental concept of what to do when a depth map is passed as input, but the results are nowhere near as good as Kontext + your LoRA.

Do you have any plans to retrain on any of the newer architectures?

thedeoxen

Author

Feb 2, 2026· 4 reactions

Thank you, I am on it. There are some difficulties with them, but I hope to release it soon.

altoiddealer · Feb 2, 2026

@thedeoxen Amazing news! I'll be on the lookout

LORA

Flux.1 Kontext

by thedeoxen

depth

concept