The LoRA creates a second image that can be combined with the original via image stitch to create a side-by-side (SBS) image that can be viewed in VR.
It works well with 3D, 2.5D, and photos. 2D images can be hit or miss, since they are barely represented in the dataset.
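For reference, the stitch step is just a horizontal concatenation of the original and the generated view. A minimal Python sketch of that step (using Pillow; the function name and file paths are placeholders, not part of the workflow):

```python
from PIL import Image

def stitch_sbs(left_path: str, right_path: str, out_path: str) -> None:
    """Concatenate the original (left) and generated (right) views into one SBS image."""
    left = Image.open(left_path).convert("RGB")
    right = Image.open(right_path).convert("RGB")
    # Both views must share the same dimensions for a valid stereo pair.
    if left.size != right.size:
        right = right.resize(left.size)
    sbs = Image.new("RGB", (left.width * 2, left.height))
    sbs.paste(left, (0, 0))            # original view on the left
    sbs.paste(right, (left.width, 0))  # generated view on the right
    sbs.save(out_path)

stitch_sbs("original.png", "generated.png", "stereo_sbs.png")
```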
Trigger word: Make stereogram
Around 1700 images (16 epochs) were used in the dataset, with mixed content.
Works better than my Flux Kontext version.
I recommend generating at least 2 image pairs, with a CLIP strength of 2 and a model strength of 1.5 (varies by image type and desired depth effect).
Lower-epoch versions and the workflow can be found on Hugging Face.
Description
First release for Qwen-Edit-2509
Comments
I'm curious if this works better than the Wan 2.2 lora, since Wan 2.2 has really good spatial processing. I'm excited to use this one, but I have to find or make a node that reverses the sides of the images so I can cross my eyes to see it.
Hi, what lora do you mean, alibaba-pai/Wan2.2-Fun-A14B-Control-Camera?
@cihog This one:
https://civitai.com/models/1988265/comfyui-ddd-2d-to-3d-stereoscopic-conversion-and-3d-stereoscopic-generation
I like that it lets you choose either cross-eyed or parallel view, so I can always choose the one that works for me.
@cihog I just tried yours, and it's very good. Surprisingly good result. The other Lora author was going to train a VR180 version, but they never released it. You can see their results on their user page. I wonder how hard it would be for Qwen to handle that.
@Jellai He seems to generate a video with camera movement and take the last frames of the video. My focus was on keeping the original image (left), with only one generated image (right) and as much detail as possible.
The difference in VR, with a headset, is huge.
180° could work with a good dataset (and a new LoRA); sadly, I currently don't have a good one.
@cihog Yes, the movement generation via video is a very smart way to do it; because Wan is so good at handling spatial things, moving the camera gives very high-quality results in the 3D shapes. But as you say, Qwen is definitely better with fine details. Both have great strengths.
Yes, that Wan 2.2 LoRA is kinda better. You can take an original 360° image and make it stereo for great side-by-side immersion. I don't think you need a separate 180° LoRA if your image is 180° to begin with. Chroma Flash Heun can generate 360° and 180° images as flats.
change the image stitching from right to left and then you can just cross your eyes.
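That swap is just exchanging the two halves of the SBS image, so no special node is strictly required. A minimal Pillow sketch (names and paths are placeholders), assuming a parallel-view SBS input:

```python
from PIL import Image

def swap_sbs_halves(sbs_path: str, out_path: str) -> None:
    """Swap the halves of an SBS image (parallel view -> cross-eyed view)."""
    sbs = Image.open(sbs_path)
    half = sbs.width // 2
    left = sbs.crop((0, 0, half, sbs.height))
    right = sbs.crop((half, 0, sbs.width, sbs.height))
    swapped = Image.new(sbs.mode, sbs.size)
    swapped.paste(right, (0, 0))    # right-eye view now on the left
    swapped.paste(left, (half, 0))  # left-eye view now on the right
    swapped.save(out_path)

swap_sbs_halves("stereo_sbs.png", "stereo_crosseye.png")
```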
This is a good attempt and kinda works. Your examples do have some stereo separation when doing the cross-eye test, but a lot of elements are broken. I'm actually impressed it works at all. I don't think diffusion models have enough world knowledge to do this with perfect accuracy, but I'm happy to be proven wrong. Maybe it just needs more training.
Hi, what parts are broken, the reflections/background? It really depends on the scene. Did you try generating some images?
@cihog I haven't used it myself yet; I'm going on your posted example images.
The woman in the bikini: there's an anomaly on her upper hand that stands out, her chest feels concave, and the gap at the middle bottom of the chair looks wrong in 3D.
Reflections, I'm aware, shift easily in 3D, so I'm not judging those; more the large structures.
The food bowl somewhat works, but then elements appear at different depths. That's a trickier scenario for depth, though.
@cihog The grasshopper one is solid. I can shift my gaze anywhere and things align. The Counter-Strike HUD is awesome, clearly in front, but the heads of the soldiers appear warped against the background.
@QualityControl I didn't notice the gap in the first picture. The images are not really cherry-picked. Maybe I'll upload some better ones after my test with Wan fails/works.
I think the problem can be fixed with synthetic data in the dataset (although generating the data is a pain). Maybe I'll do a v1.5.
@cihog It's already pretty impressive that it works at all, so you've done a good job so far. How do you source your dataset? It would probably be easy to grab frames from any SBS 3D movie, as those are already perfect: split them down the middle for the before and after training data.
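As a sketch of that splitting idea (it applies to any SBS footage, e.g. your own VR captures, not just movies), assuming OpenCV and an SBS video file; the paths, function name, and sampling interval are illustrative:

```python
import os
import cv2  # pip install opencv-python

def extract_sbs_pairs(video_path: str, out_dir: str, every_n: int = 120) -> None:
    """Grab every n-th frame of an SBS video and split it into left/right training pairs."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            half = frame.shape[1] // 2  # split down the middle
            cv2.imwrite(os.path.join(out_dir, f"{saved:05d}_left.png"), frame[:, :half])
            cv2.imwrite(os.path.join(out_dir, f"{saved:05d}_right.png"), frame[:, half:])
            saved += 1
        idx += 1
    cap.release()

extract_sbs_pairs("sbs_capture.mp4", "dataset_pairs")
```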
@QualityControl I have a camera, some license-free images, data I cherry-picked from gens, and renders from Blender. SBS movies sadly can't be used (sometimes the 3D is way too 'flat'), and I don't want to use data I don't 'own'. I'm thinking about using a VR mod for UE4/5 games and HL: Alyx.
@cihog a purist - ok. playing on hard mode. haha
Yeah, agreed about the flat SBS conversions, they aren't perfect. Only Avatar and Titanic 3D are perfect conversions; the rest do cut corners and look like flat cards sometimes.
The only risky part about training on 3D synthetic data might be that it learns the style too? But I guess if it's an edit LoRA, that won't be a problem.
@cihog Steam screenshots taken while in VR are stereo. Could work.
View them in VR. It works incredibly well, the depth is rich.
I'm going to be spending the next few days cross-eyed I think lol
Works great so far! I generated 30 already. Extremely fast and working well with lightx 4-step (4070 Ti Super, < 30 sec). Also works well with anime.
My experience with this LoRA:
• Sometimes spamming seeds does the trick.
• ~15% are perfect.
• Most (~70%) look great.
• ~15% don't make the cut.
Please keep it up if you have the resources. Yours is the best attempt so far. Thank you so much!!!
First of all, thank you for making this. I've noticed a ghosting effect, especially if the strength is above 1: the new image often keeps a transparent version of the most prominent subjects from the original image. I'll see if Qwen Image Edit 2511 solves this, but if you ever consider making a 2.0, it would maybe be something to address. I don't know if it's solvable or just something that happens at higher strengths, but given that you recommend those, it's unfortunate.
Thanks, could you provide an example image with the workflow (if you use a different workflow)?
I would like to congratulate the creator of this lora. It's an excellent piece of work. I tested it on some movie scenes and got natural 3D quality results.
This is amazing, it works really well. How do you guide the generation to produce either the left or the right view? At the moment it seems random, especially when run on a sequence of images extracted from a video: sometimes the left side is generated, sometimes the right, alternating randomly.
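A possible workaround until the side is controllable: compare both halves of the result against the original input and treat the half that differs more as the generated view, then swap when needed to keep a sequence consistent. A rough heuristic sketch (mean absolute pixel difference; the function name and paths are mine):

```python
import numpy as np
from PIL import Image

def generated_side(original_path: str, sbs_path: str) -> str:
    """Guess which half of an SBS result is the generated view:
    the half that differs more from the original input."""
    orig_img = Image.open(original_path).convert("RGB")
    sbs = Image.open(sbs_path).convert("RGB")
    half = sbs.width // 2
    halves = {
        "left": sbs.crop((0, 0, half, sbs.height)),
        "right": sbs.crop((half, 0, sbs.width, sbs.height)),
    }
    orig = np.asarray(orig_img, dtype=np.float32)
    diffs = {
        side: np.abs(np.asarray(img.resize(orig_img.size), dtype=np.float32) - orig).mean()
        for side, img in halves.items()
    }
    return max(diffs, key=diffs.get)  # the half with the larger difference was generated

print(generated_side("frame_0001.png", "frame_0001_sbs.png"))  # prints "left" or "right"
```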




