CivArchive
    X-Ray Vision (see through anything) - v1.1 ZIT
    NSFW

    Releases in the v1 series are incremental. Each major version is its own entity in this sequence. For instance, v1 used an invisible device, a vortex of sorts, while v2 works through a hand-held cell phone. v2 is unlikely to produce any of the specific content available in the v1 series.

    The v2 download file name includes a 3, which is the major training rendition. It doesn't match the version number because training rendition 2 was useless: it produced nothing usable after 20k steps. I had to manually generate new images, again, that were not too disparate from each other in order for training to converge. Composing these images took me 3 days and was a serious PITA; I used Illustrious, Z-Image, Qwen edit and Flux 2 Klein to help me composite something useful.
    - V1 -

    This LoRA is intended to produce images taken through a device, which we're calling X-Ray, where part of the clothing is missing so that undergarments and nudity are visible.

    These are some of the terms used in training, though they probably don't matter with the current architecture:

    X-Ray boobs
    X-Ray bra
    X-Ray panties
    X-Ray pelvis
    X-Ray underwear
    X-Ray nude
    Clothed
    Nude

    I ran it through the trainer up to around 8k steps, and for a couple of hours I tried to generate consistent images in some difficult scenarios, but the prompts were just not magical enough. I sent it back through for almost 20k steps and, after some testing, chose around 12k as the compromise between sledge hammer and getting something useful. I know I could have had an easier time, and wasted LESS time, had I just generated more training images, but the process wasn't so easy. Here's what I did...

    I created some nudes, half with Illustrious and half with Z-Image-Turbo.

    I then brought them over to Qwen Image Edit 2509 and prompted it to add clothing, both outer and inner (underwear).

    I took those into an image editor (lately I've gone back to Gimp, getting away from PS, Affinity wars) and put the dressed versions on top, masking out a this and a that to expose different sections, and saved images of those manipulations.

    I originally intended to make an X-Ray vision person whose eye beams would penetrate the clothing, but I hung onto that idea for 2 days and didn't get anywhere. My brain just didn't want to get that complicated, so I gave in and took a simpler approach.

    I kept the clothed and nudes in the training data, captioned them all with simple terms and then let JOY-C fiddle with it to give the context some weight.

    Using OneTrainer: I guess the most important factors are the defaults, plus rank 32, alpha 1, 2 repeats and a batch size of 1.
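
    For context on what rank 32 with alpha 1 implies: in the usual LoRA formulation the low-rank update is multiplied by alpha / rank before it is added to the base weights. The sketch below only works through that arithmetic. It assumes the trainer follows this common convention, which isn't confirmed here, and `lora_scale` is a made-up helper name, not part of OneTrainer.

```python
def lora_scale(alpha: float, rank: int) -> float:
    """Scaling factor applied to the low-rank update in the common LoRA convention.

    Assumption: the trainer scales the update by alpha / rank.
    """
    if rank <= 0:
        raise ValueError("rank must be a positive integer")
    return alpha / rank


# Settings quoted above: rank 32, alpha 1.
scale = lora_scale(alpha=1, rank=32)
print(scale)  # 0.03125
```

    Under that assumption, the learned update is scaled down to roughly 3% during training, which is one reason a low alpha tends to need more steps before the effect shows up.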

    - V2 -

    I grabbed some stock photo items for the cell phone and arms, then had Qwen edit extend the arm and add sleeves; I added 3 different sleeves. I used those with the other composited images. It took 3 days to amass 4 slightly different concepts to train with, and only 3 of them are repeatable without issues. The shower scene is the most prominent, followed by the hotel window, the cruise ship (kind of a throwaway) and then the public restroom scene (which seems to be a complete disaster but can make some funny pictures).

    After adjusting the dataset and running it from 19k to 33k steps, all I got was garbage, so I released the less damaged version, which is still rather limited. I'll be trying some other devices and situations in the future, though frankly I'm running out of situational ideas.

    I had to "strong arm" the cell phone into working, which prevents the LoRA from producing usable images in any situation beyond its known training set. The most reproducible scenarios, in order of prominence, are:

    Shower Scene - triggers: shower, silhouette, exposed, nude, shower curtain
    Hotel Room - triggers: hotel room, window, on bed, panties, bra, nude
    Cruise Ship - triggers: (mentioning items inside the ship will usually show up in the cell phone)
    Diner - triggers: waitress, nude, breasts, waitress uniform
    Public Restroom - (trained with multiple phones) triggers: 3 men, stall, public restroom

    Conclusion:

    The initial attempt at this concept, applied in v1.0 and v1.1, made it possible to prompt scenarios that were not in the initial dataset. For instance, the pool scenario was completely fabricated after the fact. Attempting this with v2.0 failed with everything I tried: it tries to combine trained-in material in order to achieve the goal.

    I can understand that the model is attempting to use the most available material to achieve the goal, which would suggest that I need fewer steps or a lower weight. This obvious solution doesn't seem to work as expected with ZIT, though that won't stop me from trying.

    I like the series and I'll add another twist besides the vortex and cell phone, maybe glasses as was suggested in the comments. I'm very open to ideas, so please don't hesitate.

    Description

    I sat on v1.1 for a while, though I did test it during the 1.0 release, and I'm starting to think that it is actually a lot more responsive.

    So, I give you Sledge Hammer, at step 19799.

    LORA
    ZImageTurbo

    Details

    Downloads
    399
    Platform
    CivitAI
    Platform Status
    Available
    Created
    1/4/2026
    Updated
    4/26/2026
    Deleted
    -

    Files