This LoRA is trained to generate images in the style of device bondage. It includes devices such as fucking machine, vibrator, dildo stick and also pipes, ropes, chains, belts, etc. It works best with models that already have a strong intuition of NSFW content as well as human anatomy.
As the data is labeled in some detail, the concept of device bondage can be transferred to other bondage or restraint techniques by constraining the prompt to the desired setting and removing, e.g., the dark wood background.
The Flux version in particular is very experimental. Please read the version info.
Description
The Flux version is very experimental. It messes up anatomy at least 3/4 of the time (I got SD3 flashbacks), and most of the time the quality of the results is not at Flux level but closer to SDXL or Pony. I still wanted to share this version for people to experiment with. I hope that future checkpoints of Flux can introduce a better understanding of both NSFW content and human anatomy, so I can build my LoRA on top of them.
I used the same 209-image dataset that I used for the Pony and V3.0 SDXL versions.
Comments (8)
FLUX version needs some work my dude. .001 learning rate is wild high for something like this. maybe try throwing another 0 in there, .0001 or .0002. you'll have to increase steps, especially w/ your image count and complexity of concept. This is just a theory of mine, but especially if you're not using a 1:1 aspect ratio on all images you might try batch size 1. with buckets, i think it pulls from different buckets when batch size >1 and lowers quality. Also, how are you tagging? FLUX likes sentences much better than tags, can't use same tags you would for SD. For anatomy, if your dataset has good anatomy it should pull in. Check my Not Another Facial LoRA, it does nudity great when using base FLUX because i trained it in with the LoRA. Just some thoughts, would love to have a good BDSM LoRA for FLUX!
You are absolutely right. That's why I tried to add as many disclaimers as possible about the experimental state that the Flux version is in. The images that I posted are cherry-picked out of ~50-100 generated images, and I would advise anyone against testing it on CivitAI, to save the buzz, and to only experiment on local hardware.
That being said, thank you for the suggestions on parameters. I tagged my dataset with BooruTagManager for Pony, and therefore all of the captions are basically just tags. As you mentioned that Flux likes sentences much better, do you know any good uncensored VLM that I can use to transform my tags into sentences without them being censored or wrongly included in the textual description? I tried Florence2 and some other big-company VLMs, but they were really bad at NSFW, let alone bondage. I would prefer not to caption all 209 images by hand again :'D
Also I cropped them all to 1024x1024 so bucketing is not required. And I used a learning rate of 1e-3, so I might experiment with 1e-4 in the future. But do you think the issue with wrong understanding of anatomy will be fixed by just throwing more steps into the training process? Or will this just be a waste of compute until an nsfw-finetuned base model is available?
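To put numbers on the 1e-3 versus 1e-4 question, here is a minimal sketch of the step arithmetic. Only the 209 images and the two learning rates come from this thread; `repeats`, `epochs`, and `batch_size` are hypothetical values, and keeping `lr * steps` constant is just a rough rule of thumb, not a guarantee of equivalent training.

```python
import math


def total_steps(num_images: int, repeats: int, epochs: int, batch_size: int) -> int:
    """Optimizer steps in a run: batches per epoch times number of epochs."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


def steps_for_new_lr(old_steps: int, old_lr: float, new_lr: float) -> int:
    """Rough heuristic (an assumption, not a law): keep lr * steps constant."""
    return round(old_steps * old_lr / new_lr)


if __name__ == "__main__":
    # 209 images is from the thread; repeats/epochs/batch size are made up.
    baseline = total_steps(num_images=209, repeats=1, epochs=10, batch_size=1)
    print("steps at 1e-3:", baseline)
    print("steps at 1e-4 under the heuristic:", steps_for_new_lr(baseline, 1e-3, 1e-4))
```

Under this heuristic a 10x lower learning rate means roughly 10x the steps, which is why lowering the LR and raising the step count go together. In practice the right step count is best found by saving intermediate checkpoints and comparing samples.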
@tutakanbeity i'm coming back to the top of this message because it became an article, I'm going to use bullet points.
1. Steps and LR: When I said you'd have to increase your steps I just meant that IF you were to lower the learning rate, you would have to increase the steps (all other parameters being the same besides LR) to get the same amount of "training" done. For even basic faces I would use no higher than 4e-4, and you are trying to get an entire fetish at 1e-3.
2. Flux Captioning: FLUX is... special. My non-professional opinion is to try NOT captioning in the traditional sense. Pick a phrase that encapsulates what you're trying to achieve with the LoRA so that it technically applies to every image in your dataset, and put that same caption on all of them. I'll use my Not Another Facial LoRA as an example. Mines a little more simple as a concept, but I think it will apply. I used "cum on her face" as the thing i was trying to encapsulate as a concept, and used that for every image. Keep in mind FLUX still has the entirety of the t5 text "database" to use so it's incredibly flexible. Now when I use "Nude slender woman with small breasts and cum on her face riding a motorcycle, wearing a helment, with her hands up in the air" as a prompt it still produces that in the image correctly (small breasts included) on base FLUX even though base FLUX can't do "nude woman" worth a sh*t. It still "understands" what it's being shown even if you don't tell it with tags, and still "understands" what it's being asked to do, even if it doesn't know how to do it.
3. Tagging: IMO using Booru hurt you as much or more than your LR. Pony (SDXL) and FLUX are like oil and water when it comes to language. Tags and FLUX don't really mix. FLUX understands how we speak, to a degree, and is designed to produce what we're asking for in normal language. It would be like asking a regular person to paint a picture using Booru tags. They would just look at you funny. If you're still getting issues with anatomy after moving to no tags, I think manual captioning using sentences would be worth your time, which leads me to my next point:
4. Time Economy: I would trade the time spent getting perfect 1024x1024 for lowering your batch size to 1 and bucketing, or manual captioning as described in point 3. I'm less confident on bucketing than I am on captioning and LR, but I wanted to mention it. This all depends on your training environment of course, as batch size 1 has its own drawbacks depending on your hardware. However, I've moved almost exclusively to batch size 1 recently and haven't had to retrain a model since.
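The "same caption on every image" idea from point 2 can be sketched as a small script. It assumes the trainer reads captions from a `.txt` file sitting next to each image with the same base name, which is a common convention but should be checked against the trainer actually in use. The function name, the folder path, and the caption placeholder are hypothetical.

```python
from pathlib import Path

# Image extensions to caption; adjust to match the dataset (assumption).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def write_uniform_captions(dataset_dir: str, caption: str) -> int:
    """Write the same caption into a .txt sidecar next to every image.

    Returns the number of caption files written. Existing .txt files with
    the same base name are overwritten.
    """
    count = 0
    # sorted() materializes the listing before any new files are created.
    for image_path in sorted(Path(dataset_dir).iterdir()):
        if image_path.suffix.lower() in IMAGE_SUFFIXES:
            image_path.with_suffix(".txt").write_text(caption + "\n", encoding="utf-8")
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical folder and phrase; replace both before running.
    written = write_uniform_captions("./dataset", "<your concept phrase>")
    print(f"wrote {written} caption files")
```

Since the script overwrites existing sidecar files, it is worth backing up the tag-based captions first so the Pony/SDXL setup stays reproducible.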
Another thing you might consider is doing a poor man's merge with Comfy to try and get better anatomy. Full disclosure: I've never tried this, as I haven't had a good use case yet. I just wanted you to know this is available. What you're trying to do would be a perfect use case IMO. Check out this Reddit post https://www.reddit.com/r/StableDiffusion/comments/1ex63x7/easy_merge_of_flux_lora_into_a_unet_merge/. Couple of things to try: first would be to merge something like FLUXTASTIC v3 LoRA (shouts @PromptoAI) with base FLUX and train your LoRA with the resulting UNet. That should give your LoRA a better understanding of anatomy. Additionally, you could train on base and merge your LoRA with something like FLUXTASTIC (given that you have permission; I think you can check the LoRA page to determine that). Either of those might yield the results you're looking for.
Just my 2 cents, let me know what you think :)
@TipsyThrowaway14 Thank you very much for the detailed response! I will definitely look into the suggested adjustments and once I receive more batch time from my HPC operator, I'll start to experiment with another training run :)
@tutakanbeity Joy_caption. This is currently a good, uncensored natural-language captioning program. I hope it can be helpful on your training journey.
Flux losing correct anatomy (details like fingers, hands and feet, but even big stuff like arms and legs) is an issue with LoRA training.
I was also bitten by it, but retraining with different training settings then worked.
So I'd have a look at the hyperparameters to see whether that's the issue.
And regularization could probably also help to get better results.
Thank you for the input :)
Do you know what hyperparameters might reduce the catastrophic interference? Or what other training settings improved your retraining? Also, what type of regularization images do you recommend? For people it's quite easy to use other men/women. But for a concept like BDSM I'm a bit lost.
@tutakanbeity the hyperparameter to check is basically the learning rate, either directly or indirectly through the schedule, like cosine, cosine with restarts, ...
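For reference, the two schedules named above can be written out in a few lines. This is a simplified sketch without warmup, and the function names, the `num_cycles` default, and the step counts are illustrative assumptions; actual trainers ship their own implementations, whose details may differ.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Cosine decay from base_lr at step 0 down to min_lr at total_steps."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def cosine_with_restarts_lr(step: int, total_steps: int, base_lr: float,
                            num_cycles: int = 3, min_lr: float = 0.0) -> float:
    """Cosine decay that jumps back to base_lr at the start of each cycle."""
    if step >= total_steps:
        return min_lr
    cycle_len = total_steps / num_cycles
    progress = (step % cycle_len) / cycle_len  # position inside the current cycle
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Hypothetical run: 3000 steps with a peak learning rate of 1e-4.
    for s in (0, 500, 999, 1000, 1500, 2999):
        print(s, f"{cosine_lr(s, 3000, 1e-4):.2e}",
              f"{cosine_with_restarts_lr(s, 3000, 1e-4):.2e}")
```

Both schedules spend much of the run well below the peak rate, so the configured learning rate overstates the average one. That is the "indirect" effect on the learning rate mentioned above.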
For character LoRAs, I auto-caption my training images, generate images with Flux from those auto captions, and use the generated images as regularization. Then I manually change the auto captions on the training images to the real captions. This has worked, but I have no research (and especially no scientific research) on whether that's the best or at least a good approach. But as I wrote: it worked.
For a concept LoRA, I'm with you that this probably doesn't work as it is. But you might adapt it, and/or add random good images of humans and their anatomy and throw those in as regularization.