Not everyone agrees that size doesn't matter. Would you rather date someone with a paddle board or a yacht?!? but I digress...
Fisting
When a woman loves a woman... they show it to each other with an iron fist! I couldn't find videos showing that though so I settled for regular fists.
I tried training this one with little more than a key phrase and generalization tokens, which helped me understand a bit more about how training works.
Minor seed hunting, although it's pretty darn stable and the mutations tend to slip under the radar at first glance. The only other thing I've seen is the fisting itself not quite happening, but this is rare and probably fixable with prompting. I haven't prompt-workshopped this one too hard yet.
Versions
WAN2.2
The WAN2.2 version has a solid "from behind" modifier you can put right after the key phrase. I don't have it here because it interferes with facial expressions, putting heads on backwards if you prompt both "from behind" and an expression. One of the reasons I've moved to WAN from HunyuanVideo is the lack of mutations. I really don't like mutations! I've seen things... things no man should see lol
Wildcard Prompt
A {Russian|French|German|South Korean|Dutch|Swiss|American|Swedish|Austrian|Ukrainian|Portuguese} woman with {blonde|black|brown|dirty-blonde|dark red} hair is being vaginally fisted by another person {from behind|},
# Don't use from behind AND prompt to see the face...
She is sitting on a {red|blue|yellow|white|black|brown|teal|pink} {chair|couch|bed|dentist chair} {with {large|medium-sized|small} breasts|wearing a {red|blue|yellow|white|black|brown|teal|pink} {shirt|crop top|tank top|bra|jacket|sports bra}} and a shaved vagina.
{the person fisting her is rubbing her clitoris.|}
She is {happy and in love|moaning from pleasure|gritting her teeth and furrowing her brow|closing her eyes with her mouth wide open|extremely surprised and shocked|giggling and laughing|smiling seductively}.
The person fisting her is a {man|woman|doctor|nurse|firefighter} and is thrusting her entire hand back and forth inside her vagina.
{bright|soft} even lighting, {high-angle|} view
Three
Removed the slower clips, so motion is improved. A caption clean up seems to be giving better stability as well. Enjoy!
Two
Blurred faces, 24fps, 50 frames, 5e-5, dropout=0.1 and I can't remember the rest because I trained this weeks ago and only now decided to get showcase materials and release. This one is more solid than version one for sure.
Wildcard Prompt
A {slim|fit|sporty|cute|skinny|beautiful} {Russian|French|Swedish|Swiss|Latina|Austrian|German|Dutch|English|Irish|Portuguese|Indian|Russian|Swiss|Swedish|Danish|Italian} woman reclining with her legs spread as a person on the left is rhythmically fisting her vagina. The motion is {quick|steady|slow and rotating inside}.
{Wide angle view showing the {male|female|Doctor} partner|An arm extends from the {bottom left|left}}.
She has long {blonde|light blonde|dirty blonde|dark|black|brown|light brown} hair{ in a ponytail| tied back| in pigtails}.
She {has {large|medium-sized|small} breasts|{is wearing a bra|is wearing a nice shirt|is wearing a colorful tube top} covering her nipples}.
#The other person is a {lover|Doctor|friend}. {caressing her chest|wearing a {yellow|red|blue|gold|silver|pink|teal|orange|purple|rainbow|multi-color} {shirt|bra|tube top|tank top}|wearing a {yellow|red|blue|gold|silver|pink|teal|orange|purple} headband}
She is {screaming from pleasure|happy and smiling|surprised|closing her eyes} with her hands {in her hair|on her chest|at her sides|hidden|holding her own legs}, {sitting in a gynecology chair|sitting on a chair|lying back on a bed|sitting on a couch}.
Lighting is {bright|soft} and even
One
The original attempt, works pretty well but surpassed by Two.
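For anyone who wants to preview what the wildcard prompts above can produce, here is a minimal sketch of an expander. It assumes the syntax works the way the templates suggest: {a|b|c} picks one option at random, groups can nest, an empty option such as {x|} means "maybe nothing", and lines starting with # are comments. The function name expand and the neutral test template are made up for illustration; the wildcard node or extension you actually use may behave differently.

```python
import random


def expand(template: str, rng: random.Random) -> str:
    """Expand {a|b|c} groups (innermost first) and drop '#' comment lines."""
    # Assumption: a line whose first non-blank character is '#' is a comment.
    lines = [ln for ln in template.splitlines() if not ln.lstrip().startswith("#")]
    text = "\n".join(lines)
    while True:
        close = text.find("}")
        if close == -1:
            break  # no groups left to expand
        start = text.rfind("{", 0, close)
        if start == -1:
            raise ValueError("closing brace without a matching opening brace")
        # The first '}' and the nearest '{' before it always bound an innermost group.
        options = text[start + 1:close].split("|")
        text = text[:start] + rng.choice(options) + text[close + 1:]
    # Empty options can leave double spaces behind, so collapse all whitespace.
    return " ".join(text.split())


if __name__ == "__main__":
    demo = "A {red|blue} {chair|couch{ with pillows|}}\n# a comment line\n{bright|soft} lighting"
    rng = random.Random()
    for _ in range(3):
        print(expand(demo, rng))
```

Passing a seeded random.Random makes a run repeatable, which is handy when you are hunting for the combination that caused a bad generation.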
Training Tips
A LORA is basically a small add-on file that adjusts a model's "weights". The weights represent pathways through the model's architecture in terms of how tokens call up visuals. The training process reads your tokens and scans the visuals in order to nudge the weights so that they more closely reproduce what your training material shows.
If you never caption for hair color, no weights involving hair or color get written into the LORA. Those weights were never touched during training, so the hair colors in your dataset either get baked in or reduce your ability to prompt for that kind of thing in the base model.
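To make the description above a bit more concrete, here is a rough sketch of the standard LoRA idea: the base weight matrix W stays frozen and the LORA stores two small matrices, B and A, whose product is added on top, scaled by alpha / rank. This is the general technique, not the exact code of any particular trainer. The helper names and the tiny 3x3 numbers are made up for illustration, and the 4096 and 16 in the size comparison are just plausible example values.

```python
def matmul(a, b):
    """Plain-Python matrix multiply for small lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(w, b, a, alpha):
    """Return W + (alpha / rank) * (B @ A) without modifying W."""
    rank = len(a)  # a is rank x in_features, b is out_features x rank
    scale = alpha / rank
    delta = matmul(b, a)
    return [[wv + scale * dv for wv, dv in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Frozen base weights (identity matrix, purely for illustration).
w = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
b = [[0.5], [0.0], [-0.5]]  # out_features x rank (3 x 1)
a = [[1.0, 2.0, 0.0]]       # rank x in_features (1 x 3)

merged = apply_lora(w, b, a, alpha=1.0)

# Why the file is small: one 4096 x 4096 layer versus a rank-16 pair for it.
full_params = 4096 * 4096
lora_params = 16 * (4096 + 4096)

if __name__ == "__main__":
    print(merged)
    print(full_params, lora_params)
```

The size comparison at the end is the reason a LORA can only nudge the base model rather than rebuild it: for that one example layer it stores roughly 131 thousand numbers against almost 17 million.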
If all your models are fat, but you never caption for body type, you're not going to be able to prompt for fit woman, they'll all be fat. The LORA will assume that's what the concept is, fat women, and no weights related to body type will be touched or touchable. So... you caption 'fat woman' just a couple times to get those weights involved in training and thus involved in prompting. That way, the LORA/Model pair have something to work with.
If you don't plan on prompting for things you see, you can generally not mention them at all, but then they might show up unintentionally. This is why a varied dataset is important, because if visually a few videos show the exact same model or background, the training process will lock onto those features and you'll have trouble escaping them.
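One way to sanity-check the captioning advice above is to count how often each thing you care about actually shows up in your captions. The sketch below is a hypothetical helper, not part of any training tool: it takes captions as plain strings and does simple case-insensitive substring matching, so it only sees what you wrote, not what is actually in the videos. The example captions and terms are invented.

```python
from collections import Counter


def caption_coverage(captions, terms):
    """Map each term to (count, fraction) of captions that mention it."""
    counts = Counter()
    for caption in captions:
        lowered = caption.lower()
        for term in terms:
            if term.lower() in lowered:
                counts[term] += 1
    total = len(captions)
    return {term: (counts[term], counts[term] / total if total else 0.0)
            for term in terms}


# Invented example captions, one string per training clip.
captions = [
    "a woman with blonde hair on a red couch",
    "a woman on a bed",
    "a woman with black hair on a red couch",
    "a woman on a red couch",
]
coverage = caption_coverage(captions, ["blonde hair", "black hair", "red couch", "bed"])

if __name__ == "__main__":
    for term, (count, fraction) in coverage.items():
        print(f"{term}: {count} captions ({fraction:.0%})")
```

A term that never appears is something you won't be able to steer with the prompt, and a term that appears in nearly every caption hints that the dataset may not be varied enough in that respect.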
If your LORA involves poses, you want a dataset that does not deviate much, otherwise you get body horrors and mutations. No amount of captioning is going to let a small file that modifies a massive file fix anatomy.
Disclaimer
Apply lots of lube and go slow.