Fisting - WAN22-HIGH

NSFW

Not everyone agrees that size doesn't matter. Would you rather date someone with a paddle board or a yacht?!? But I digress...

Fisting

When a woman loves a woman... they show it to each other with an iron fist! I couldn't find videos showing that though so I settled for regular fists.

I tried training this one with little more than a key phrase and generalization tokens, which helped me understand a bit more about how training works.

Minor seed hunting is needed, although it's pretty darn stable and the mutations tend to slip under the radar at first glance. The only other thing I've seen is the fisting itself not quite happening, but this is rare and probably fixable with prompting. I haven't prompt-workshopped this one too hard yet.

Versions

WAN22

The WAN2.2 version has a solid "from behind" modifier you can put right after the key phrase. I don't have it here because it interferes with facial expressions, putting heads on backwards if you prompt both "from behind" and an expression. One of the reasons I've moved to WAN from HunyuanVideo is the lack of mutations. I really don't like mutations! I've seen things... things no man should see lol

Wildcard Prompt

A {Russian|French|German|South Korean|Dutch|Swiss|American|Swedish|Austrian|Ukrainian|Portuguese} woman with {blonde|black|brown|dirty-blonde|dark red} hair is being vaginally fisted by another person {from behind|}, 
# Don't use from behind AND prompt to see the face...

She is sitting on a {red|blue|yellow|white|black|brown|teal|pink} {chair|couch|bed|dentist chair} {with {large|medium-sized|small} breasts|wearing a {red|blue|yellow|white|black|brown|teal|pink} {shirt|crop top|tank top|bra|jacket|sports bra}} and a shaved vagina.
{the person fisting her is rubbing her clitoris.|}

She is {happy and in love|moaning from pleasure|gritting her teeth and furrowing her brow|closing her eyes with her mouth wide open|extremely surprised and shocked|giggling and laughing|smiling seductively}.

The person fisting her is a {man|woman|doctor|nurse|firefighter} and is thrusting her entire hand back and forth inside her vagina.

{bright|soft} even lighting, {high-angle|} view
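The wildcard syntax above picks one option from each `{a|b|c}` group, groups can be nested, and an empty option (as in `{high-angle|}`) emits nothing. Here is a minimal Python sketch of an expander for that syntax. It assumes these semantics match whatever wildcard node or extension you use, and it assumes lines starting with `#` are comments to drop. The function name `expand_wildcards` is made up for illustration, and the demo template is a neutral stand-in.

```python
import random
import re

# Matches an innermost {...} group: braces that contain no other braces.
_INNERMOST = re.compile(r"\{([^{}]*)\}")


def expand_wildcards(template, rng=None):
    """Resolve every {a|b|c} group by picking one option at random.

    Assumption: lines starting with '#' are comments and are removed.
    """
    rng = rng or random.Random()
    lines = [ln for ln in template.splitlines()
             if not ln.lstrip().startswith("#")]
    text = "\n".join(lines)
    # Resolve innermost groups repeatedly so nested groups collapse outward.
    while True:
        text, replaced = _INNERMOST.subn(
            lambda m: rng.choice(m.group(1).split("|")), text)
        if replaced == 0:
            break
    # Empty options can leave doubled spaces behind; collapse them.
    return re.sub(r"[ \t]{2,}", " ", text)


if __name__ == "__main__":
    demo = "a {red|blue} {chair|couch{ with pillows|}}\n{soft|bright} light"
    print(expand_wildcards(demo, random.Random(42)))
```

Passing a seeded `random.Random` makes a given expansion reproducible, which helps when you want to revisit a prompt that produced a good result.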

Three

Removed the slower clips, so motion is improved. A caption cleanup seems to be giving better stability as well. Enjoy!

Two

Blurred faces, 24fps, 50 frames, 5e-5, dropout=0.1 and I can't remember the rest because I trained this weeks ago and only now decided to get showcase materials and release. This one is more solid than version one for sure.

Wildcard Prompt

A {slim|fit|sporty|cute|skinny|beautiful} {Russian|French|Swedish|Swiss|Latina|Austrian|German|Dutch|English|Irish|Portuguese|Indian|Russian|Swiss|Swedish|Danish|Italian} woman reclining with her legs spread as a person on the left is rhythmically fisting her vagina. The motion is {quick|steady|slow and rotating inside}.

{Wide angle view showing the {male|female|Doctor} partner|An arm extends from the {bottom left|left}}.

She has long {blonde|light blonde|dirty blonde|dark|black|brown|light brown} hair{ in a ponytail| tied back| in pigtails}.

She {has {large|medium-sized|small} breasts|{is wearing a bra|is wearing a nice shirt|is wearing a colorful tube top} covering her nipples}.

#The other person is a {lover|Doctor|friend}. {caressing her chest|wearing a {yellow|red|blue|gold|silver|pink|teal|orange|purple|rainbow|multi-color} {shirt|bra|tube top|tank top}|wearing a {yellow|red|blue|gold|silver|pink|teal|orange|purple} headband}

She is {screaming from pleasure|happy and smiling|surprised|closing her eyes} with her hands {in her hair|on her chest|at her sides|hidden|holding her own legs}, {sitting in a gynecology chair|sitting on a chair|lying back on a bed|sitting on a couch}.

Lighting is {bright|soft} and even

One

The original attempt. It works pretty well but has been surpassed by Two.

Training Tips

A LORA is basically a small add-on file that adjusts a subset of the base model's "weights". The weights represent pathways in the model's architecture that determine how tokens call up visuals. The training process reads your caption tokens and scans the visuals in order to nudge those weights so that the output more closely reproduces what your training material shows.
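To make the "small file nudging a big model" idea concrete, here is a rough Python sketch of the standard LoRA formulation: a layer's weight matrix W is left untouched and a low-rank product B·A, scaled by some factor, is added on top. The layer size, rank, and scale below are hypothetical values picked for illustration. They are not the settings used for this LORA, and the helper names are made up.

```python
def lora_param_count(d_out, d_in, rank):
    """Weights stored in the two low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def apply_lora(W, B, A, scale=1.0):
    """Return W + scale * (B @ A), using plain nested lists."""
    rank = len(A)
    return [
        [W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(rank))
         for j in range(len(W[0]))]
        for i in range(len(W))
    ]


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    d_out, d_in, rank = 4096, 4096, 32
    full = d_out * d_in
    low = lora_param_count(d_out, d_in, rank)
    print(f"full layer: {full:,} weights, LoRA factors: {low:,} ({low / full:.2%})")

    # Tiny worked example: a rank-1 update only changes the first row here.
    W = [[1, 2], [3, 4]]
    B = [[1], [0]]
    A = [[10, 20]]
    print(apply_lora(W, B, A, scale=0.5))
```

The parameter count is the point: the factors hold a small fraction of the weights in the layer they modify, which is why a LORA can steer the base model but cannot rebuild what it already knows.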

If you never caption for hair color, no weights involving hair or color get written into the LORA. Training never touches those weights, so the hair color in your dataset either gets baked in or reduces your ability to prompt for that kind of thing in the base model.

If all your models are fat, but you never caption for body type, you're not going to be able to prompt for a fit woman; they'll all be fat. The LORA will assume that's what the concept is, fat women, and no weights related to body type will be touched or touchable. So... you caption 'fat woman' just a couple of times to get those weights involved in training and thus involved in prompting. That way, the LORA/Model pair have something to work with.

If you don't plan on prompting for things you see, you can generally not mention them at all, but then they might show up unintentionally. This is why a varied dataset is important, because if visually a few videos show the exact same model or background, the training process will lock onto those features and you'll have trouble escaping them.
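One way to act on the captioning advice above is to audit your captions before training and see which attributes are ever mentioned. The sketch below is a hypothetical helper, not part of any trainer. It counts how many captions mention each attribute via crude substring matching, and the attribute names and keywords are placeholders you would replace with your own.

```python
def caption_coverage(captions, attributes):
    """Count, per attribute, how many captions mention at least one keyword.

    Matching is plain lowercase substring search, so short keywords can
    over-match (for example "fit" inside "outfit").
    """
    counts = {name: 0 for name in attributes}
    for caption in captions:
        lowered = caption.lower()
        for name, keywords in attributes.items():
            if any(kw.lower() in lowered for kw in keywords):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Placeholder captions and keyword lists, for illustration only.
    captions = [
        "A woman with blonde hair on a red couch",
        "A woman on a couch",
        "Black hair, soft lighting",
    ]
    attributes = {
        "hair color": ["blonde", "black hair"],
        "lighting": ["lighting"],
        "body type": ["slim", "athletic"],
    }
    for name, n in caption_coverage(captions, attributes).items():
        print(f"{name}: mentioned in {n} of {len(captions)} captions")
```

An attribute that shows up in zero captions is one you should expect to get baked in, so either caption it a few times or make sure the dataset varies enough that it doesn't matter.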

If your LORA involves poses, you want a dataset that does not deviate much, otherwise you get body horrors and mutations. No amount of captioning is going to let a small file that modifies a massive file fix anatomy.

Disclaimer

Apply lots of lube and go slow.

Comments (5)

GlowingGuardianGirl · Mar 10, 2026 · 2 reactions

*Where's my watch* moment

az420 (Author) · Mar 10, 2026

keep diggin' you'll find it!

playtime_ai_ · Mar 11, 2026 · 5 reactions

Your captioning tip is 100% correct and it is also the basic element of captioning for clip based models.... Caption the things that you want the ability to change. Everything else will get baked into the model.

If your model involves multiple poses or positions, though... All you have to do is properly and consistently caption them to avoid body horrors. It's basically the opposite theory. When you have a lot of variety, prompt the different varieties in a consistent way and you can absolutely train a model like this one with multiple positions and camera angles.

chesthole · Mar 17, 2026

How can you see his captioning?

playtime_ai_ · Mar 17, 2026

@chesthole I read the model description. They talk about it.

LORA

Wan Video 2.2 T2V-A14B

by az420
