Deepthroat, Blowjob - Wan 2.X I2V & T2V

Deepthroat, Blowjob - Wan 2.X I2V & T2V - 2.1 - v1.0 I2V

NSFW

Wan 2.2

Recommended LoRa weight: 1
See version details for other parameters.

I2V

This version uses the same datasets as the 2.1 version, with added datasets about penises.

I don't think the model is perfect yet, but it seems improved compared to the 2.1 version:

Better generalisation (in my opinion)

Better saliva supports (though not on hardcore scenes, I wish to train a version for this)
Way better subject creation (adding a penis, a girl, a man to an image). You shouldn't have weird heads holding an half-penis in the mouth appearing now... At least not much.
Penises behaves now more naturally when handled (they were often too static to my taste)

Known issues:

Nude woman added in the video will very probably have a penis, reducing Low Noise weight could works sometimes, but I'll need to retrain it to differentiate penises & vajayjay (it's an edge case though, so not that important)

T2V

This version have an improved datasets compared to I2V, with added datasets about vulvas so we can infer video without supplementary LoRas (and avoid woman with penises like in the I2V).

It's an imperfect model to be honest, side views are great, but other use cases can be picky on the prompt (impossible position won't have good amplitude). I can't spend more time on it for the moment, so let's hope it's enough!
Though, most of the concept is here, I'm curious to see how it behaves with your prompts, a v2 will maybe be needed depending on your feedback.

Known issues:

With Kijai's 4-Step Lightning LoRa, the Low Noise model can be pretty opinionated about the woman's look, and difficult to control. I recommend not using the Lightning LoRa on Low Noise in this kind of use case.
In use cases where the woman position is a bit awkward, where it doesn't seems easy to suck on a penis or the prompt is a bit too precise & cluttered, the amplitude won't be very good. It can be forced with prompt like she leans on the penis and increased weight on the High Noise LoRa, but be careful on prompting.
Prompting needs to be explicit: prompt adherence is pretty good, so the LoRa won't do the work for you.
Penises are not always perfect, and you could need a side LoRa to improve their look.
Anime doesn't have a great support (well Wan is not good at it either so...)

Low Noise ? High Noise ?

The High Noise LoRa contains most of the deepthroat movement, and basic details. The Low Noise refines the movement and improves overall details of the action (penis, testicules, penis/mouth interaction, saliva, micro-movements, etc).
It means that the Low Noise LoRa is not mandatory, especially in use cases where the deepthroat is not done on a penis. So don't forget to try without it!

Lightning LoRa ?

This LoRa behaves very well with Kijai's 4-Step Lightning LoRa (CFG: 1/2, Shift: 7/8)
I even consider results better with it... But that's maybe because I'm not so good at inference on raw Wan 2.2 yet.

All in all, please let me know of any weird quirks about this version, I've tested it quite a lot, but who knows.

Enjoy !
(Don't use this to do illegal sh*ts, I condemn them firmly)

Trigger words (at the start of the prompt):

blowjob, deepthroat;

Helping trigger words

For POV (not mandatory but can help)

blowjob, deepthroat, pov;

For upside down deepthroat (not mandatory but can help):

blowjob, deepthroat, reverse;

For on lap deepthroat (can help, but this use case is under trained so I can't guarantee a good inference)

blowjob, deepthroat, on-lap;

For 69 (not mandatory, but can help):

blowjob, deepthroat, sixnine;

About penises, you can have moderate control on it with those keywords:

penis, dick, erect, flaccid, the foreskin is pulled back, the foreskin is pulled up, glans is visible, glans is hidden, testicules, ...

Helping phrases:

a video on a woman performing a deepthroat blowjob. [...]

A man is moving his penis back and forth in the woman mouth and throat

[...] the penis is going deep into her mouth and throat

The woman is bobbing her head back and forth while sucking the penis

[...] The woman makes the penis wet with translucent saliva 
[...] a bit of foamy translucent saliva accumulates on the penis

(careful with the foamy saliva, it can get weird quick)

To help trigger throat bulging (still unstable, couldn't make it works correctly yet):

Her throat is (prominently|slightly) bulging as the man's penis penetrates her mouth and throat

Example prompt:

blowjob, deepthroat; a video on a woman with fair skin performing a deepthroat blowjob. She kneeling in front of a man. The man is moving his erect penis back and forth in the woman mouth and throat, while she is sucking the man's penis, the penis is going deep into her mouth and throat. Her throat is prominently bulging as the man's penis penetrates her mouth and throat. She makes the penis wet with translucent saliva.

Wan 2.1

Recommended LoRa weight (480p): 1

The weight can be reduced to limit amplitude, or limit weird behaviours. But 1 should be fine most of the time.

Can be used alone.
Enjoy!

Trigger words (at the start of the prompt):

blowjob, deepthroat;

Helping trigger words

For POV (not mandatory but can help)

blowjob, deepthroat, pov;

For upside down deepthroat (not mandatory but can help):

blowjob, deepthroat, reverse;

For on lap deepthroat (can help, but this use case is under trained so I can't guarantee a good inference)

blowjob, deepthroat, on-lap;

For 69 (not mandatory, but can help):

blowjob, deepthroat, sixnine;

Helping phrases:

a video on a woman performing a deepthroat blowjob. [...]

A man is moving his penis back and forth in the woman mouth and throat

[...] the penis is going deep into her mouth and throat

The woman is bobbing her head back and forth while sucking the penis

To help trigger throat bulging (unstable at the moment, needs more work):

Her throat is (prominently|slightly) bulging as the man's penis penetrates her mouth and throat

Example prompt:

blowjob, deepthroat; a video on a woman with fair skin performing a deepthroat blowjob. She kneeling in front of a man. The man is moving his penis back and forth in the woman mouth and throat, while she is sucking the man's penis, the penis is going deep into her mouth and throat. Her throat is prominently bulging as the man's penis penetrates her mouth and throat.

Training

This is my first LoRa training, any feedback is welcome. I'm pretty happy with the final result, but there are still details to work on (throat bulging, stability in reverse, handle spit better, ...).

Trained on Wan 2.1 480p I2V in fp8 scaled, which is where it triggers best. It works less well with 720p I2V (no amplitude, but the concept is still there).

The dataset is composed of 60 videos at 480x270, down-sampled from 1920x1080.
It was trained for 60 epochs at a 2e-4 learning rate, with LoRA+ at 4, on a 5090 (~6h).
Network dimension is set to 24.

For those who want to know how I trained it, I explain it a bit in this CivitAI discussion. I'm no expert, but it can still help!

Description

Initial training.
Supports most angles and use cases like POV, side view, below, upside down/reverse, 69, on lap (unstable), throat bulging (unstable).

Weight Range: 0.7 - 1
Recommended Weight: 1

Comments (70)

CyberAImaniaApr 22, 2025· 4 reactions

What was your:

gradient_accumulation_steps = ??

blocks_to_swap = ??

num_repeats = ??

I'm surprised that, training on the I2V model with fairly high-resolution 480px clips, you managed to fit in 32GB of VRAM with only 6 hours of training time?

What was the iteration speed? XXXXX???? samples/sec

Share your toml configs data and training, please.

JeeFJ

Author

Apr 22, 2025· 9 reactions

Hey there! Happy to share!
Well, during training, it was pretty tight in terms of VRAM (31GB/32GB), but I managed to do it with these configurations:

For the toml (no sub lengths, I curated it manually to have only the movements I wanted to train):
---
[general]
resolution = [480, 270]
batch_size = 1
enable_bucket = true
bucket_no_upscale = false

[[datasets]]
video_jsonl_file = "./use-case/config.jsonl"
cache_directory = "./cache/use-case"
frame_extraction = "full"
max_frames = 65
source_fps = 16.0

[[datasets]]
# <same for other datasets>
---

Videos were grouped into 7 sub-datasets (max 20 videos per dataset, max 4s each, each representing a use case), with no repeats, and everything was pre-cached on an SSD before training.
It is notable that the pre-caching downscaled a bit to 464x272 (to match a good number for Wan).
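The change from 480x270 to 464x272 is consistent with aspect-ratio bucketing that snaps both sides to a multiple of 16 while keeping the pixel count within the configured resolution. The sketch below only illustrates that idea under those assumptions; it is not musubi-tuner's actual code, and `nearest_bucket` and its `step=16` default are hypothetical names chosen for the example.

```python
def nearest_bucket(width: int, height: int, step: int = 16):
    """Pick a bucket (w, h), both multiples of `step`, whose area does not
    exceed width * height and whose aspect ratio is closest to the source.

    Illustrative only: the step size and selection rule are assumptions.
    """
    target_area = width * height
    target_ratio = width / height
    best = None
    best_err = float("inf")
    # Walk through every candidate width that is a multiple of `step`.
    for w in range(step, target_area // step + 1, step):
        # Largest height (multiple of `step`) that keeps the area in budget.
        h = (target_area // w) // step * step
        if h < step:
            break  # wider candidates only get flatter, so stop here
        err = abs(w / h - target_ratio)
        if err < best_err:
            best, best_err = (w, h), err
    return best


print(nearest_bucket(480, 270))  # prints (464, 272)
```

Under these assumptions, 480x272 is rejected because it has more pixels than 480x270, which leaves 464x272 as the closest match to 16:9.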

Train config:
---
accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 wan_train_network.py \
--task i2v-14B \
--dit ./models/diffusion_models/wan/wan2.1_i2v_480p_14B_bf16.safetensors \
--dataset_config ./deepthroat.toml --sdpa --mixed_precision bf16 --fp8_base --fp8_scaled \
--optimizer_type adamw8bit --learning_rate 2e-4 --gradient_checkpointing \
--max_data_loader_n_workers 1 \
--blocks_to_swap 0 --network_args loraplus_lr_ratio=4 \
--network_module networks.lora_wan --network_dim 24 \
--timestep_sampling shift --discrete_flow_shift 4.0 \
--max_train_epochs 60 --save_every_n_epochs 1 --seed 989 \
--logging_dir ./logs/ \
--output_dir ./deepthroat/ --output_name jfj-deepthroat
---

Compared with the default musubi-tuner configuration, I removed --persistent_data_loader_workers (frees up VRAM for not much loss in time) and used --fp8_base & --fp8_scaled.
No block swap was needed, so I can achieve ~6.1s/it.
Reducing --network_dim helped too, I think (and it was great for a concept LoRA).

Hope it helps!

CyberAImaniaApr 22, 2025· 3 reactions

@JeeFJ Hey man, seriously, thanks a ton for sharing all that detail! Really appreciate it.

Okay, so I see you're running the training directly with wan_train_network.py and not through something like the standard Diffusers training scripts, which is what I'm doing. That clears up some things!

Looking at your numbers – ~6 hours training time and 60 epochs, plus the ~6.1s/it speed you mentioned – doing some quick math, that puts the total steps around 3500, give or take? Is that about right?

'Cause honestly, your LoRA turned out great, and I'm just trying to figure out how you nailed it so well with what seems like a pretty low step count compared to what I'd expect, especially for video. Maybe I'm missing something in how wan_train_network.py handles things or how effective those fp8 settings are?

Anyway, killer results! Just curious about the magic behind getting it done so quickly and effectively.

Cheers!

JeeFJ

Author

Apr 22, 2025· 4 reactions

@CyberAImania 3600 steps to be precise. The iteration time decreased with time (started around 6.32s/it and got stable at 6.07s/it starting epoch 28).
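For readers who want to check the arithmetic, here is a small sketch of how 3600 steps falls out of the numbers given in this thread (60 videos, 60 epochs, batch size 1, no repeats). The `total_steps` helper is hypothetical, and gradient accumulation is assumed to be 1 because no such option appears in the command above.

```python
import math


def total_steps(num_videos, epochs, batch_size=1, num_repeats=1, grad_accum=1):
    """Optimizer steps for a run; grad_accum=1 is an assumption, not a stated value."""
    batches_per_epoch = math.ceil(num_videos * num_repeats / batch_size)
    return math.ceil(batches_per_epoch / grad_accum) * epochs


steps = total_steps(num_videos=60, epochs=60)
hours = steps * 6.07 / 3600  # 6.07 s/it is the stable speed quoted above

print(steps, round(hours, 2))  # prints 3600 6.07
```

That lands on roughly 6 hours, in line with the ~6h mentioned for the 480p run.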

To be honest, I'm far from being an expert. I have been fiddling around with musubi-tuner for 2 weeks since I received my (fucking pricey) 5090, and it's the first time I train on Wan (and I have little experience on other models too). But let me tell you how I got here.

During my initial tests at 160p, I played around with the learning rate (1e-4, 5e-5) & LoRA+ (4, 8, 16) at 40 epochs with only 20 videos and a network dim set at 32. Wasn't that great, lots of over-fitting, overcooking, the concept was difficult to trigger and results were... Meh.

I initially wanted to train on a small dataset first to get the concept, and then fine-tune it multiple times with new datasets and different use cases. I tried that, but the concept wasn't fully there yet.

So, I got frustrated a bit, and just packed all my datasets that represent the action in all angles ("Wan will maybe get it if I do a 360° of the concept?"), pruned datasets about very specific use cases that wasn't necessary, reworked most captions to be as precise and descriptive as possible (I helped myself with Florence2 visual model). I decided to use 24 dim since it created better results in my previous steps (and I remembered that concept = low dimension from SDXL), and I pumped the epochs count to 60 because my last run at 40 was under-trained.
All of that on a 360p run to test my assumptions (~3h). The result was very good finally ! This forsaken model got it !
But hairs were messed up by the original resolution, so I did yesterday the 480p run, and here it is.

I also think that my dataset selection was pretty important: once everything was trained together, it always went great. I hand-picked the movement in each source video at different speeds (with the same captions, only changing the speed word; I thought the model would understand the delta better), and also selected where it started and finished (from the base, from the top, etc), in all angles, positions, etc.

It's still lacking on details though, since I brute forced it without fine tuning. I wonder if I can fine tune it without overcooking it...

Well, I hope it helps! I still have a lot to learn!

CyberAImaniaApr 22, 2025· 1 reaction

@JeeFJ Wow, man, huge thanks again for laying it all out like that! Seriously helpful stuff, especially hearing about your whole trial-and-error process. Mad respect for digging into it for two weeks straight after getting that beast of a 5090 – must be nice having that extra VRAM headroom!

So, 3600 steps on the dot, got it. And interesting that the iteration time sped up a bit as it went – cool!

Yeah, I'm actually using a different training tool myself (Diffusion-Pipe) and rocking "only" an RTX 4090 with 24GB VRAM, so I'm constantly bumping against those memory limits, haha. I'm also pretty new to this whole training adventure, just starting out really. I've only done a few LoRAs before, mostly training celebrity likenesses using still images (T2V).

I'm really keen to try training on video clips next, like you did for the motion/concept, but honestly, the thought of manually captioning potentially dozens or hundreds of video clips is kind of daunting! You mentioned you helped yourself with the Florence2 visual model for captions – that sounds super interesting! Could you maybe tell me a bit more about that? Like, how does it work? Does it automatically generate descriptions for video frames or something? Any tips on using it would be awesome, 'cause doing all those captions by hand sounds like a nightmare!

Seriously though, your results are inspiring, especially for a "concept" LoRA trained with relatively low steps. It really shows how much impact careful data selection, good captions, and finding the right settings (like that lower dim/rank) can have. Sounds like that "360° of the concept" approach with all angles really paid off!

It definitely helps a lot hearing your journey. Thanks again for sharing, and good luck with potentially fine-tuning without overcooking it – sounds like the next challenge! Cheers!

JeeFJ

Author

Apr 22, 2025· 1 reaction

@CyberAImania Thanks, I appreciate it! The VRAM headroom is indeed nice aha! I miss my kidneys though.

About Florence2, it's just a plugin you can install in ComfyUI (comfyui-florence2). I load the video in ComfyUI, extract one frame, and run it through Florence2 in "more_detailed_captions" mode. It outputs a detailed description of the image that I use as a base to add movement captions & my trigger words (double check the output though, it sometimes hallucinates).
I didn't automate it to create the .txt file or update the .jsonl, but I think it's doable. I just copy paste the captions into my .jsonl file, tweak them, and load the next video. I don't recommend using the captions as-is though.
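The manual copy-paste step could be scripted roughly as follows. This is a minimal sketch, not the author's tooling: `build_caption` and `append_entry` are hypothetical helpers, the placeholder strings are invented for the example, and the `video_path` / `caption` keys are an assumption about the JSONL layout the trainer expects, so check them against the musubi-tuner dataset documentation before relying on them.

```python
import json
import tempfile
from pathlib import Path


def build_caption(trigger_words, base_description, motion_sentence):
    """Prefix the trigger words, then join the frame description and the motion text."""
    triggers = ", ".join(trigger_words)
    return f"{triggers}; {base_description.strip()} {motion_sentence.strip()}"


def append_entry(jsonl_path, video_path, caption):
    """Append one JSON object per line (key names are assumed, see the note above)."""
    entry = {"video_path": str(video_path), "caption": caption}
    with Path(jsonl_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry


# Usage with neutral placeholder text; the description would come from a
# reviewed and hand-edited Florence2 output, not from the raw caption.
demo_dir = Path(tempfile.mkdtemp())
demo_jsonl = demo_dir / "config.jsonl"
caption = build_caption(
    ["trigger-a", "trigger-b"],
    "a video of a person standing in a kitchen.",
    "The person slowly turns toward the camera.",
)
append_entry(demo_jsonl, demo_dir / "clip_001.mp4", caption)
print(demo_jsonl.read_text(encoding="utf-8"))
```

Keeping the caption assembly in one function makes it easy to reuse the same description across clips cut from the same source video while changing only the motion sentence.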

It's a bit tedious though, took 2/3h for the 60 videos. Some videos were extracted from the same source video, so I didn't run ALL 60 videos through Florence2, just copy pasted and quickly changed it manually. But I think it's something we need to go through for good results.

For video splitting, I use Shotcut, it's free, works well.
For dataset management (transcoding, etc) I just use a lil' script I made using ffmpeg, always in lossless. I can convert my whole datasets to another resolution quickly with that, so I can try things out and iterate.
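The author's script isn't shown, so here is one way such a batch conversion could look. It is a sketch under stated assumptions: the helper names are hypothetical, the input is assumed to be `.mp4` files in a single folder, audio is dropped with `-an`, and "lossless" is approximated with `libx264 -crf 0` (lossless for 8-bit x264 builds).

```python
import subprocess
from pathlib import Path


def ffmpeg_rescale_cmd(src, dst, width, height):
    """Build an ffmpeg command that rescales one clip and re-encodes it losslessly."""
    return [
        "ffmpeg", "-y",
        "-i", str(src),
        "-vf", f"scale={width}:{height}",
        "-c:v", "libx264", "-crf", "0",  # assumed lossless setting
        "-an",                            # assumption: audio is not needed for training
        str(dst),
    ]


def convert_dataset(src_dir, dst_dir, width, height, run=False):
    """Return the commands for every .mp4 in src_dir; execute them only if run=True."""
    dst_dir = Path(dst_dir)
    commands = []
    for src in sorted(Path(src_dir).glob("*.mp4")):
        cmd = ffmpeg_rescale_cmd(src, dst_dir / src.name, width, height)
        commands.append(cmd)
        if run:
            dst_dir.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands


example = ffmpeg_rescale_cmd("source/clip.mp4", "out_480/clip.mp4", 480, 270)
print(" ".join(example))
```

With `run=False` the function only lists what it would do, which is handy for checking the commands before converting a whole dataset to a new resolution.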

And shit tons of tea and funny videos to watch while the GPU is melting under all the training iterations ahahaha

CyberAImaniaApr 22, 2025· 1 reaction

@JeeFJ I uploaded a short ~4-second MP4 clip to https://aistudio.google.com/ using the Gemini 2.5 Pro model, just as a test — and to my great surprise, it described the clip. And the clip contained… you know what... xd.

https://ibb.co/cK4jnqW1

JeeFJ

Author

Apr 22, 2025· 3 reactions

@CyberAImania I'm surprised that those cloud based AI are not filtered. The caption is a bit short though, and danbooru tag style, I think, doesn't work that well with Wan since it seems to prefer verbose captions, and it's difficult to describe a movement with only tags.

Here is an example of a raw Florence2 output on a frame from one of my training videos, in case it helps those who read this discussion:
---
a high-resolution, explicit image depicting a sexual act, the central figure is a young woman with light skin and long, brown hair, kneeling on a beige carpeted floor, she has a slender physique with small to medium-sized breasts, and her nipples are visible, her facial expression is one of intense focus, with her eyes half-closed and her lips parted, as she performs oral sex on a man whose erect penis is prominently displayed in the foreground, the man's skin tone is light, and his pubic hair is visible, the background includes a white towel on the left side and a black power cord on the right side, suggesting a domestic setting, the image is taken from a low angle, looking up at the woman and the man, emphasizing the intimate and explicit nature of the act
---

Which I manually transformed to:
---
blowjob, deepthroat, pov; a POV on a young woman with light skin and long, brown hair, kneeling on a beige carpeted floor, she has a slender physique with nude medium-sized breasts, and her nipples are visible, her facial expression is one of intense focus, with her eyes half-closed and her lips parted, as she performs a deepthroat blowjob on a man whose erect penis is prominently displayed in the foreground, she is bobbing her head back and forth while sucking the man's penis, the man's skin tone is light, and his pubic hair is visible, the background includes a white towel on the left side and a black power cord on the right side, the video is taken from a POV, looking down at the woman and the man's penis
---

tensor_fanaticApr 24, 2025

@JeeFJ could you share a caption example?

JeeFJ

Author

Apr 25, 2025

@tensor_fanatic See my last message in this discussion.

photoai43Jun 16, 2025

@JeeFJ Thanks for all of this! Very helpful. I've been struggling with the i2v motion style lora. Your caption is interesting, will try a version like this. I captioned mine without describing the motion, and have been failing to generate a good i2v. t2v works great though.

I am using Diffusion Pipe, why did you choose musubi-tuner? I'll look into switching to see if that helps.

oliviierbommel7159Aug 8, 2025· 1 reaction

JeeFJ Most of what you just said here is abracadabra to me..
Is there any tutorial you can recommend to watch in order to get to the level where I can understand what you're saying and venture in trying this myself? :-)

_Species8472_Dec 9, 2025

This is by far one of the all-time best LoRAs for WAN 2.2 - great job! I was excited to find this thread where you share some details @JeeFJ . One of the most impressive things about this LoRA is the penis fidelity. I find training anything in WAN that involves a penis to be a huge pain in the ass because of the censored base model. Could you shed some light on how exactly you over came this? Did you dump a bunch of pics of penises into your dataset? Or were you careful to select only videos that featured a full, unobstructed view of the penis at some point in the scene?

JeeFJ

Author

Dec 22, 2025· 1 reaction

@_Species8472_ Hey there. First of all, note that this discussion was about the Wan 2.1 version. Wan 2.2 I2V is similar, but I did added penises datasets to it, because 2.1 I2V suffered from deformed penises when they were introduced to the image (so not present originally). I wanted to circumvent that a bit, so I trained the LoRa on it too (it had the funny side effect to attach penises to all new characters introduced to the scene aha, some may like it ahah).

Wan 2.2 I2V has the same datasets as Wan 2.1 I2V, with an addition of 27 static images of penises (erect or not), and 14 videos of moving penises during different manipulations and states, to teach the model what a penis is first, and then how it moves and reacts to stimulus (does it bounce? is it hard? elastic? etc). I built this added dataset with this goal in mind.
The final result is not perfect tbh, but seemed fair enough, especially in the interaction with it & its physicality, which to me convey the scene more convincingly (I hated those "monolithic" penises in other LoRa, they felt like wood or rubber).

Wan 2.2 T2V is a whole other subject (especially on the configuration). It uses the same datasets as Wan 2.2 I2V, but now with vulva datasets (27 static images & 32 videos for physicality). Why do that for a Blowjob LoRa, you might ask? Because I wanted the LoRa to be self-contained, and able to output a good video by itself (and women are often nude in those). It also allowed me to eliminate the side effect of all characters having penises aha (well, at least partially). Not perfect either, but it does the job well enough for me.

So to make it short, I overcame this by training with my concept dataset, mixed with related concepts that were missing in the base model (penises & vaginas), so Wan can get wtf I'm trying to teach it directly. I believe it produces slightly better output than mixing a LoRa about genitals & the concept LoRa, but I have nothing to back it up other than my appreciation of the results.

Hope it helps.

_Species8472_Dec 22, 2025

@JeeFJ Thank you very much @JeeFJ . This might be too much to ask, but any chance you could upload your penis dataset to this LoRA as a training data .zip? You've achieved good penis aesthetics and I'd like to see if applying your supplemental penis dataset to any of my XXX LoRAs has the same effect.

JeeFJ

Author

Dec 27, 2025

@_Species8472_ Hey, unfortunately I can't share it.
I'm happy to share the process, the configurations & my approach though.

The static images are basically a bunch of cleaned-up photos of penises of various sizes and colours, with or without foreskin, flaccid or erect, from various angles (front, above, below, left/right, three-quarter, etc). I tried to include as much variability as possible. I try to cement the look of a penis with those.

The videos describe various interactions with the penis from different angles and framing (so it's correctly placed between the legs), like handling with a hand, moving it erect or flaccid, pulling the foreskin back, etc.
I considered that adding this context to the training would improve the data learned from the original blowjob dataset, since the model would now know what the keyword "penis" is in the training captions, and spend more time refining the concept of penis and the interaction it has with the subject's mouth, which it would categorise as "blowjob".

If the base Wan 2.2 models were NSFW, it wouldn't be necessary, but they aren't, so I feel like we need to include this kind of dataset in all XXX loras if we want them to be usable with the base model.

By the way, I'm sidetracking a bit, but I tried my lora on the recent Smoothmix NSFW Wan 2.2 model, and it works sooooo much better in T2V: amplitude, generalisation, control, everything is awesome. I guess the success of an XXX lora on the base model is mainly based on its ability to jailbreak the model out of SFW with good and profuse nudity datasets? Seems inefficient though...

Degenerator123Apr 22, 2025· 7 reactions

Great LoRA, works very well! Some other I2V LoRAs are quite "picky" in terms of input image, yours seems to work with a variety of angles and compositions and has great motion. Out of curiosity I tried it out with a basic T2V (14B) workflow at strength 1 with a simple prompt and it seems to work for that too! (getting a list of "lora key not loaded ..." notifications in cmd console because it's trained for a different model but it still works).
Even without an extra penis LoRA it seems to grasp what it's supposed to look like in T2V, though it probably works more consistently with addition of a penis LoRA at a lower strength because base WAN is pretty clueless in that regard.
Great job, especially since this is your first LoRA, and thanks for the detailed comments sharing your approach!

JeeFJ

Author

Apr 22, 2025· 1 reaction

Wow that's awesome to read! So it does work on T2V, I didn't even try, I thought it was a lost cause aha! I wonder if I can convert it by pruning some blocks to avoid those warnings?

I'm happy also to read that it triggers well in most of the use cases, that was my original intention & motivation.
The first DT T2V LoRA that went out a month ago is pretty good, but difficult to trigger and alter the original image too much to my taste. I was also frustrated by the ex-nihillo penis lengthening, or the disappearance of the glans when the subject pulled back. I think I partially solved it (at least, most of the seeds, some are still cursed), so mission (partially) accomplished! :D

I was also surprised that breasts & penis physics went out great, thus making this LoRA good by itself. I'm not sure why yet, but I take it!
I even tried to do an I2V on a random image of someone sitting on the ground, the LoRA triggered so well that the subject started to suck its own fingers ahahah! This LoRA definitely needs to chill, where is the horny police...

All in all, thanks! And happy to share :) Apes strong together.

tazmannner379Apr 25, 2025

I don't know why either but quite a few i2v loras I've tried will work fine with t2v and even mix well with other loras.

WhatTheGuyApr 22, 2025· 1 reaction

Did you train it on the I2V model? Because for T2V it's a bit wanky. Would love if you trained a T2V version too =)

JeeFJ

Author

Apr 22, 2025· 1 reaction

Hey! Yes, I2V only, since there is already quite some T2V LoRA available. Another commenter told me it works pretty good on T2V though, but looking at your comment, it seems it's far from perfect.

But I'll take a look at training a T2V version, not sure when though :)

WhatTheGuyApr 23, 2025· 1 reaction

@JeeFJ cool, thanks ! From what I read training on the T2V model works well for both, but training on the I2V just works really well for I2V and not so great for T2V. Yes there are a few T2V blowjob Loras out there right now, but noone made for the upside down shots =)

nerfmeApr 26, 2025

I heard it is possible to use the dedicated i2v models as t2v by just substituting the image latent to a empty latent video, or something similar. Not exactly sure how but should be doable.

WhatTheGuyApr 26, 2025

@nerfme hm, sounds like that is the standard way to do T2V: feeding an empty latent video to the sampler. It works, but not very well.

nerfmeApr 28, 2025

@WhatTheGuy Sorry for late reply. I remembered the trick that I read somewhere. To use I2V as if it were T2V:

Prompt and create a single frame from an empty latent video

decode and feed just that single first frame latent to your normal I2V workflow, from your ksampler, to where the first-frame image latent would normally go.

Then you effectively have I2V but you're prompting with WAN as you would with a T2V. The model should work the same then, as if using an image, although still slower than normal T2V.

WhatTheGuyApr 28, 2025

@nerfme ah ok, but the problem is that with the I2V lora I will get a wanky first frame out of it. So I have no improvement, just more work =). I guess it will work when prompting for a concept which the base model already knows well, and I want to add the motion. But I depend on the LoRA to create the first frame, since the base model doesn't know what an upside down blowjob looks like ...

JeeFJ

Author

Apr 28, 2025

@WhatTheGuy I'm working on a T2V version, I've had a 8h train that just finished, no guarantee when it will be out though, as T2V seems more difficult to train than I2V (or I got lucky on I2V?).

nerfmeApr 28, 2025

@WhatTheGuy Well, technically the concept will still work, but take a bit longer till you find a working frame I guess XD. I would make like a 20-frame latent or something, decode into a save-image node, pick out the good one, and throw it at the I2V. Heh. Well at least JeeFJ is onto your perma solution it seems. Good luck!

WhatTheGuyApr 28, 2025

@JeeFJ I guess it is more difficult because you don't need just the motion any more, but the actual image ( starting pose, knowing how a penis looks like, know how the mouth is shaped). But big thanks! Looking forward to the T2V release =)

JeeFJ

Author

Apr 29, 2025

@WhatTheGuy Yeah, I noticed that I need to fill the gap on how a male genital looks, I've got some pretty cursed inferences after my firsts tests ahahah (the great curse of glans disintegration returned). So far, I've tried still image of penises alone, to anchor their concept, and let the original I2V videos stitches the rest together, there were some improvements, but the LoRa is still too finicky without specific trigger words and not very malleable. Noticed also that inferences were more dynamic, and prone to hallucination, without still images.
I'm still exploring... Maybe including DT still images instead of penises alone ? Or solo penises videos ?

Ooooh... I think I've got it.
I'm realising writing this that I should probably take this T2V training as a regular stable diffusion training (to be able to generate a good first frame), with added movements information to it to infer the rest of the video from the first frame. That would make a lot of sense. :o

dasda1234Apr 23, 2025· 11 reactions

Really consistent vs other similar LORAs I tried for Wan. Good job.

AstroWeaselApr 26, 2025· 7 reactions

Really cool lora. The other Wan deepthroat lora does not work well for me with the I2V model, maybe because it was initially trained on the T2V model. Are you considering training a T2V version? Maybe you don't even need to train from scratch, as the models are quite similar. Maybe you can fine-tune this one.

JeeFJ

Author

Apr 26, 2025· 1 reaction

Thanks! I appreciate the comment :)
Yes, I'm currently looking into building a T2V version. I'll release it if it has the same qualities as I2V (easy trigger, all angles, leaves the model free enough to hallucinate, etc).
Currently doing the test runs at low res.

aiempath45836May 12, 2025· 18 reactions

i keep getting a whole man's head come into frame and they both start sucking on a double ended dick like the lady and the tramp or something lol anyway to avoid this? It seems to happen when trying to get a penis coming into view from the side or something, with images not containing a penis from the start. But even with NSFW images already containing penis it may or may not have this same issue when generating. I've already tried including words like male head, extra head etc in the negative prompt. Any parameter i should try changing maybe?

helkingeo833May 13, 2025· 4 reactions

First sentence put me in the ground with laughter

JeeFJ

Author

May 13, 2025· 1 reaction

Inference for images without penises already present is a bit more difficult to trigger cleanly. I recommend including one in the source image by in-painting if necessary for cleaner results. But you can play with the Shift, CFG & Seed to try to get a somewhat good result out of it. Coupling my LoRa with a Penis LoRa could also improve the result (so Wan knows better what a penis looks like).

About the head appearing out of nowhere, it can happen if the CFG or the Shift is too high (Wan hallucinates too much) or you are on a cursed seed, at least in my experience. You can try lowering the LoRa weight a bit as a last resort if it's still broken (down to 0.7 minimum, lower is broken).

It could be also the prompt being too evasive, did you try to use my helper phrases I describe in the LoRa presentation ?

JeeFJ

Author

May 13, 2025· 1 reaction

You have a working example here: https://civitai.com/images/75908889
No penis in the original image, but it could be hallucinated by coupling a dick LoRa with careful prompting.

allamallaJun 4, 2025

@helkingeo833 This is an all time post for real.

DontMindMeLoveJun 4, 2025

It also tends to crank out a lot of 2-3ft long Jonson's, and hermaphrodite like features. Easy to go off the rails with it.

ak4710315462Aug 11, 2025

I also encountered the problem of men's faces. Later I found that the prompt words could not be "man", "man hip", or "man penis", but only the words hip and penis.

8190024May 17, 2025· 2 reactions

How can I download a workflow for this? Can someone help me?

Gekko78May 18, 2025· 4 reactions

You can use this one: https://civitai.com/models/1385056/wan-21-image-to-video-fast-workflow
It will automatically download the GGUF model you pick (I use the Q4_K_M version and it's relatively fast and good quality, up to 9.5 seconds of generation on 12 GB of VRAM). Just make sure you follow the instructions on how to install the custom node and the upscale model if you need it (the upscale model is not required and can be skipped by disabling it).

cocoleviAug 28, 2025

@Gekko78 It no longer exists to download u_u

atbesatb946May 18, 2025· 3 reactions

Amazing!

AjaxdiffusionJun 2, 2025· 8 reactions

This is BY FAR the best LoRa I've worked with so far...

8 out of 10 gens are not just fine, but great! o_O

GoonerTunerJun 3, 2025· 8 reactions

The LoRa is not showing up in the added resources.

lgkingluis835Jun 10, 2025· 7 reactions

I can't use most Wan LoRas, and I don't know why.

GoonerTunerJun 11, 2025

I am having the same issue; it's not coming up on the list.

YamamayYamamotoJun 15, 2025

yeah it's happening to a bunch of loras, can't use them on site

JeeFJ

Author

Jun 27, 2025

I'm not sure, but I think it's because the model identifier per version changed. I've just updated mine to represent "Wan i2v 480p", I suppose it should solve the problem (at least I can select it in the cloud generation UI)

wasabi789Jul 1, 2025· 5 reactions

This LoRa is unavailable for Wan txt2vid. It worked before they removed Wan, but now it can't be used.

bjbootyloverJul 8, 2025· 6 reactions

Great lora !

cukurpapiks383Jul 10, 2025· 8 reactions

what prompts are you all using to get balls deep deepthroat? It doesnt really work in combination with https://civitai.com/models/1719863?modelVersionId=1972141

a1161327317Jul 28, 2025· 3 reactions

I don't know why, the performance in the anime is very strange. Either the movements are very small, or there will be a male head taking "penis" appearing.

ak4710315462Aug 11, 2025

I also encountered the problem of men's faces. Later I found that the prompt words could not be "man", "man hip", or "man penis", but only the words hip and penis.

a1161327317Jul 29, 2025· 8 reactions

My animated penis always turns into an alien form

aiv_creatorAug 3, 2025· 16 reactions

IMPORTANT: INFORM ALL WAN USERS! I DISCOVERED HOW TO USE LORAS 'NSFW/SFW' IN WAN 2.2 T2V, GETTING THE FULL POTENTIAL OF "HIGH NOISE" AND "LOW NOISE." THE RESULTS WERE IMPRESSIVE!!

I ran it on an RTX 3060 6GB, 32GB RAM, 480x480, length: 41, generation time: 5 minutes to generate each video. The quality is fantastic! For those of you with more powerful computers, the results will be even more incredible. I used the models: WAN 2.2 - Q4_1.gguf

FOLLOW THE STEPS BELOW:

1: Don't use acceleration/enhancement LoRas (such as Lightx2v or other) on "high noise"; use only on "low noise" with strength 1. (I'm using this version of Lightx2v LoRa): lightx2v_T2V_14B_cfg_step_distill_v2_lora_rank64_bf16

2: Use 'NSFW/SFW' LoRas on both "high noise" and "low noise" models. On "high noise," use strength 2, and on "low noise," use half the strength, with strength 1. You can vary the strengths as you like, but always keep the "low noise" strength at half the strength of the "high noise."

3: Use these settings in KSamplers:
"high noise": steps: 9 / cfg: 3.5 / euler/simple / start_at_step: 0 / end_at_step: 4
"low noise": steps: 9 / cfg: 1.0 / euler/simple / start_at_step: 4 / end_at_step: 10000

Note: To improve speed, I set the Windows system to "best performance," disable unnecessary background programs and applications, and lower the desktop resolution. This significantly speeds up video generation. I hope this helps.
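For anyone who wants to sanity-check those numbers before wiring them into a workflow, here is a rough Python sketch of the two-pass split described above. It is not a ComfyUI API: `SamplerPass`, `two_pass_settings` and `effective_steps` are made-up names for illustration, and it assumes that an `end_at_step` larger than the step count simply means "run to the last step".

```python
from dataclasses import dataclass


@dataclass
class SamplerPass:
    """Hypothetical container for one KSampler pass (not a ComfyUI class)."""
    name: str
    steps: int
    cfg: float
    sampler: str
    scheduler: str
    start_at_step: int
    end_at_step: int
    content_lora_strength: float
    use_speed_lora: bool


def two_pass_settings(high_strength=2.0, total_steps=9, switch_step=4):
    """Build the high/low noise settings from the comment above.

    The low noise content LoRa strength is always half the high noise one,
    and the speed LoRa (e.g. Lightx2v) is only enabled on low noise.
    """
    high = SamplerPass("high noise", total_steps, 3.5, "euler", "simple",
                       0, switch_step, high_strength, False)
    low = SamplerPass("low noise", total_steps, 1.0, "euler", "simple",
                      switch_step, 10000, high_strength / 2, True)
    return high, low


def effective_steps(p):
    """Steps a pass actually runs, assuming end_at_step is capped at steps."""
    return min(p.end_at_step, p.steps) - p.start_at_step


if __name__ == "__main__":
    for p in two_pass_settings():
        print(p.name, "runs", effective_steps(p), "steps,",
              "content LoRa strength", p.content_lora_strength)
```

With the defaults this gives 4 steps on high noise and 5 on low noise, with content LoRa strengths 2.0 and 1.0. Changing `high_strength` keeps the "low is half of high" rule from step 2.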

Yulian0178_Aug 28, 2025

I have an RTX 3060 with 12 GB of VRAM and 16 GB of RAM, and the first video I made (I'm new to this) took me 30 minutes, apart from completely changing the person's face (it was a warrior, nothing NSFW). How did you manage to do it in 5 minutes? Well, I'll follow your steps and see if it works for me.

Yulian0178_Aug 28, 2025

I don't know which step it was, but it worked like a charm. Now she keeps her warrior face for some reason, and it went from half an hour to just 11 minutes. Maybe if I use sageattention I'll get it down to 5 minutes, but I don't know how to set it up, ha ha ha.

THANK YOU SO MUCH, IT WORKED GREAT FOR ME.

HerishopAug 10, 2025· 4 reactions

This lora is getting an error. Fix soon admin

tommo5608Aug 30, 2025

*please

ak4710315462Aug 11, 2025· 5 reactions

I also encountered the problem of men's faces. Later I found that the prompt words could not be "man", "man hip", or "man penis", but only the words hip and penis.

datlurkaaAug 27, 2025

This is great advice, thank you.

vusyrvisehievhAug 15, 2025· 7 reactions

Wan 2.2 please! Great Lora

JeeFJ

Author

Aug 19, 2025

From what I've tried, the current LoRa version can output some pretty good results on Wan 2.2. They're not perfect, but the movement is there. You just need to push the LoRa weight to 2 (on both the high noise and low noise models).

JeeFJ

Author

Aug 27, 2025· 1 reaction

It's out! I hope it'll be as good as the 2.1 version :D

MortalounetAug 15, 2025· 6 reactions

Hello, I am new to AI photo generation and looking for someone who could teach me how to generate videos like the one above. I'll pay you for this service, of course ;) Thank you!

BaltiusAug 19, 2025

Just search for "comfyui text 2 video wan"; there are plenty of resources. (I'm pretty new to the subject and trying to make it work too.) Personally, I use Pinokio for AI tools ;)

PetrKAug 27, 2025· 1 reaction

Hello, I was in the same situation as you. The problem is that even if you are willing to pay someone, nobody really signs up for it. The whole generation thing is so complex that everyone sort of makes it work on their own PC, and nobody has the energy to make it work on someone else's setup. So I will tell you what worked for me. Since you are willing to pay, I highly recommend this Patreon creator: Aitrepreneur ("Ai art, Large Language Models, Stable Diffusion, LLAMA, LORA | Patreon"), specifically this post, which gives you what you want: "1-Click INSTALL WAN 2.2 ULTRA VIDEO & IMAGE KING ComfyUI WebUI! | Patreon". The best thing is the 1-click installers, which remove the need to fully understand what to install. Even if stuff breaks on your PC, you can just run the installer again and get back to a functional version. On top of that, you also get access to functional, "non-spaghetti" workflows that actually work. I am not trying to promote a creator here; I am trying to show a way that does not require you to spend 100+ hours trying to figure it all out by yourself. I am now generating stuff locally on my PC pretty much daily, using CivitAI as a source for checkpoints, LoRas and prompts. Hope this helps.

vAnN47Sep 6, 2025

Start with "pixaroma" on YouTube, it's a good place to start!

LORA

Wan Video 14B i2v 480p

by JeeFJ
