????? Lora (+???????, +???????) [Taz] - WAN 2.2 14b / 5B / 1.3b T2V & I2V (Wan 2.1 & 2.2)

Version: v0.7C Wan 2.1 1.3B T2V

NSFW

About this version

I trained this version on the newly re-captioned dataset from the 5B model. The result is very good, and for the first time I'm pretty happy with it. Give it a try. I haven't tested I2V, but it should work for that too. Most examples were made with the lightning speed-up lora at low resolution (480x832).

Trigger word: PENISLORA

What can this lora do?

This lora can add ????????????? to both men and women viewed from the front or side. Other angles, such as POV, may produce a backwards ????? head.

Other things it can now do:
Side view of the penis

Cumming / Cumshots

Blowjobs (it's captioned with the words "???????" and "??????????")

What can't it do?

No penetration in the training data. There is also almost nothing from a POV angle, though there are a few images from above and 1 POV video in the training data.

Sometimes ???????? with ??????? have the ????? slip out of the closed mouth.

Recommended Settings

It works pretty well with the new lightning dyno high model. I'll link to it in my example workflow. For the high pass I like to use the dyno high model (no lightning lora), then for the low pass I use the lightning v2 lora on the regular 2.2 low base model.

Dataset

84 images at 512x resolution

43 videos at 256x resolution

(I let diffusion pipe pick the aspect ratio automatically)

This is exactly the same dataset as the 2.2 5B model. I made no changes.

Training

I used the default diffusion pipe settings.

[optimizer]
type = 'adamw_optimi'
lr = 2e-5
betas = [0.9, 0.99]
weight_decay = 0.01
eps = 1e-8
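To give a feel for what these optimizer values do, here is a minimal sketch of one textbook AdamW update on a single scalar parameter. It is an illustration only: the function name `adamw_step` is made up for this example, and it is not the actual `adamw_optimi` implementation that diffusion pipe calls.

```python
import math

# Hyperparameters copied from the [optimizer] block above.
LR = 2e-5
BETAS = (0.9, 0.99)
WEIGHT_DECAY = 0.01
EPS = 1e-8


def adamw_step(param, grad, m, v, step):
    """Apply one textbook AdamW update to a single scalar parameter.

    Hypothetical helper for illustration. `step` counts from 1.
    Returns the updated (param, m, v).
    """
    beta1, beta2 = BETAS
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialised moving averages.
    m_hat = m / (1 - beta1 ** step)
    v_hat = v / (1 - beta2 ** step)
    adaptive = m_hat / (math.sqrt(v_hat) + EPS)
    # Decoupled weight decay: applied to the parameter, not mixed into the gradient.
    param = param - LR * (adaptive + WEIGHT_DECAY * param)
    return param, m, v


param, m, v = 1.0, 0.0, 0.0
param, m, v = adamw_step(param, grad=0.5, m=m, v=v, step=1)
print(param)
```

On the very first step the adaptive term is close to the sign of the gradient, so the parameter moves by roughly `lr` no matter how large the gradient is. That is part of why a small value like 2e-5 still adds up over thousands of steps.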

I was baffled why the high model was taking so long to train. After more than 60 hours of training I realized I had put my videos in the images directory, which meant the high was being trained ONLY on videos, and twice (once at a very high resolution). Once I fixed this, I went back and trained from 11K steps up to around 13K with the images in the training data. To be honest, the high model was fine without them.
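A quick pre-flight check could catch this kind of mix-up before a long run. The sketch below is only a suggestion: the extension lists and the function name `find_misplaced_files` are assumptions for illustration and are not part of diffusion pipe.

```python
from pathlib import Path

# Assumed extension lists; adjust them to whatever the dataset actually contains.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm", ".mkv", ".avi"}


def find_misplaced_files(directory: Path, allowed: set[str]) -> list[Path]:
    """Return media files under `directory` whose extension is not in `allowed`.

    Caption files (.txt) and other non-media files are ignored.
    """
    media = IMAGE_EXTENSIONS | VIDEO_EXTENSIONS
    return sorted(
        path
        for path in directory.rglob("*")
        if path.is_file()
        and path.suffix.lower() in media
        and path.suffix.lower() not in allowed
    )
```

Running `find_misplaced_files(Path("dataset/images"), IMAGE_EXTENSIONS)` (the path is a placeholder) before training should return an empty list. Anything it does return is a video sitting in the images directory.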

For the low, I trained it properly with videos and images the whole way. Around 6K steps in, I raised the image resolution from 512 to 1024 and didn't get an OOM (it fit in almost exactly 24GB). I trained it to around 10.5K steps. On some advice, I also trained the low on the full timestep range (0 to 1 instead of 0 to 0.85). This may help it switch over better from high to low when using the speed-up lora with low step counts.
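To picture what widening the timestep range changes, here is a toy sketch that draws training timesteps uniformly from a range. It assumes uniform sampling purely for illustration; real trainers often use other timestep distributions, and `sample_timesteps` is a made-up helper, not a diffusion pipe function.

```python
import random


def sample_timesteps(count, min_t=0.0, max_t=1.0, seed=0):
    """Draw `count` timesteps uniformly from [min_t, max_t] (toy illustration)."""
    if not 0.0 <= min_t < max_t <= 1.0:
        raise ValueError("expected 0 <= min_t < max_t <= 1")
    rng = random.Random(seed)
    return [rng.uniform(min_t, max_t) for _ in range(count)]


restricted = sample_timesteps(10_000, 0.0, 0.85)  # the narrower low-model range
full = sample_timesteps(10_000, 0.0, 1.0)         # the full range used here

# Share of full-range samples that the restricted range would never see.
share_above = sum(t > 0.85 for t in full) / len(full)
print(f"max restricted timestep: {max(restricted):.3f}")
print(f"share of full-range samples above 0.85: {share_above:.1%}")
```

Under this uniform assumption, roughly 15% of the full-range samples land above 0.85. Those are noise levels the low lora would never see with the restricted range, which is the intuition behind training it on 0 to 1.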

I think I might do another version with more angles, such as POV and from behind, to make this work for any situation. In that case I don't think it needs 10K steps per training session; epochs around 5K steps looked fine.

The results

This lora turned out really well. I think that comes from a combination of the improved captioning and the 2.2 base model being better.

Description

Notes about the 1.3B version, which is what's new here:

Warning: I haven't done much testing, so this may be overtrained. Try epoch 99 and compare it to epoch 109. Epoch 99 had good results, and I haven't had time to do comparisons or many generations at epoch 109. I'll do some tests later, and if the earlier checkpoint is better then I will revert to epoch 100.

Trigger: PENISLORA

This is a 1.3B T2V version of my ????? lora. For the most part it does everything the 14B version does (though a bit more poorly).

Trained to around 109 epochs and 23,900 steps (109 hours, around 4.5 days straight of training locally on my 3090). Unless I add new training data, I think we are starting to get into overtrained territory. In the coming days I will work on adding more data and retraining the 14B model lora. If that works out, I can update this 1.3B with a v0.8. Check the example generations for example prompts, or try the ones listed for the 14B v0.7 lora.
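For anyone planning a similar run, the figures above can be turned into rough per-epoch and per-step numbers. This is back-of-the-envelope arithmetic on the approximate totals quoted here, not a measured benchmark.

```python
# Approximate totals quoted above for the 1.3B run on a 3090.
EPOCHS = 109
TOTAL_STEPS = 23_900
TOTAL_HOURS = 109

steps_per_epoch = TOTAL_STEPS / EPOCHS
hours_per_epoch = TOTAL_HOURS / EPOCHS
days_total = TOTAL_HOURS / 24
seconds_per_step = TOTAL_HOURS * 3600 / TOTAL_STEPS

print(f"steps per epoch:  {steps_per_epoch:.0f}")    # about 219
print(f"hours per epoch:  {hours_per_epoch:.1f}")    # about 1.0
print(f"total days:       {days_total:.1f}")         # about 4.5
print(f"seconds per step: {seconds_per_step:.1f}")   # about 16.4
```

So each epoch took about an hour, and stepping back from epoch 109 to epoch 99 would give up roughly ten hours of training.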