About this version (qwen v2)
A brushup of the dataset, with better results than v1 for qwen. I will reuse the same dataset for zturbo once the base model is released.
Trigger word: PENISLORA
What can this lora do?
This lora can add erect penises to both men and women viewed from the front or side. Other angles such as POV may produce a backwards penis head.
Other things it can now do:
Side view of the penis
Cumming / Cumshots
Blowjobs (it's captioned with the words "blowjob" and "deepthroat")
What can't it do?
No penetration in the training data. Almost nothing from a POV angle either: there are only a few images from above and 1 POV video in the training data.
Sometimes blowjobs with cumming have the penis slip out of a closed mouth.
Recommended Settings
It works pretty well with the new lightning dyno high model. I'll link to it in my example workflow. I like to use the dyno high model (no lightning lora) for high, then for low I use the lightning v2 lora on the regular 2.2 low base model.
Dataset
84 images at 512x resolution
43 videos at 256x resolution
(I let DP pick the aspect ratio automatically)
This is the same exact dataset as the 2.2 5B model. I made no changes.
Training
I used the default diffusion pipe settings.
[optimizer]
type = 'adamw_optimi'
lr = 2e-5
betas = [0.9, 0.99]
weight_decay = 0.01
eps = 1e-8
I was baffled by how long the high model was taking to train until I realized, after more than 60 hours of training, that I had put my videos in the images directory. As a result the high was trained ONLY on videos, and twice (once at a very high resolution). Once I fixed this, I went back and trained from 11K steps up to around 13K with the images in the training data. To be honest, the high model was fine without them.
For the low, I trained properly with videos and images the whole way. Around 6K steps in I raised the image resolution from 512 to 1024 and didn't get an OOM (it fit in almost exactly 24GB). I trained it to around 10.5K steps. Following some advice, I also trained the low on the full timestep range (0 to 1 instead of 0 to 0.85), which may help it hand over better from high to low when using the speed-up lora with low step counts.
I think I might do another version with more angles, such as POV and from behind, to make this work in any situation. In that case I don't think it needs 10K steps per training session; epochs around 5K steps looked fine.
The results
This lora turned out really well. I think it was a combination of improved captioning and the 2.2 base model being better.
Description
Summary:
This lora is designed to generate frontal views of the penis (including side views), masturbating, blowjobs / deepthroat, and cumming all in one. It does some things better than others. It works for both men and women (or anything you want to attach a penis to...). Give it a try and post your generations on my page, I love to see my loras used :)
Trigger: PENISLORA (chuck it in the front of the prompt)
(btw I also have a 1.3B lora, check it out, I will update it to v0.8 as well later)
Recommended Strength:
I need more time to test it, but I think 0.8 to 1 is fine. Lower strengths will give weird penis shapes.
Notes:
Even though I lowered the video resolution, training still used almost all of the A100's VRAM. My method of lowering the resolution may have made the clips too low quality, and I think there may be a slight blur in some generations. Try an earlier epoch such as 10 on the same seed to compare, or just try another seed. I cannot for the life of me get cumming to work consistently: you either get some good ones or the cum comes out of her mouth over and over. My major worry is the blur, so please report if it's a deal breaker and whether this version needs to be redone :(
The good news is that you now almost always get a good looking penis from many different angles (you can even get it facing you straight on from above the head). If you get a detached penis, try anchoring it as described in the v0.7 release notes. See the training data zip for an example workflow and a csv file with all the captions :) (If you use an LMM to generate prompts, I recommend feeding it that csv file as a reference for penis related terms.)
What I learned from 0.7 is that it can do a lot more than I thought. I just need time, about a week or so, to run through different strengths and prompts and see how to get it to do what I want properly. It's already a lot easier than before, so I have high hopes for this version.
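For the captions csv mentioned in the notes above, here is a minimal Python sketch of turning it into a reference block you could paste in front of a prompt-writing request. The column name "caption" and the sample rows are assumptions made for illustration; check the actual header in the csv from the training data zip and adjust.

```python
import csv
import io


def load_captions(csv_text, caption_column="caption"):
    """Return the non-empty captions from CSV text.

    `caption_column` is an assumed header name, not confirmed by the dataset.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    captions = []
    for row in reader:
        # Missing or empty cells are skipped instead of raising an error.
        text = (row.get(caption_column) or "").strip()
        if text:
            captions.append(text)
    return captions


def build_reference_block(captions, limit=20):
    """Format up to `limit` captions as a bulleted reference list."""
    lines = [f"- {caption}" for caption in captions[:limit]]
    return "Reference captions from the training data:\n" + "\n".join(lines)


# Placeholder rows standing in for the real csv contents.
sample_csv = "file,caption\na.png,first caption\nb.mp4,\nc.mp4,second caption\n"
captions = load_captions(sample_csv)
print(build_reference_block(captions))
```

Capping the list with `limit` keeps the reference block short, which matters if the model you feed it to has a small context window.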
Example Prompts:
I will add some later, but just check any video I post in this version, I will include the prompt.
Negative Prompt:
色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,倒着走, 模糊的眼睛,低细节面部,水印, 图层, 故障, 面, 屏幕残影, 图像干扰, 刺青, tattoo, 晒痕, muscles, biceps,
For this ^ you can remove muscles/biceps if you want. Also add "cum from mouth" or "cum on lips" if you are trying to do a cumming related video, otherwise cum will probably come out of her mouth.
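The adjustment described above (dropping muscles/biceps, appending extra terms) can be sketched as a small Python helper. The function name and the shortened base string are illustrative assumptions; in practice you would pass the full negative prompt.

```python
def adjust_negative_prompt(base, remove=(), add=()):
    """Return a comma-separated negative prompt with terms removed and added.

    Handles both ASCII commas and full-width commas, since the base
    negative prompt mixes Chinese and English entries.
    """
    # Normalize full-width commas, split, and drop empty entries.
    terms = [t.strip() for t in base.replace(",", ",").split(",") if t.strip()]
    to_remove = {r.lower() for r in remove}
    kept = [t for t in terms if t.lower() not in to_remove]
    seen = {t.lower() for t in kept}
    for term in add:
        # Skip terms that are already present (case-insensitive).
        if term.lower() not in seen:
            kept.append(term)
            seen.add(term.lower())
    return ", ".join(kept)


# Shortened stand-in for the tail of the full negative prompt.
base = "水印, tattoo, muscles, biceps,"
print(adjust_negative_prompt(
    base,
    remove=["muscles", "biceps"],
    add=["cum from mouth", "cum on lips"],
))
```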
Can do well:
Generate a good looking penis from front / side / below / facing directly to the camera angles
Blowjob (including deepthroat, and you won't get detached penis heads if she takes the penis out of her mouth, as often happens with other blowjob loras)
Tips for certain actions:
For cumming try: "cum shoots from the tip of the penis" (this is used in the captioning as well as "cum leaking")
For blowjobs use "blowjob" or "deepthroat". Both are captioned as one word.
For masturbating use "stroking the penis" or "grabbing/gripping" the "penis shaft". Though "shaft" and "base" are captioned, it doesn't quite know which part of the penis is which despite my efforts.
Kinda can do well:
Cumming and cum shots (negative prompt should include "cum from mouth" and "cum on lips" etc as it tends to shoot cum from both the penis and the mouth).
Cannot do well:
POV / view from the back of the penis (the penis head will be in a random position since it's not trained on this angle at all. I feel plenty of POV loras already exist and they do well enough)
Data set:
-71 images (8 new images from last version), 512x512 resolution
-40 videos (13 new videos from last version), 512x320 resolution
(Why so little data each time? I am captioning by hand in as much detail as I can and going for quality over quantity, but I do think this dataset could use more images.) I noticed one or two typos or mistakes in the captioning, but I didn't want to restart and lose 10 hours of training. It should be ok though.
Training:
Loaded lora weights from v0.7 using Diffusion Pipe on Runpod
Additional steps since last version: 4450
16 Epochs total at 5 repeats each (it took around 3 hrs per epoch)
Total training time of around 56 hours (a little over 100 usd on runpod on an A100 at 98% GPU vram usage)
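As a rough sanity check on the figures above, the following Python sketch works out how much of the total time the epochs account for and the implied hourly rate. All inputs are the rounded numbers quoted in this section ("around 3 hrs", "around 56 hours", "a little over 100 usd"), so the results are estimates only, and the interpretation of the leftover hours as setup or restarts is an assumption.

```python
def training_estimate(epochs, hours_per_epoch, total_hours, total_cost_usd):
    """Return (hours spent in epochs, remaining hours, USD per hour)."""
    epoch_hours = epochs * hours_per_epoch
    # Time not explained by the epochs themselves (setup, restarts, etc.).
    other_hours = total_hours - epoch_hours
    hourly_rate = round(total_cost_usd / total_hours, 2)
    return epoch_hours, other_hours, hourly_rate


epoch_hours, other_hours, hourly_rate = training_estimate(
    epochs=16, hours_per_epoch=3, total_hours=56, total_cost_usd=100
)
print(f"{epoch_hours} h in epochs, {other_hours} h other, about ${hourly_rate}/h")
```

With these rounded inputs, the epochs cover 48 of the 56 hours and the rate comes out to roughly 1.79 USD per hour.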
Credits:
Seruva19's ghibli lora and detailed breakdown of his process and sample dataset/captions
Hearmeman's diffusion pipe template for runpod and his simple to follow instructions
Kijai's feedback on proper workflow settings and all the people in the Banodoco discord answering my questions