Amorous Lesbian Kisses - CivArchive (CivitAI Archive)

Amorous Lesbian Kisses - Wan

NSFW

~Amorous Lesbian Kisses~
Update: Y'all, the Wan version is pretty fire and I'm quite pleased with it. I'm gonna try to replicate those results for Hunyuan now!

Kisses for Wan: It's been a long time coming, but I've finally created a Wan version of this model! It seems competent for both T2V and I2V. A big key was training at 16 fps, Wan's native frame rate, so if you train Wan I'd definitely recommend that! FWIW my example videos have been interpolated to 32 fps using https://github.com/GSeanCDAT/GIMM-VFI, which is really excellent. Anyway, I trained it with Musubi Tuner at 480x272 on 69-frame clips at 16 fps from 30 videos, for 2400 steps at a learning rate of 2e-5 with a loraplus ratio of 4. I removed the leading "amorous kissing" but otherwise the prompting format remains the same:

"close up of two young women tongue kissing. The woman on the left has red hair and is wearing a black lace choker, the woman on the right is Indian, with beautiful light skin and long straight black hair."

Tongue kissing, making out, kissing, wide shot, medium shot, and close up should all work as hotwords! Wan picked up the "making out" keyword especially nicely, and if you include it you will get lots of caresses and touches. It also manages tongue interactions better than Hunyuan.

My examples were made with Musubi Tuner in about 20 minutes each! I use Musubi with a scheduled CFG: I apply CFG on the first ten steps and the last three, and on every other step in between. This gains good speed without sacrificing much, if any, quality! I've also been experimenting with skip layer guidance, which is curious and seems to really boost quality.

Oh, I also use fp8 scaled, which is a huge boon. Musubi's implementation is online, which means you start with the full model (not the pre-scaled ones). It keeps some smaller but very important params in full precision while quantizing the weights themselves to fp8, maintaining only 2.5% quantization error (vs 12.5% for a naive cast to e4m3fn!). I've run several same-seed comparisons and it's not just good in the numbers: it has consistently given the closest results to the full unquantized model of any method I've tried. Comfy has fp8 scaled too, but it's done differently (the weights are saved scaled and you just load that); I hear it's really good too. Hurray for democratizing access!
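
The scheduled CFG described above (first ten steps, last three, every other step in between) can be sketched as a per-step flag list. This is only an illustration, not Musubi Tuner's code: the function names, the 50-step example, and the choice of which middle steps get skipped first are all assumptions.

```python
def cfg_schedule(total_steps: int, first: int = 10, last: int = 3) -> list[bool]:
    """Return one flag per sampling step; True means CFG is applied on that step."""
    flags = []
    for step in range(total_steps):
        in_head = step < first
        in_tail = step >= total_steps - last
        # Assumption: in the middle section, skip the first step, then alternate.
        alternate = (step - first) % 2 == 1
        flags.append(in_head or in_tail or alternate)
    return flags


def forward_passes(flags: list[bool]) -> int:
    """A CFG step needs a conditional and an unconditional pass; a skipped step needs one."""
    return sum(2 if use_cfg else 1 for use_cfg in flags)


flags = cfg_schedule(50)  # 50 steps is a hypothetical example value
print(f"CFG on {sum(flags)} of {len(flags)} steps, "
      f"{forward_passes(flags)} model passes instead of {2 * len(flags)}")
```

Under these assumptions the speedup comes entirely from the middle steps that run a single pass instead of two.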

Original/Hunyuan:

This has been a tough nut to crack, likely because of the complex hand and tongue movements involved. Base Hunyuan will make simple platonic kisses but not much more. This LoRA is focused on creating amorous, sexual kisses and making out between women. It was trained on my RTX 4070 Ti SUPER 16GB with Musubi Tuner in 12 hours. This is the first revision worth sharing; it's not perfect but can definitely make some nice things! Expect updates! Caption/prompting format:

"amorous kissing, medium shot of two nude young women tongue kissing and making out with each other in a living room. The woman on the left has her brunette hair in pigtails and a tattoo on her arm while the woman on the right has brunette hair in a ponytail. Behind them a couch with some pillows and some plants can be seen."

"amorous kissing, wide shot of two women laying on a gray couch in each other's arms, making out and tongue kissing passionately. They both have brunette hair, one is wearing a colorful haltertop and shorts and the other is wearing a white dress"

"amorous kissing, close up of two women kissing sensually in front of a bright window. The woman on the left has red hair and is wearing a black jacket, the woman on the right is wearing a beanie and thick black glasses. Both of them are wearing mascara"

Small note: "making out" was used to indicate lots of caresses and occasional sexual touches accompanying the kissing, but I don't think it took super well in this first revision! "tongue kissing" was used when there was a lot of visible, outside the mouth tongue action, "kissing" if not as much or it's contained inside the mouth. "wide shot" was used if the full body is visible, "medium shot" for waist up, and "close up" for the close ups. Oh and "passionately" was used as a modifier if the kisses were extra enthusiastic compared to the dataset overall.

Recommendations:
Weight: 0.8-1.0
Flow shift: ~9.0 @ 544p
Guidance: <= 7.0 (Too much creates more issues with hands)
Steps: 50
Frames: 61-129 (longer may or may not work, wasn't trained)
*Reports and my experiments indicate that TeaCache may create issues with the LoRA, so please try without it if possible.

Dataset consisted of 26 high-quality videos of women of various ages and races sharing various types of amorous kisses and making out from various distances in various states of undress. The source data was preprocessed with ffmpeg into the training clips, which were each 144 frames long at 24 fps, showing only the action of interest with no scene cuts or dramatic camera movements. Further, they were cropped to show only the women, in order to add some aspect ratio variation, as 95% of the source was 16:9 before processing.
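
The exact ffmpeg invocation is not given, so the following is only a sketch of how a 144-frame, 24 fps, cropped clip could be cut. The file names, start time, and crop rectangle are hypothetical; the ffmpeg options used (`-ss`, `-vf fps=...,crop=...`, `-frames:v`, `-an`) are standard ones.

```python
FRAMES = 144  # frames per training clip, from the description above
FPS = 24      # frame rate of the training clips


def build_clip_command(src, dst, start_s, crop, frames=FRAMES, fps=FPS):
    """Build (but do not run) an ffmpeg command for one cropped training clip.

    crop is (width, height, x, y) in source pixels; the values are chosen by hand
    per video so that only the subjects remain in frame.
    """
    w, h, x, y = crop
    return [
        "ffmpeg",
        "-ss", f"{start_s:.3f}",                    # seek to the start of the action
        "-i", src,
        "-vf", f"fps={fps},crop={w}:{h}:{x}:{y}",   # resample, then crop
        "-frames:v", str(frames),                   # stop after the requested frame count
        "-an",                                      # drop audio
        dst,
    ]


clip_seconds = FRAMES / FPS  # 6.0 seconds per clip
# Hypothetical example: a square crop out of a 1920x1080 source.
cmd = build_clip_command("source.mp4", "clip_000.mp4", 12.5, (1080, 1080, 420, 0))
print(clip_seconds, " ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the paths and crop are real.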

Training config:

Network dimension: 36
Network alpha: 1
Learning rate: 2.4e-4
Optimizer: came_pytorch.CAME
Optimizer args: weight_decay=0.01, eps=(1e-30,1e-16), betas=(0.9,0.999,0.9999)
Steps: 2400
Warmup steps: 100
Scheduler: Constant with warmup
discrete_flow_shift: 7.0
timestep_sampling: shift
VRAM savings: --blocks_to_swap 31, --split_attn, --flash_attn
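
One way to read the dimension, alpha, and learning rate values together: in the usual LoRA formulation the learned update is multiplied by alpha / dim, and LoRA+ gives the "up" (B) matrices a learning rate that is a multiple of the base rate. The sketch below just does that arithmetic for the Hunyuan settings listed here and the Wan settings from the metadata in the description further down. It assumes Musubi Tuner follows these usual conventions; the helper names are made up for illustration.

```python
def lora_scale(alpha: float, dim: int) -> float:
    """Multiplier applied to the LoRA update in the usual formulation (alpha / rank)."""
    return alpha / dim


def loraplus_up_lr(base_lr: float, ratio: float) -> float:
    """LoRA+ style learning rate for the up (B) matrices: base rate times the ratio."""
    return base_lr * ratio


hunyuan = {"dim": 36, "alpha": 1, "lr": 2.4e-4}
wan = {"dim": 16, "alpha": 16, "lr": 2e-5, "loraplus_lr_ratio": 4}

print("Hunyuan scale:", lora_scale(hunyuan["alpha"], hunyuan["dim"]))  # 1/36, roughly 0.028
print("Wan scale:", lora_scale(wan["alpha"], wan["dim"]))              # 1.0
print("Wan up-matrix LR:", loraplus_up_lr(wan["lr"], wan["loraplus_lr_ratio"]))
```

With alpha = 1 the update is scaled down by 36x, which is consistent with that run using a much higher learning rate than the alpha = dim Wan run.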

Dataset was listed four times in the toml to allow processing different frame bucket lengths at different resolutions:

[general]
caption_extension = ".txt"
enable_bucket = true
bucket_no_upscale = false
[[datasets]]
video_directory = "/home/blyss/projects/art/extra/dataset/AmorousLesbianKisses"
cache_directory = "/home/blyss/projects/art/extra/dataset/AmorousLesbianKisses/cache0"
resolution = [480, 272]
target_frames = [129]
frame_extraction = "head"
batch_size = 1
[[datasets]]
video_directory = "/home/blyss/projects/art/extra/dataset/AmorousLesbianKisses"
cache_directory = "/home/blyss/projects/art/extra/dataset/AmorousLesbianKisses/cache1"
resolution = [640, 360]
target_frames = [69]
frame_extraction = "uniform"
frame_sample = 2
batch_size = 1
[[datasets]]
video_directory = "/home/blyss/projects/art/extra/dataset/AmorousLesbianKisses"
cache_directory = "/home/blyss/projects/art/extra/dataset/AmorousLesbianKisses/cache2"
resolution = [848, 480]
target_frames = [41]
frame_extraction = "uniform"
frame_sample = 2
batch_size = 1
[[datasets]]
video_directory = "/home/blyss/projects/art/extra/dataset/AmorousLesbianKisses"
cache_directory = "/home/blyss/projects/art/extra/dataset/AmorousLesbianKisses/cache3"
resolution = [1280, 720]
target_frames = [1]
frame_extraction = "uniform"
frame_sample = 2
batch_size = 2
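
A quick way to see how the four entries trade resolution against clip length is to multiply width, height, and frame count for each. The numbers below are copied from the config above; treating this "pixel volume" as a rough proxy for per-sample cost is an assumption, not something stated by the author.

```python
# (width, height, target_frames) from the four [[datasets]] entries
buckets = [
    (480, 272, 129),
    (640, 360, 69),
    (848, 480, 41),
    (1280, 720, 1),
]


def pixel_volume(width: int, height: int, frames: int) -> int:
    """Total pixels in one training sample: width x height x frames."""
    return width * height * frames


for w, h, f in buckets:
    print(f"{w}x{h} x {f:>3} frames = {pixel_volume(w, h, f):>10,} pixels per sample")
```

The three video entries land at about 16 to 17 million pixels per sample each, while the 1280x720 entry uses single frames, which is presumably why it can afford a batch size of 2.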

Description

Trained with FP16 base Wan using FP8 scaled mode from Musubi.

Date: 2025-04-16T10:42:43 Title: WanKisses
Resolution: 1280x720 Architecture: wan2.1/lora
Network Dim/Rank: 16.0 Alpha: 16.0 dtype: F16
Module: networks.lora_wan : {'loraplus_lr_ratio': '4'}
Learning Rate (LR): 2e-05
Optimizer: came_pytorch.CAME.CAME(weight_decay=0.01,eps=(1e-30, 1e-16),betas=(0.9, 0.999, 0.9999))
Scheduler: constant_with_warmup Warmup steps: 100
Epoch: 38 Batches per epoch: 64 Gradient accumulation steps: 1
Timestep sampling: Shift Discrete Flow Shift 3.0
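
To illustrate why a scaled fp8 cast can beat a naive one, here is a toy simulation in plain Python. It is not Musubi Tuner's implementation: the e4m3fn grid is simulated by hand, the per-tensor max scaling is an assumption, the weights are synthetic Gaussian values, and the printed percentages are not expected to match the figures quoted in the description.

```python
import math
import random

E4M3_MAX = 448.0    # largest finite e4m3fn value
E4M3_MIN_EXP = -6   # smallest normal exponent
MANTISSA_BITS = 3


def round_e4m3(x: float) -> float:
    """Round x to the nearest value on a simulated e4m3fn grid."""
    if x == 0.0:
        return 0.0
    mag = min(abs(x), E4M3_MAX)  # saturate instead of overflowing
    # Subnormals share the minimum exponent, so their spacing is fixed.
    exp = max(math.floor(math.log2(mag)), E4M3_MIN_EXP)
    step = 2.0 ** (exp - MANTISSA_BITS)
    return math.copysign(round(mag / step) * step, x)


def mean_rel_error(weights, scale=1.0):
    """Mean relative error after scaling, rounding to fp8, and scaling back."""
    errs = [abs(round_e4m3(w * scale) / scale - w) / abs(w) for w in weights if w != 0.0]
    return sum(errs) / len(errs)


rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(10_000)]  # synthetic small weights

naive = mean_rel_error(weights)                   # cast as-is
scale = E4M3_MAX / max(abs(w) for w in weights)   # assumed per-tensor scale
scaled = mean_rel_error(weights, scale)           # cast after scaling into fp8's range
print(f"naive cast: {naive:.2%}  scaled cast: {scaled:.2%}")
```

The naive cast loses precision because many small weights fall into fp8's subnormal range or round to zero; scaling first moves them into the range where the 3 mantissa bits are fully used, and the scale itself is kept in higher precision.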

Comments (22)

Jellai · Apr 16, 2025

Your listed triggers have "Tongue kissng" (ending in "ng"), but your example has "Tongue kissing" (ending in "ing"). Both versions are in your description text. Which is correct?

blyss (Author) · Apr 16, 2025

Oh my bad, that's a typo; it's just the correct spelling, "tongue kissing". I will fix that, thanks for calling it to my attention! It was duplicated like that because of copy-pasting!

makiaeveli · Apr 17, 2025 · 1 reaction

I've been using your prompts since you dropped Hunyuan, am excite

e: signed, sealed, delivered. Worth the wait. If I2V works just as well you're a hero.

After some tests, the I2V works great. I would probably use a slightly lower strength than 1.0 in some cases, as it changes the faces sometimes. But really good (y)

blyss (Author) · Apr 17, 2025 · 1 reaction

Aww I'm glad to hear that! There isn't a lot of variation in the dataset as far as setting tbh and the videos were cropped close to the subjects of interest. When I captioned I only mentioned the visible parts of the background e.g.

"close up of two women kissing passionately in a doctor's office. The girl on the right is topless and has saliva around her mouth and dripping from her face. The woman on the left is wearing a red shirt. They're both wearing eye makeup and both have dark brunette hair. An anatomy poster can be seen in the blurred background."

"medium shot of two young women sitting on a tan couch while kissing and making out with each other. The asian woman on the left has brown hair and is wearing a lavender colored tube top, the woman on the right has black hair and is wearing a white blouse with a black skirt and heavy mascara."

"wide shot of two nude young women kissing and making out while standing in a living room. The girl on the left is fair skinned with long dark brunette hair in a low ponytail and the girl on the right is tanned with straight dark brunette hair. Behind them a gray couch and some paintings can be seen"

and Wan may have picked up on that more. Oh but speaking of that first prompt, if you like messy kisses (me! I do!) then Wan definitely does that better than Hunyuan. I made some lovely sloppy tongue kisses yesterday with this model! If you want more varied backgrounds, I2V might be the way!

makiaeveli · Apr 18, 2025

@blyss After running a few more tests, I was able to get more varied backgrounds. My original tests had a bug that always called on the "close up", so every time I mentioned "medium" or "long", it would have both. oops :) (this was essentially the bug: "{long|medium|} close shot")

Also, I2V works quite well. Sometimes the face morphs a bit too much, but all my tests so far have been at full strength.

jalenbrunson · Apr 18, 2025 · 3 reactions

Awesome!

ComfyTinker · Apr 18, 2025

Did the WAN version follow a different training strategy than the HunYuan version?

blyss (Author) · Apr 18, 2025

Yep, as I mentioned, it was trained with Musubi Tuner at 480x272 on 2 unique chunks of 69 frames from each video at 16 fps, times 30 videos, for 2400 steps at 2e-5 with a loraplus ratio of 4. The notable changes are: no multiple resolution or frame buckets, a lower LR, network alpha = network dim, and an fp8 scaled base. The dataset and captions were the same, except I removed the "amorous kissing" initial tag and one video. Optimizer was still CAME and everything else should be about the same; if you want any specifics, just ask!

FWIW I applied this strategy back to Hunyuan last night hoping to improve that version and got ABYSMAL results, the same as when I had initially tried Wan with the strategy I'd used for Hunyuan. The Wan version here is much better than any of my dozen attempts for Hunyuan; my best for that is still the one that's up.

ComfyTinker · Apr 18, 2025

@blyss Did you follow the exact same captioning strategy as 3 months ago?

blyss (Author) · Apr 18, 2025

@ComfyTinker I mean yeah, they are literally the same txt files. I just went into the files with my editor and removed the "amorous kissing," that was at the start of all of them. Otherwise it's not just the same strategy, it's the identical captions.

hboxgames132 · May 5, 2025

god bless you for the hunyuan version

blyss (Author) · May 6, 2025

Hunyuan version was actually first, as I originally made this LoRA before Wan came out! The Wan version is somewhat superior to the Hunyuan version though (mainly for tongue kisses). I've tried again and again to improve the Hunyuan version but this is just the best I can get it; Hunyuan REALLY wants to attach their tongues together. Either way, glad to hear you like it!

hboxgames132 · May 6, 2025

@blyss Oh yeah, thank you!!

Reichessa · May 18, 2025 · 4 reactions

SESBIAN LEX

Yambag316 · May 21, 2025

Nooooooooooo it's gone

dtwburns654 · Jun 24, 2025 · 3 reactions

Impressive

jakoc75648 · Aug 18, 2025

Very powerful, thank you !

slln · Aug 27, 2025 · 1 reaction

Any chance of wan2.2 versions?

jakoc75648 · Sep 13, 2025

Yes please, this lora is the best tbh

blyss (Author) · Sep 16, 2025 · 6 reactions

Hey, I've actually been working on this one so far but it's not quite there. It takes me 16 hours to do a full Wan 2.2 train on my hardware so iterating is slow. But I do intend to bring my models to Wan 2.2!

honryindian · Nov 30, 2025

@blyss Do you still plan to do it?

jakoc75648 · Feb 14, 2026

Still the best kissing lora

LORA

Wan Video 14B t2v

by blyss

concept

women

kissing