LTXV-2.3 - Audio only - Clapping Cheeks - v0.0.1-alpha

NSFW

🛑Work in progress🛑

(Alpha release) I'm not sure this will be interesting to anyone.

WORKFLOW: https://civarchive.com/models/2516563/wan-with-ltxv-23-audio
Not designed for oral sex
- I tried it; nothing is more confusing or disturbing than hearing "gawk gawk" or gagging in an anal video.
- Check out my deepthroat LoRA; it works for adding audio (confirmed).
- If a 1GB LoRA is too much, I may spend some time creating a lightweight BJ audio LoRA.

Generates sex audio for previously created videos, or alongside LoRAs that lack audio. It adds three things to the base model: clapping cheeks, improved moaning/heavy breathing, and wetness sounds.

This is a purely experimental LoRA addressing a common gap in many videos. It uses video-to-audio cross-attention to generate audio, so text prompts aren't critical but can still influence the result.

Tags used

- skin slapping against skin 
- clapping cheeks
- wet vagina
- The woman moans
- The woman is breathing heavy

Extra Information

I've tested with both dev and distilled; the best results are from dev.

Best samplers I've found: res_2s, er_sde
Audio syncs to the visual movement naturally.

LoRA Creator info

Stand out info

Rank 16 (might be a little too small)
--lora_target_preset full for cross-attention
--ltx2_mode av
Separate audio learning rate (--audio_lr)

accelerate launch --num_cpu_threads_per_process 8 --mixed_precision bf16 \
  ltx2_train_network.py --sdpa \
  --ltx2_checkpoint /ai/comfyui/models/checkpoints/ltx-2.3-22b-dev.safetensors \
  --dataset_config ~/datasets/sex-audio/ltx_dataset_config.toml \
  --mixed_precision bf16 \
  --optimizer_type adamw8bit \
  --learning_rate 5e-5 \
  --gradient_checkpointing \
  --max_data_loader_n_workers 8 \
  --persistent_data_loader_workers \
  --network_module networks.lora_ltx2 \
  --network_dim 16 --network_alpha 16 \
  --timestep_sampling shifted_logit_normal \
  --discrete_flow_shift 1.0 \
  --max_train_steps 5000 --lr_scheduler constant --audio_lr 2.5e-5 \
  --max_grad_norm 1.0 \
  --save_every_n_steps 250 \
  --seed 42 \
  --logging_dir /ai/datasets/sex-audio/logs \
  --output_dir /ai/comfyui/models/loras/LTX2.3/sex-audio \
  --output_name sex-audio \
  --ltx2_first_frame_conditioning_p 1.0 \
  --caption_dropout_rate 0.1 --lora_target_preset full --ltx2_mode av
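
As a rough reading of the numbers in the command above (a sketch, not part of the training script): under the usual LoRA convention the low-rank update is scaled by alpha / rank, so --network_dim 16 with --network_alpha 16 gives a scale of 1.0; saving every 250 steps over 5000 steps gives up to 20 step-based saves; and the audio learning rate is half the base rate. The helper names and the 4096 x 4096 layer size below are hypothetical, chosen only for illustration.

```python
# Values copied from the training command above.
network_dim = 16
network_alpha = 16
learning_rate = 5e-5
audio_lr = 2.5e-5
max_train_steps = 5000
save_every_n_steps = 250


def lora_scale(alpha: float, dim: int) -> float:
    """Scale applied to the low-rank update under the common alpha / rank convention."""
    return alpha / dim


def lora_param_count(in_features: int, out_features: int, dim: int) -> int:
    """Trainable parameters a rank-`dim` LoRA adds to one linear layer.

    The A matrix is dim x in_features and the B matrix is out_features x dim.
    """
    return dim * (in_features + out_features)


scale = lora_scale(network_alpha, network_dim)
checkpoints = max_train_steps // save_every_n_steps
audio_lr_ratio = audio_lr / learning_rate

print(f"LoRA scale: {scale}")
print(f"Step-based saves: {checkpoints}")
print(f"Audio LR / base LR: {audio_lr_ratio}")
# Hypothetical layer size, for illustration only (not a confirmed LTX dimension).
print(f"Params added to a 4096x4096 layer: {lora_param_count(4096, 4096, network_dim)}")
```

If rank 16 does turn out to be too small, doubling --network_dim doubles the per-layer parameter count; keeping --network_alpha equal to the rank keeps the scale at 1.0.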

Description

Super early concept

Comments (15)

JellaiApr 3, 2026· 2 reactions

CivitAI

Really interesting idea. I look forward to it being developed further.

iluvlamiaApr 3, 2026· 2 reactions

CivitAI

Can you share the dataset? How many audio clips did you use?

daring_l

Author

Apr 3, 2026· 1 reaction

I added some details of the training to the model card. Videos were used so that the cross-attention and lip-sync work properly. Adding more audio-only data will most likely be another step in the future.

- 12 videos w/ audio were used.

____NULL____Apr 3, 2026· 10 reactions

CivitAI

I think this lora is the first of its kind. Great work on it! I hope more like this get created! It's very high quality.
Congrats again on being the first 🥂🥂

moistclamm121Apr 3, 2026· 2 reactions

CivitAI

I'm afraid you cooked

iluvlamiaApr 3, 2026· 4 reactions

CivitAI

so is it possible to train voice only character lora? only use audio input, only train audio related blocks

JellaiApr 3, 2026· 1 reaction

Yeah, I've been wondering about training other voice audio, like accents. It would be great if we could only train the audio side. Would save time and make the dataset easier.

daring_l

Author

Apr 3, 2026· 2 reactions

I'm using a musubi fork, https://github.com/AkaneTendo25/musubi-tuner, there is a mode for audio only.

JellaiApr 3, 2026· 1 reaction

@daring_l You used it to add audio to existing video. Is it supposed to also train video generations to use the audio? Like, is it designed to support training things like character voices and accents for regular video generation? Or is it designed only for what you use it for?

daring_l

Author

Apr 3, 2026· 1 reaction

@Jellai Yes, this LoRA can absolutely be used during video generation. I would use the KJNode node called "LTX2 LoRA Loader Advanced" and reduce the video layer to 0 so it doesn't interfere with generation. I should create a couple of examples of that.

I think accents should be trainable. Here is an example of cloning the whole voice and accent, Kermit the frog lora, https://civitai.com/models/2484746/kermit-the-frog-ltx-23?modelVersionId=2803752. You would just be removing the video part from the training if you want audio only.

jackbin330888Apr 3, 2026· 3 reactions

CivitAI

This nicely solves the integration between LTX 2.3 and Wan 2.2. Looking forward to more of your work—thank you very much!

kronos1959777Apr 3, 2026· 3 reactions

CivitAI

You do what few men have dared to try.

Next lora could be sucking wet sloppy noises?

Thanks.

daring_l

Author

Apr 4, 2026· 1 reaction

I did some testing: you can get blowjob sounds by using my deepthroat LoRA, https://civitai.com/models/2476698/ltx-23-deepthroat, together with the workflow I just posted.

https://civitai.com/models/2516563/wan-with-ltxv-23-audio

TheLastRemainApr 21, 2026

CivitAI

Very nice, but is it just me, or is the female voice the same in all clips?

daring_l

Author

Apr 21, 2026· 1 reaction

It shouldn't be, but I do need to update this LoRA with some new concepts that I'm working on. I'll test it out.

LORA

LTXV 2.3

by daring_l

style

Details

Downloads

2,612

Platform

CivitAI

Platform Status

Available

Created

4/3/2026

Updated

6/15/2026

Deleted

Trigger Words:

skin slapping against skin

Files

clapping-cheeks-add-change-audio.zip

Size:

10.94 KB

SHA256:

a13beb8e021536198f17fb0b834a129d6c3f61248230ed9f29736a32068f23e4

Mirrors

HuggingFace (4 mirrors)

clapping-cheeks-add-change-audio.zip

CivitAI (1 mirrors)

clapping-cheeks-add-change-audio.zip

clapping-cheeks-audio-v001-alpha.safetensors

Size:

321.73 MB

SHA256:

9ca4076bffe63aa293f349c28c4d767709e1d004a0d7a046b131dd1d5fef3349
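
Since several mirrors host this file, it may be worth checking a download against the SHA256 listed above. The following is a minimal sketch using only Python's standard library; the local file path and the helper names are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# Hash listed on this page for clapping-cheeks-audio-v001-alpha.safetensors.
EXPECTED_SHA256 = "9ca4076bffe63aa293f349c28c4d767709e1d004a0d7a046b131dd1d5fef3349"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash the file in chunks so a large file is never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 equals the expected hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the file was downloaded.
    local_file = Path("clapping-cheeks-audio-v001-alpha.safetensors")
    if local_file.exists():
        print("hash OK" if matches_listed_hash(local_file) else "hash MISMATCH")
    else:
        print(f"{local_file} not found")
```

A mismatch usually means a truncated download or a different file version, so re-downloading from another mirror is a reasonable first step.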

Mirrors

HuggingFace (16 mirrors)

clapping-cheeks-audio-v001-alpha.safetensors

clapping-cheeks-audio-v001-alpha-mid_2514206-vid_2825999.safetensors

Audio only - Clapping Cheeks v0.0.1 - LTX2.3 - skin slapping against skin,clapping cheeks,wet vagina,the woman moans,the woman is breathing heavy.safetensors

clapping-cheeks-audio-v001-alpha.safetensors

ltxv-2.3-audio-only-clapping-cheeks.safetensors

Audio only - Clapping Cheeks v0.0.1 - LTX2.3 - skin slapping against skin,clapping cheeks,wet vagina,the woman moans,the woman is breathing heavy.safetensors

clapping-cheeks-audio-v001-alpha-mid_2514206-vid_2825999.safetensors

CivitAI (1 mirrors)

clapping-cheeks-audio-v001-alpha.safetensors
