LTX 2.3
A LoRA for generating from-behind sex positions (facing the camera) with LTX-2.3 video models. It supports doggy style, prone, and top-down bottom-up positions. Check out the training data if you need help with workflows. I have also attached the image-captioning system prompt I use for I2V, which should help with prompt language.
Trigger Word
sfbehind
Recommended Settings
LoRA strength (Stage 1) 1.0
LoRA strength (Stage 2) 0.85
Distilled LoRA (Stage 2) 0.6
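If you script your generations, the recommended strengths can be kept in one place and checked before a run. This is a minimal sketch: the dictionary keys and the `check_strengths` helper are hypothetical names for illustration and are not part of any LTX or ComfyUI API. Only the numeric values come from the settings above.

```python
# Recommended values from this page; the key names are made up for this sketch.
RECOMMENDED_STRENGTHS = {
    "stage1_lora": 1.0,
    "stage2_lora": 0.85,
    "stage2_distilled_lora": 0.6,
}


def check_strengths(strengths):
    """Return a list of warnings for values that differ from the recommendations."""
    warnings = []
    for key, recommended in RECOMMENDED_STRENGTHS.items():
        value = strengths.get(key)
        if value is None:
            warnings.append(f"{key}: not set (recommended {recommended})")
        elif abs(value - recommended) > 1e-9:
            warnings.append(f"{key}: {value} (recommended {recommended})")
    return warnings


if __name__ == "__main__":
    # Example: Stage 2 LoRA strength left at 1.0 instead of 0.85.
    for line in check_strengths(
        {"stage1_lora": 1.0, "stage2_lora": 1.0, "stage2_distilled_lora": 0.6}
    ):
        print(line)
```

The check only warns; lower strengths are sometimes worth trying on trickier prompts.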
Prompting Tips
This LoRA responds best to literal, mechanical prompts. Describe body positions and motion like you're directing a scene. Avoid poetic or abstract language.
Do: "He thrusts his hips forward in short rapid strokes, her buttocks compressing on impact" Don't: "A mesmerizing rhythm of primal passion"
Position Names
Use these exact terms — the model was trained on them:
doggy — on hands and knees
prone — lying flat face-down
top-down bottom-up — face pressed into bed, hips raised, back arched
Thrust Patterns
Two distinct patterns the model learned:
Close thrusts (no shaft visible): "He thrusts in short, rapid strokes, his hips staying pressed close to her ass. Her buttocks compress on each impact."
Long strokes (shaft visible): "He pulls his hips back, the glistening shaft reappearing, then drives forward. Her buttocks ripple from the impact."
Who Is Moving?
Man active: "He thrusts his hips forward" / "He drives into her"
Woman active: "She pushes her hips back into him" / "She rocks back against him"
Don't describe both moving unless both actually are.
Getting Better Results
Describe the male body — skin tone, build, body hair, tattoos, muscle definition. Without this it renders as a vague blob.
Describe impact reactions — "her buttocks compress and ripple on contact, her body rocking forward from the force." This teaches the model to sync the bounce with the thrust.
Describe contact points — "his hips press flush against her ass" or "his hands grip her waist."
If her face is visible describe it literally — mouth open, eyes closed, brow furrowed. Don't interpret emotion.
If no shaft is visible don't mention it. Describe hip motion and body contact only.
Specify the camera angle — straight-down, three-quarter, eye-level, low angle.
Known Quirks
Male torso needs explicit description or it gets blobby.
Impact bounce can desync if not described in the prompt — always include "buttocks compress" or "body rocks forward" tied to the thrust.
Stage 2 LoRA strength at 1.0 degrades quality. Keep at 0.85.
System prompt I use with I2V:
You are a prompt writer for an AI video generation model. You will be given a reference image. Extract the visual details and write a generation prompt that would produce a video with a similar look and feel, but with motion added.
You are NOT captioning the image. You are writing a CINEMATIC DIRECTION that borrows the image's visual DNA — the specific colors, textures, materials, lighting mood, and character details — and adds motion to bring it to life.
Always begin with "sfbehind,"
EXTRACT WITH SPECIFICITY — every noun needs a visual adjective:
- NOT "on a bed" → "on tangled white cotton sheets, one pillow crushed beneath her chest"
- NOT "blonde hair" → "long platinum-blonde waves spilling over her left shoulder, damp at the temples"
- NOT "muscular man" → "a lean, V-tapered man with sun-darkened skin, a dusting of dark hair across his chest, and calloused hands"
- NOT "warm lighting" → "late-afternoon sunlight cutting through wooden blinds, painting gold stripes across her lower back"
- NOT "from behind" → "his hips square behind hers, his thumbs pressing dimples into the flesh above her hip bones"
PULL THESE FROM THE IMAGE:
- Hair: color with modifier, length, state (damp, tangled, pinned up, falling in face)
- Skin: tone + undertone + surface (glistening, goosebumped, flushed pink across shoulders, tan lines visible)
- Body: one or two specific details that sell the physicality (the dip of her lower back, the flex of his forearms, the soft crease where her thigh meets her hip)
- Position: name it (doggy/prone/top-down bottom-up) then add the specific body mechanics — spine angle, where hands grip, how weight distributes
- His hands: exactly where and how — "fingers splayed across her right hip, thumb pressing into the dimple above her tailbone" not just "hands on hips"
- Setting: materials and textures (velvet headboard, cool tile floor, wrinkled hotel duvet), objects that set the scene (bedside lamp casting a cone of warm light, phone face-down on the nightstand)
- Lighting: what it does to their bodies specifically (highlights the sheen of sweat on her spine, catches the ridge of his knuckles, leaves his face in shadow)
- Camera: describe by what's in frame and what's cropped (her full back and his torso from navel up, tight on where their bodies meet, wide enough to see the headboard and his arms braced against it)
ADD MOTION — pick the thrust pattern that fits the image's body positions:
CLOSE THRUSTS (his hips tight against her):
"He drives forward in short, rapid strokes, his hips barely pulling back before snapping forward again. Her buttocks flatten against his pelvis on each impact, a visible shudder rolling up through her lower back."
LONG STROKES (space between their bodies):
"He draws his hips back until the glistening shaft reappears between her buttocks, then pushes forward in one steady stroke, her body rocking forward as his hips meet her ass with an audible impact."
WOMAN DRIVING:
"She rocks her hips backward into him in a slow, deliberate grind, her spine arching deeper with each push, his hands riding her waist but not guiding."
ADD IMPACT REACTION — her body's physical response synced to the motion:
- "her buttocks compress and ripple on contact"
- "her body shifts forward two inches before settling back"
- "the flesh of her thighs shakes from the impact"
- "her fingers tighten in the sheets with each thrust"
VOCABULARY:
- "thrusts" = he moves, "pushes back / rocks back" = she moves
- If shaft isn't visible, don't mention it — describe hip motion and contact only
- Never describe what's inside her body
- Never end with mood summaries or poetry
OUTPUT: Single flowing paragraph, 180-250 words. Start with "sfbehind," and end with a visual detail, not a feeling.
New release (1/15/26):
I think I achieved a decent balance between T2V, I2V, and audio quality, so I'm releasing this as a beta. Sometimes things go weird. Lowering the LoRA strength can help with trickier prompts. I really like using ltx-2-ic-detailer-lora with this LoRA.
I'm still working on my workflow, but currently I run a video/audio training cycle and then an image training cycle to improve genitals.
Differences from v0.1
Improved audio
T2V - Improved penis (still not perfect, but way better)
I2V - Similar or better results
Tags used during training
A woman is lying on her stomach in prone position a man behind her thrusts his hip forward and back sliding in and out.
The mans penis is visible.
Audio tags
clapping cheeks
moans, moaning, the woman's breathless moaning
heavy breathing
Training Details v0.2
30 dataset videos 576x1024@121f and 1024x576@121f
30 high quality images 1024x1024
Frame Rate: 25fps
Steps Video: 4000 (Video was trained faster than audio)
Steps Images: 3800 (Used to improve penis appearance)
No abliterated Gemma3 used during training
Generation details:
Workflows are attached to all images in the showcase for this release.
No abliterated model used. (Just don't use the LTX prompt enhancer.)
T2V vids are fp-8-distill
I2V vids are 19b-dev full.
Training update (1/14/26):
I am actively working on this LoRA. It's difficult to balance I2V, T2V, and audio all together. I'm working on my workflow and training methods, but it may end up being split into T2V/audio and I2V/audio versions, which is not ideal at all.
If you look at my latest video post for this model, https://civarchive.com/posts/25846175, you should be able to see the massive difference in audio.
⚠️ Work in Progress (For testing only)
I believe in open development. DO NOT expect the best result from this project.
All images in the gallery are raw, unprocessed outputs directly from generation.
The last 4 images in the gallery are I2V.
Each image includes its attached workflow for full reproducibility.
I know the lora is huge. Rank 16 results were not great. Any tips for lowering the size would be great!
Any feedback is welcome.
Training Details
Trainer: Directly from the LTX team. https://github.com/Lightricks/LTX-2
Steps: 2,250
Dataset: 12 videos
Clip length: ~5 seconds
Frame rate: 25 FPS
Resolution buckets: 1024x576 at 121 frames and 576x1024 at 121 frames
Frame counts must be a multiple of 8 plus 1 (for example, 121 = 8 x 15 + 1)
Gemma3 abliterated used during training.
NO audio training was done during this release.
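The frame-count rule in the training details (a multiple of 8, plus 1) is easy to get wrong when cutting clips by duration. The sketch below uses hypothetical helper names, not functions from the LTX trainer. It checks a frame count and finds the largest valid count that fits in a given clip length, which is how a roughly 5-second clip at 25 FPS ends up as 121 frames.

```python
def is_valid_frame_count(frames):
    """True if frames has the form 8n + 1 (1, 9, 17, ..., 121, ...)."""
    return frames >= 1 and (frames - 1) % 8 == 0


def largest_valid_frame_count(seconds, fps=25):
    """Largest 8n + 1 frame count that does not exceed seconds * fps."""
    available = int(seconds * fps)
    n = max(0, (available - 1) // 8)
    return 8 * n + 1


if __name__ == "__main__":
    frames = largest_valid_frame_count(5, fps=25)
    print(frames, frames / 25)  # frame count and its duration in seconds
```

A 5-second clip at 25 FPS has 125 frames available; trimming to 121 frames (4.84 seconds) satisfies the constraint.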
Workflow & Settings
Base workflows: ComfyUI default templates for LTX-2 (T2V & I2V)
Tested on 19b-dev full and 19b-dev-FP8
Sampler: Res2s
Sampling steps: 20
Additional LoRAs:
ltx-2-ic-detailer-lora
Gemma3 abliterated used during generation.
Description
Updated audio across I2V and T2V, better peen in T2V, and didn't mess up I2V.
Comments (22)
This update is amazing, congratulations on the great work! I'm going to make another request, please make the next one for (pov blowjob). Nobody has done it yet, and you're the perfect creator for it, the quality of your work is incredible!
Thank you, I have some other stuff in training. I have a feeling there is a wave of good stuff coming shortly, with new trainers being released every day.
great work. sorry but where can i find the workflow please?
Well done for taking on this challenge.. I've been having similar issues to you by the sounds of it. I think there definitely is a censorship issue in comparison to training other models. I tried the abliterated model too.. but I think the issue is deeper than simply using an uncensored model vs censored and has to do with how the model prioritises on a fundamental level. I am currently attempting some caption tricks to confuse it so it can't bias effectively, and forcing early learning to try and override the biases. You can do this by reducing early training rank and increasing linear_alpha..
It's like we have all this potential but can't fully unlock it yet. It will get there. I will try your image/video mix.. as I didn't even think about phase training after I saw their training manual stated mixed training wasn't supported.
How do you train with reduced early training rank and increased linear_alpha ?
Like start with 8/16 +128 ? then do 32 + 32 ?
@blo01 I'm still in the experimental stage, and what results you get would depend on lots of other factors, like frames and how complex the concept is.. but the idea is to take control away from the LLM over which concepts are learnt during the early stages, when fundamental geometry is learnt. High linear_alpha forces the model to rely more on the dataset and less on prelearned concepts, biases, etc., and a lower rank gives these (hopefully) newly learned concepts more strength over the rest of the training and the final LoRA.. You want to focus on the first half of training.. predominantly the first 25%, as that's when critical geometric decisions are made that can't be undone.. and from there, I am scaling linear_alpha up with rank so the ratio stays consistent. As an example of what I'm testing with at the moment..
Phase 1: Rank 16: linear_alpha = 32 - 40
Percentage: 0% - 25% (First quarter of total steps)
Phase 2: Rank 32: linear_alpha = 48 - 56
Percentage: 25% - 50% (Second quarter of total steps)
Phase 3: Rank 64: linear_alpha = 64 - 80
Percentage: 50% - 100% (Final half of total steps)
If you are going to give it a go, I would start conservative and see if it's pushing the model in the right direction. Also be careful with captioning. I would keep captions short and precise, or just use a tag.
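The phased schedule described in the comment above can be written as a simple lookup from training progress to rank and linear_alpha range. This is only a sketch of the commenter's experimental numbers, with hypothetical names (`PHASES`, `phase_for_step`). It does not show how to switch rank mid-run in any particular trainer, which is an open question in this thread.

```python
# (end fraction of total steps, rank, (linear_alpha low, linear_alpha high))
# Values copied from the comment above; labelled experimental by its author.
PHASES = [
    (0.25, 16, (32, 40)),
    (0.50, 32, (48, 56)),
    (1.00, 64, (64, 80)),
]


def phase_for_step(step, total_steps):
    """Return (rank, alpha_range) for a step in [0, total_steps]."""
    fraction = step / total_steps
    for end, rank, alpha_range in PHASES:
        if fraction < end:
            return rank, alpha_range
    # step == total_steps falls through to the final phase
    return PHASES[-1][1], PHASES[-1][2]


if __name__ == "__main__":
    total = 4000
    for step in (0, 999, 1000, 2000, 4000):
        rank, (low, high) = phase_for_step(step, total)
        print(step, rank, low / rank, high / rank)  # step, rank, alpha/rank range
```

Printing the alpha/rank ratio per phase is a quick way to see how the ratio actually moves across these example ranges before committing to a run.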
Yeah, I don't think it's an abliterated text encoder issue either.
I've seen too many blacked-out nipples when testing abliterated and standard text encoders. The one thing the abliterated text encoder allows is for the prompt enhancer to work with NSFW themes. I could be way off here, but it seems like there may be a filter in one of the blocks. It will take someone more technical than I am to find and change that though.
It has fundamental biases that can't be altered.. That's why, for instance, if you feed it a NSFW prompt and ask it to re-word it, it can randomly start "softening" the terminology, reverting back to acceptable language that is native to its training and bias. From what I've researched, the theory that "a censored model isn't censored during training" isn't strictly true. It no longer has the ability to refuse information coming in.. but it can apply its biases to that information when prioritising what to learn.. That's why you can train a LoRA and everything is trained perfectly except "certain areas".. or in some cases they are completely missing.. as it deprioritised them so much they were effectively erased. Closeups are one way to bypass this to a degree, as they give the model fewer options to divert. That's why I'm trying bias-breaking training strategies to confuse and overload it. Fingers crossed..
Guide from the LTX team on how to train your own lora: https://www.youtube.com/watch?v=sL-T6dsO0v4
hm after some testing I can't get good results out of v0.2. It seems to know more camera angles than v0.1, but isn't locked in enough to get it right. In v0.1 I get good results with all kinds of settings and environments, but v0.2 really struggles to produce good human anatomy ... I'm using T2V
Thanks for the feedback. Mind sharing your samplers, steps, and model?
@daring_l I'm using the default comfyUI workflow. fp8 model, euler_ancestral, 20 steps.
Maybe try to caption what the woman is lying on (a bed, the floor, etc.) in the dataset, so it's possible to get more variation there. In the current form it almost always puts a bed and bed sheet under the woman.
Love this Lora, can get some fantastic generations out of it
are you going to make a new one with better audio? it is usually pretty distorted. GREAT lora though.
LTX 2.3 VERSION, please!!! Thanks!!
in progress
@daring_l Thank you so much your model is literally the best LTX 2 model out there! <3 <3 <3
It's out! I hope it works well for you.
@daring_l You are a legend thank you so much!!!! <3 <3 <3 you're putting out the best content out here!!! :D
Details
Files
prone_face_cam_v0_2.safetensors
Mirrors
prone_face_cam_v0_2.safetensors