LTX 2.3 - I2V T2V Video Reasoning lora VBVR

LTX 2.3 - I2V T2V Video Reasoning lora VBVR - v3.0 I2V motion

NSFW

This was the first VBVR lora trained for LTX 2.3

You can make money with my models. Sell what you generate, use it in paid work, run it on paid services, merge it, sell the merge: all of it, no permission needed. Credit is appreciated but never required. New releases might have a short early-access window behind Buzz, then go free, and free is permanent. Once a model is free, I will never re-paywall it or pull it for a paid platform, and you're welcome to mirror, archive, and rehost it so it stays free no matter what happens to any one site.

Feel free to mirror any of my loras to HuggingFace

V4 was trained on Sulphur 2 Base with 7,000 videos, and the rank was increased to 128 to give it more capacity for prompt following. With the distilled 1.1 LoRA at a strength of 0.5 in wan2gp, you can crank steps all the way up to 50, which can improve the quality of gens. That isn't required, just as it wasn't in V3 or below, but it's a nice option if you want to maximize quality. I usually do 10-12 steps. The sample videos are generated using 10eros, since it's the version of Sulphur optimized for i2v.
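To put the rank increase in perspective: in the standard LoRA formulation, each adapted weight matrix gets two low-rank factors, so its trainable parameter count grows linearly with rank. The sketch below is a rough illustration under that assumption. The layer width of 4096 is a made-up placeholder, not LTX's real dimension, and `lora_param_count` is a hypothetical helper.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


# Placeholder width for illustration only; not taken from the LTX architecture.
HYPOTHETICAL_WIDTH = 4096

rank_32 = lora_param_count(HYPOTHETICAL_WIDTH, HYPOTHETICAL_WIDTH, 32)
rank_128 = lora_param_count(HYPOTHETICAL_WIDTH, HYPOTHETICAL_WIDTH, 128)

# Going from rank 32 to rank 128 quadruples the trainable parameters per layer.
print(rank_128 // rank_32)  # 4
```

The ratio does not depend on the placeholder width, which is why the exact layer size does not matter for this comparison.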

Wall of text with a bunch of info below

What changed in V3: Attention-only layers. The feedforward layers have been stripped, leaving only the attention weights. It seems like the prompt following and reasoning behavior most likely live in the attention layers, while the feedforward layers were potentially interfering with natural motion, likely by over-learning features like textures and style from the training data.

Better motion, smaller file size.
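Stripping the feedforward layers can be pictured as filtering the LoRA's weight dictionary by key name. This is only a sketch of the idea, not the script used for V3: the key names and the `"attn"` marker are assumptions, and a real LTX LoRA may name its layers differently.

```python
from typing import Dict, Iterable

# Hypothetical marker; check the real key names in the file before relying on it.
ATTENTION_MARKERS = ("attn",)


def keep_attention_only(state_dict: Dict[str, object],
                        markers: Iterable[str] = ATTENTION_MARKERS) -> Dict[str, object]:
    """Return a new dict holding only the entries whose key contains a marker."""
    markers = tuple(markers)
    return {key: value for key, value in state_dict.items()
            if any(marker in key for marker in markers)}


# Toy stand-in for a loaded LoRA; the values would be tensors in practice.
example = {
    "blocks.0.attn1.to_q.lora_A.weight": "tensor",
    "blocks.0.attn1.to_q.lora_B.weight": "tensor",
    "blocks.0.ff.net.0.proj.lora_A.weight": "tensor",
    "blocks.0.ff.net.0.proj.lora_B.weight": "tensor",
}

attention_only = keep_attention_only(example)
print(sorted(attention_only))
```

Dropping entries this way is also why the file gets smaller: fewer tensors are saved.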

A LoRA that improves prompt following, temporal consistency, and motion "precision" for LTX 2.3. It reduces the floaty, drifty motion that LTX tends to add to scenes. Things that should move, move with purpose. Things that shouldn't move, move less. It also works on non-NSFW, non-furry, realistic, and animated content, and it responds well to detailed prompts.

In ComfyUI or wan2gp, lowering image strength to 0.85 can give you more motion.

Feedback and A-B comparisons welcome. V2 and V1 were trained on 4800 videos.

Recommended strength is 0.7-1.0, but experiment to find what works best for your setup. If you want stronger prompt adherence, try 1.5-2.0. The only side effect I have noticed at high strength is the video looking like it's 16 fps. I have not seen that choppiness on i2v unless the LoRA is cranked to 1.5-2.0, so it may be a t2v thing.
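For intuition on what the strength number does: in the usual LoRA formulation, strength is a scalar multiplier on the learned low-rank weight delta before it is added to the base weight. The sketch below assumes that formulation and leaves out the alpha/rank scaling that trainers often fold in. How ComfyUI or wan2gp apply it internally is not shown here, and `matmul` and `apply_lora` are hypothetical helpers working on tiny toy matrices.

```python
from typing import List

Matrix = List[List[float]]


def matmul(left: Matrix, right: Matrix) -> Matrix:
    """Plain matrix product for small nested lists."""
    inner = len(right)
    cols = len(right[0])
    return [[sum(row[k] * right[k][j] for k in range(inner)) for j in range(cols)]
            for row in left]


def apply_lora(base: Matrix, lora_b: Matrix, lora_a: Matrix, strength: float) -> Matrix:
    """Return base + strength * (lora_b @ lora_a)."""
    delta = matmul(lora_b, lora_a)
    return [[w + strength * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base, delta)]


# Toy rank-1 example: a 2x2 base weight with a (2x1) @ (1x2) update.
base = [[0.0, 0.0], [0.0, 0.0]]
lora_b = [[1.0], [2.0]]
lora_a = [[0.5, 0.0]]

for strength in (0.7, 1.0, 2.0):
    print(strength, apply_lora(base, lora_b, lora_a, strength))
```

At strength 2.0 the update is twice as large as at 1.0, which fits the observation that high strengths push the behavior harder and can bring side effects.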

Prompting tips in non-NSFW terms so it's less confusing; just adapt them to NSFW:

Be specific and literal. Describe what happens, in what order, step by step.

Instead of "a ball bouncing around" → "A red ball moves to the right, bounces off the wall, and returns to the center"

Instead of "fluid pouring" → "Water flows from the left container through the connecting tube into the right container until both levels are equal"

Describe the starting state, the action, and the end state

The LoRA follows prompts more literally than base LTX — precise prompts will give much better results
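The tips above (starting state, ordered actions, end state) can be turned into a small prompt template. This is just one way to structure it, assuming plain English sentences; `build_prompt` is a hypothetical helper, not part of any LTX tooling.

```python
from typing import List


def _sentence(text: str) -> str:
    """Trim, capitalize the first letter, and end with a single period."""
    text = text.strip().rstrip(".")
    return text[:1].upper() + text[1:] + "."


def build_prompt(start_state: str, steps: List[str], end_state: str) -> str:
    """Join a starting state, ordered action steps, and an end state into one prompt."""
    if not steps:
        raise ValueError("describe at least one action step")
    action = ", then ".join(step.strip().rstrip(".") for step in steps)
    return " ".join([_sentence(start_state), _sentence(action), _sentence(end_state)])


prompt = build_prompt(
    "a red ball sits at the center",
    ["the ball moves to the right", "bounces off the wall", "returns to the center"],
    "the ball is at rest in the center",
)
print(prompt)
```

Keeping the steps in a list makes the order explicit, which is the point of describing what happens step by step.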

How was it made?

v0.1 and v0.2 were trained on 360 videos from the VBVR (Very Big Video Reasoning) dataset: synthetic task videos where every motion is precise and intentional. No concept bleed, no style change, just tighter control.

Based on the paper "A Very Big Video Reasoning Suite", which demonstrated this approach on Wan 2.2. I noticed that LoRA helped prompt following and temporal consistency a ton with Wan, so I am training this version for LTX.

What does it actually do?

Prompt following is more faithful — the model does more of what you asked instead of improvising

Motion is more deliberate and less erratic

Reduces random drift and wobble in scenes

Temporal consistency improved — actions follow logical sequences

What it doesn't do:

Doesn't change visual style

Doesn't add or remove capabilities LTX doesn't already have

Not a motion LoRA; it stacks with motion LoRAs

Training details for v0.1 and v0.2 (if you give a shit)

Rank 32

360 VBVR synthetic videos at 512x512, 81 frames (a lot less than 1 million, but still a shitload to train on; this is very slow to train locally)

LR 1e-4, adamw8bit

Training details for V1

Training videos were increased to 4800

Resolution is the same but frames were increased to 121

Every other setting the same as v0.1 and v0.2

More training data from the VBVR dataset was added to v1
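The two sets of training details can be written down as a base config plus overrides. The field names below are illustrative and do not follow any particular trainer's schema; only the values come from the lists above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrainConfig:
    """Illustrative container; field names are not a real trainer's schema."""
    rank: int
    num_videos: int
    width: int
    height: int
    frames: int
    learning_rate: float
    optimizer: str


# v0.1 / v0.2 settings as listed above.
V0 = TrainConfig(rank=32, num_videos=360, width=512, height=512,
                 frames=81, learning_rate=1e-4, optimizer="adamw8bit")

# V1 changes only the video count and the frame count.
V1 = replace(V0, num_videos=4800, frames=121)


def total_frames(cfg: TrainConfig) -> int:
    """Rough measure of dataset size: videos times frames per video."""
    return cfg.num_videos * cfg.frames


print(total_frames(V0), total_frames(V1))
```

Counting total frames gives a rough sense of how much more data V1 saw, since both the number of videos and the clip length went up.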

Below is the composition of the new dataset I trained on, if you're curious

Tier 1 — Physics and Motion (3,400 samples)

Core generators at 300 each: `G-11` (object reappearance) has a shape move off-screen in a direction and return along the same path — teaches trajectory and object persistence. `G-25` (separate object spinning) is a shape that rotates in place then translates horizontally to a target position — multi-step motion sequencing. `G-33` (visual jenga) is a stack of objects that get removed one by one from top to bottom — sequential extraction with implicit physics ordering. `O-29` (ballcolor) is ball tracking tasks with color — motion following plus identity preservation. `O-52` (traffic light) is discrete state transitions, lights switching on/off between green and gray — teaches the model that state changes are crisp, not gradual. `O-75` (communicating vessels) is fluid equalizing between connected tubes based on pressure — continuous physics simulation over time. `O-87` (fluid diffusion) is ink spreading in water — another continuous physical transformation but with expansion rather than equalization.

New additions at 250 each: `G-35` (hit target after bounce) is a ball with an initial direction that bounces off walls following reflection laws to hit a target — pure trajectory prediction with physics constraints. `O-30` (bookshelf) is book rearrangement on shelves — the specific task VBVR highlighted where their model beat Sora 2.

Multi-step transforms at 160 each: `O-7` (shape color change) is a single transformation — shape changes from one color to another. `O-8` (shape rotation) is a shape rotating by a specific angle. `O-13` (outline then move) is two sequential steps: change a shape's outline style, then move it to a new position. `O-14` (scale then outline) is also two steps: scale a shape up or down, then change its outline. These four together teach the model that instructions are ordered and each step completes before the next begins.

Tier 2 — Spatial and Reasoning (1,420 samples)

Proven generators at 100 each: `G-13` (grid number sequence) is filling in number patterns on a grid. `G-17` (grid avoid red block) is pathfinding on a grid while avoiding obstacles. `G-31` (directed graph navigation) is finding the shortest path through a directed graph. `G-41` (grid highest cost) is evaluating spatial values on a grid to find the optimal path. `O-24` (domino chain) is a sequential cascade where dominoes fall until they hit a gap — teaches causal chains and stopping conditions. `O-34` (dot to dot) is connecting numbered dots in sequence — ordered drawing. `O-47` (sliding puzzle) is tile rearrangement under constraints, like a 15-puzzle. `O-83` (planar warp) is warping a grid to align with a target quadrilateral — geometric transformation.

New reasoning diversity at 130 each: `O-1` (color mixing) is RGB additive mixing where two light sources combine and the result fills a target zone — rule-based continuous process. `O-33` (counting objects) is exactly what it sounds like — count things correctly. `G-3` (stable sort) is arranging objects by a rule while preserving relative order. `G-37` (symmetry random) is completing a pattern by mirroring across an axis. `O-21` (construction blueprint) is fitting a correct puzzle piece into a gap in a structure. `G-44` (BFS) is breadth-first search traversal of a graph — systematic layer-by-layer exploration.

The overall dataset is weighted roughly 70/30 toward physical motion and transformation tasks over abstract spatial reasoning. All of these are taken from the VBVR dataset; I am not the creator of the dataset. I'm pretty new to LoRA training, so if you have tips, let me know.
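As a quick sanity check, the per-generator counts listed above can be tallied. Summed as written, the groups give 3,240 for Tier 1 and 1,580 for Tier 2, which does not match the tier headers (3,400 and 1,420), although both add up to the same grand total of 4,820. The page does not say which split is the intended one, so the sketch below only reports both.

```python
# Generator IDs and per-generator sample counts, copied from the lists above.
GROUPS = {
    "tier1": [
        (["G-11", "G-25", "G-33", "O-29", "O-52", "O-75", "O-87"], 300),
        (["G-35", "O-30"], 250),
        (["O-7", "O-8", "O-13", "O-14"], 160),
    ],
    "tier2": [
        (["G-13", "G-17", "G-31", "G-41", "O-24", "O-34", "O-47", "O-83"], 100),
        (["O-1", "O-33", "G-3", "G-37", "O-21", "G-44"], 130),
    ],
}

# Totals as stated in the tier headers.
STATED = {"tier1": 3400, "tier2": 1420}


def tier_total(groups) -> int:
    """Sum of (number of generators x samples per generator) over all groups."""
    return sum(len(ids) * per_generator for ids, per_generator in groups)


computed = {tier: tier_total(groups) for tier, groups in GROUPS.items()}
grand_total = sum(computed.values())

for tier in GROUPS:
    share = computed[tier] / grand_total
    print(tier, "listed:", computed[tier], "stated:", STATED[tier], f"share: {share:.1%}")
print("grand total:", grand_total)
```

Either split keeps the weighting in the neighborhood of the stated 70/30.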

REMEMBER it's not X, it's Y.

Disclaimer & Terms of Use

This model is provided "AS IS", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, and noninfringement.

IN NO EVENT SHALL MisticRain69 BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH USE OF THIS MODEL OR ANY OUTPUT GENERATED WITH IT.

18+ only. You, the user, are solely responsible for anything you generate with this model and for ensuring your use complies with all applicable laws and the CivitAI Terms of Service.

You may not:

produce anything illegal, non-consensual, defamatory, or used for harassment, deception, or fraud;

create any content depicting, or appearing to depict, minors (real or fictional) in a sexual context — no exceptions, ever;

use this model to depict any real, identifiable person — celebrity, public figure, or private individual — in any context, SFW or NSFW. No likeness use, period.

If you post outputs to the gallery that violate the CivitAI TOS, you'll be blocked and reported. Don't do this shit. Don't be fucking sus.

Description

New version optimized for enhancing motion

Comments (53)

kronos1959777 · Apr 11, 2026

CivitAI

amazing. wish you made her talk with my lora :D nice sample

MisticRain69

Author

Apr 11, 2026

You should try combining them

bennyboy_77 · Apr 11, 2026 · 3 reactions

CivitAI

I've had some degree of success with your previous versions of this lora - but version 3.0 for image to video is a major step forward. A high percentage of my videos normally end up with head reversal moments (like something from The Exorcist) when characters turn around. So far, this version of the lora seems to have completely fixed that issue (running at 0.8 strength alongside other loras). What can I say? Amazing work :-D

RLS_Animation · Apr 14, 2026 · 1 reaction

Thanks for providing the strength. I look forward to trying it out!

gackt2 · Apr 11, 2026

CivitAI

I downloaded a version called
LTX2.3_VBVR_Reasoning_I2V_V2.safetensors
Should this or that one be kept?

MisticRain69

Author

Apr 11, 2026

Try the latest version; it should be an improvement over v2.

TheEllieXXX · Apr 12, 2026 · 3 reactions

CivitAI

This lora is amazing, i tried it on non usual images and it worked perfect, looks like i can animate anything now!

Thx you so much

amriel · Apr 12, 2026

CivitAI

I noticed an issue: in longer clips, v2 and v3 gradually lose contrast and look washed out over time.

v1 doesn’t have this problem. How can I fix it?

phazei · Apr 12, 2026 · 4 reactions

CivitAI

How does this compare to https://huggingface.co/LiconStudio/Ltx2.3-VBVR-lora-I2V/tree/main

snake88 · Apr 13, 2026 · 1 reaction

I have same question, I have a feeling maybe V4 (for I2V, I guess V2 for T2V) should be based on taking what you linked and then using the NSFW training as he did to fine tune it from that point maybe will have stronger generalization.

OTAKKx · Apr 13, 2026 · 6 reactions

CivitAI

This is one I'll definitely wait on... I was using V1, then V2 came around, i was gonna start using V2, and now V3 is here... might as well wait for a V5 before settling down with this lora.

dumdidum · Apr 14, 2026 · 25 reactions

CivitAI

Why the sample video is p*RN and the gallery is filled with p"RN when this is a technical lora ????

MisticRain69

Author

Apr 14, 2026· 7 reactions

It's marked as NSFW and Furry. If you don't want to see NSFW content, disable it in settings.

dumdidum · Apr 14, 2026

@MisticRain69 No worries about the NSFW stuff! I just meant I was looking for examples of non-organic physics,, you know, cubes, spheres, or objects following paths. I'm more interested in the mechanical logic side of the LoRA than the 'anatomical' simulations, haha

MisticRain69

Author

Apr 14, 2026· 3 reactions

@dumdidum Oh, it should handle that equally well, since it's trained on a subset of the VBVR dataset.

skyrimer3d · Apr 16, 2026 · 8 reactions

porn brings the monies

pork69 · May 4, 2026

porn is the future

Kendos_Retreat · Apr 18, 2026

CivitAI

Could you please share a workflow? I cant get mine to look like yours. -_-

MisticRain69

Author

Apr 18, 2026· 2 reactions

Use wan2gp. I stopped using ComfyUI because no matter what I did, LTX looked meh. I swapped to wan2gp and now gens look great. Just download it and drag and drop one of the vids into the spot that says "import settings from video", or something really similar.

Kendos_Retreat · Apr 18, 2026

@MisticRain69 Thanks for the info!

Kendos_Retreat · Apr 18, 2026

@MisticRain69 Where do you get your wan2gp?

freestuffpl0x42069 · Apr 24, 2026

@Kendos_Retreat pinokio is decent if you aren't technical

ang9911530825 · Apr 24, 2026

@MisticRain69 Whaa, I've been using Comfyui-ezi-desktop. I only have a 10 GB GPU with 64 GB of DDR4 RAM, and I keep getting an error on KSampler. I'm still learning, so I need a walkthrough tutorial or a Discord. I'm gonna look into wan2gp.

matriksAi · Apr 27, 2026

@MisticRain69 I use Pinokio and run wan2gp there. Can you please say where to put checkpoints, LoRAs, and audio files so it can generate like in your examples?

ang9911530825 · Apr 24, 2026

CivitAI

I'm new to this, and really want to get this working but have no idea what i'm doing wrong. I've tried looking through reddit or anywhere for help or a walkthrough but nothing. PLZ someone bless me with the power bring giggle physics to i2v

fatberg_slim · Apr 24, 2026 · 3 reactions

CivitAI

Using this really makes a difference. Great work. Thanks for sharing.

lse14 · Apr 26, 2026

CivitAI

Is the audio of the cover video generated directly by LTX or by other programs?

MisticRain69

Author

Apr 26, 2026· 1 reaction

LTX

lse14 · Apr 27, 2026

@MisticRain69 Thanks for the info.

By the way, what is the vram and ram configuration you used to generate the cover video?

When I imported the parameters of your video into wan2gp, the vram and ram kept exploding (my configuration is 4090+128g RAM).

Are there any optimization settings that can make the program work?

MisticRain69

Author

Apr 30, 2026

@lse14 Huh, weird. I use a 3090 and 128 GB of DDR4, and I run the int8 model in wan2gp.

lse14 · May 4, 2026

@MisticRain69 I can now use the gguf Q6 model to work properly, and the generation speed is very fast

heyoai · Apr 28, 2026

CivitAI

Would you consider updating this to the LTX 2.3 1.1 update? thank you for making this <3

MisticRain69

Author

Apr 28, 2026· 1 reaction

The 1.1 update is pretty much just a bugfix for the distilled model. This LoRA works the same on the distilled 1.1 version as it does on the dev and 1.0 distilled versions.

ThatBenderGuy · May 3, 2026

CivitAI

I'm new to LTX 2.3, does this go with the ltx-2.3-22b-distilled-loa-384.1.1 or does it replace it?

MisticRain69

Author

May 3, 2026· 1 reaction

Goes with it.

sirmonstercock7716 · Jun 9, 2026

it goes before it...at least thats how i got it to work..

hatt2 · May 6, 2026 · 4 reactions

CivitAI

i feel like your demo movies aren't doing justice to your hard work and what the usefulness of a reasoning model can be. Perhaps a video of a pool table and showing the balls reacting to being struck like on the hugging face page would be more clear? Just a suggestion.

MisticRain69

Author

May 7, 2026

Which Hugging Face page? I don't have it listed on Hugging Face.

boobkake22 · May 7, 2026

I agree with this. For a utility this broad, having a more "broad audience" example for the "poster" video would be a smart call.

hatt2 · May 8, 2026

@MisticRain69 I just mean in general. The reasoning model has a lot of use cases outside of bouncing breasts. From what I can gather, things like simulating water flow etc etc are potentially possible from this lora.

I learned about it originally here. The hugging face page URL is in there.

https://www.reddit.com/r/comfyui/comments/1sjd4cs/community_members_from_china_have_released_a_new/

wigwoo1 · May 8, 2026

CivitAI

Where can I get the ComfyUI Workflow?

MisticRain69

Author

May 14, 2026

Just linked it.

suschie48199 · Jun 20, 2026

@MisticRain69 "Just linked it"? I need the workflow too, and I don't understand where you linked it. :(

betech79209 · Jun 26, 2026 · 1 reaction

Use the RuneXX workflows on HF and tweak them with the models, encoders and LoRA's. No custom workflows are needed for this. If you can't comprehend the basics of ComfyUI, you might want to stick with paid video services instead.

OrangeJuiceAlien · May 14, 2026 · 1 reaction

CivitAI

can v3 also be used with T2V or is v1 still best for that?

MisticRain69

Author

May 14, 2026

I'm not sure. I'd just go ahead and test them both and see which works better.

jb03031993374 · Jun 1, 2026

CivitAI

Good shit. Does indeed appear to reduce wacky motion errors. Noticeably less warps and broken joints. This doesn't necessarily extend "domain knowledge" if you get what I mean but it seems to play well with other loras that gets you motion in all the special places.

MisticRain69

Author

Jun 11, 2026· 10 reactions

CivitAI

V4 is now out

quentar82 · Jun 12, 2026

working with distilled 1.1? or only with sulphur base?

MisticRain69

Author

Jun 12, 2026

@quentar82 Only tested on sulphur base.

boobkake22 · Jun 12, 2026

Did you find there was particular need for supporting Sulphur specifically?

Agino · Jun 12, 2026

@quentar82 @MisticRain69 I was going to ask the same question since I think distilled follows the prompt better

quentar82 · Jun 12, 2026

@Agino you can still use the v3 for the distilled1.1

LORA

LTXV 2.3

by MisticRain69

Details

Downloads

16,776

Platform

CivitAI

Platform Status

Available

Created

4/11/2026

Updated

7/28/2026

Files

LTX2.3_reasoning_I2V_V3.safetensors

Size:

462.96 MB

SHA256:

150ec404e517562a8e0692f4a9401c740cc0199793dc608d168490ba2d958203
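Since the file is mirrored in many places under different names, it can be worth checking a download against the SHA256 listed above. A minimal sketch using Python's standard `hashlib`; the function names are illustrative, and the path you pass is wherever you saved the file.

```python
import hashlib

# SHA256 of LTX2.3_reasoning_I2V_V3.safetensors as listed on this page.
EXPECTED_SHA256 = "150ec404e517562a8e0692f4a9401c740cc0199793dc608d168490ba2d958203"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """True if the file at path has the expected SHA256 (case-insensitive)."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected("LTX2.3_reasoning_I2V_V3.safetensors")` should return `True` for an intact copy of the V3 file, whatever the mirror named it.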

Mirrors

HuggingFace (59 mirrors)

LTX2.3_reasoning_I2V_V3.safetensors

d865761a-7363-4021-a209-3d2e0c7958d9-LTX2.3_reasoning_I2V_V3.safetensors

LTX2.3_reasoning_I2V_V3.safetensors

LTX2.3_Thinking_V3.safetensors

LTX2.3_reasoning_I2V_V3.safetensors

2497207_LTX2.3_reasoning_I2V_V3.safetensors

LTX2.3_reasoning_I2V_V3.safetensors

LTX23_Reasoning_v3.safetensors

LTX2.3_reasoning_I2V_V3.safetensors

rsnng3.safetensors

LTX2.3_reasoning_I2V_V3.safetensors

4.safetensors

2497207_LTX2.3_reasoning_I2V_V3.safetensors

LTX2.3_reasoning_I2V_V3.safetensors

LTX2.3_VBVR_V3.0.safetensors

LTX2.3_reasoning_I2V_V3.safetensors

I2V T2V Video Reasoning lora v3 - LTX2.3.safetensors

Video Reasoning v3.0 I2V - LTX2.3.safetensors

LTX2.3_reasoning_I2V_V3.safetensors

LTX2.3_reasoning_I2V_V3-mid_2497207-vid_2848299.safetensors

LTX2.3_VBVR_V3.0.safetensors

LTX2.3_reasoning_I2V_V3.safetensors

I2V T2V Video Reasoning lora v3 - LTX2.3.safetensors

Video Reasoning v3.0 I2V - LTX2.3.safetensors

ltx23-vbvr-v30.safetensors

ltx-2.3-i2v-t2v-video-reasoning-lora-vbvr.safetensors

Video Reasoning v3.0 I2V - LTX2.3.safetensors

I2V T2V Video Reasoning lora v3 - LTX2.3.safetensors

LTX2.3_reasoning_I2V_V3.safetensors

LTX2.3_reasoning_I2V_V3-mid_2497207-vid_2848299.safetensors

LTX2.3_reasoning_I2V_V3.safetensors

LTX2.3_VBVR_V3.0.safetensors

LTX2.3_reasoning_I2V_V3.safetensors

CivitAI (1 mirrors)

LTX2.3_reasoning_I2V_V3.safetensors

ModelScope CN (1 mirrors)

LTX2.3_reasoning_I2V_V3.safetensors
