Update, April 14th 2026: Lightricks has updated their LTX 2.3 distilled model (and its Lora) to 1.1:
Model (1.1 fp8_scaled by Kijai): https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/diffusion_models
Distilled Lora 1.1: https://huggingface.co/Lightricks/LTX-2.3/tree/main
V2.5 LTX-2.3 DEV & Distilled Video with Audio
An Image to Video and a Text to Video workflow; both can use your own prompts or Ollama-generated/enhanced prompts.
Works with the latest LTX 2.3 Distilled model (8 steps, CFG=1) or the Dev model (20 steps, CFG=3).
Updated the processing for the DISTILLED and DEV models: select the DIST or DEV model in the loader node and switch to the dedicated DIST or DEV processing pipeline, so each model has its own processing.
DIST model pipeline: Standard Guider and Basic Scheduler, follows the manual sigmas issued by Lightricks
DEV model pipeline: MultiModal Guider and LTX Scheduler + Distilled Lora on latent upscaler
Included a workflow version with the "RTX Video Super Resolution" node, which upscales videos at high speed.
Tip: With the latest Comfy and LTX updates, processing got faster for me, so I can increase scale_by in the sampler node from 0.5 to 0.6 or higher for crisper videos with only a minor impact on render time.
V2.3 LTX-2.3 DEV & Distilled Video with Audio
Downloads for LTX 2.3:
LTX-2.3 Distilled & Dev Models (fp8_scaled): https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/diffusion_models
Textencoder1 (fp8_e4m3fn, same as LTX-2): https://huggingface.co/GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn/tree/main
Textencoder2 (projection_bf16): https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/text_encoders
Video & Audio Vae: https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/vae
Loras:
Spatial upscaler (x2-1.1): https://huggingface.co/Lightricks/LTX-2.3/tree/main
Distilled Lora for upscaler (lora.384): https://huggingface.co/Lightricks/LTX-2.3/tree/main
Smaller, alternative Distilled Lora by Kijai: https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/loras
Detailer Lora (same as LTX-2): https://huggingface.co/Lightricks/LTX-2-19b-IC-LoRA-Detailer/tree/main
Ollama Model (prompt only, fast): https://ollama.com/mirage335/Llama-3-NeuralDaredevil-8B-abliterated-virtuoso
Alternative model with Vision (reads input image + prompt, slower): https://ollama.com/huihui_ai/qwen3-vl-abliterated
Other model with Vision (great for I2V): https://ollama.com/huihui_ai/qwen3.5-abliterated
Smaller LTX 2.3 GGUF Dev or Distilled models work as well; replace the Checkpoint Loader node with the Unet Loader node from this custom node pack: https://github.com/city96/ComfyUI-GGUF (see the download sketch after this list):
models: https://huggingface.co/unsloth/LTX-2.3-GGUF/tree/main
save to models/unet/
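If you prefer scripting the download, here is a minimal Python sketch using huggingface_hub; it lists the available .gguf quantizations instead of assuming a specific filename (repo ID from the link above, target folder per the note above):

# minimal sketch, assuming huggingface_hub is installed (pip install huggingface_hub)
from huggingface_hub import hf_hub_download, list_repo_files

repo = "unsloth/LTX-2.3-GGUF"
ggufs = [f for f in list_repo_files(repo) if f.endswith(".gguf")]
print("\n".join(ggufs))  # pick the quantization that fits your VRAM

# download the chosen file into ComfyUI's unet folder
hf_hub_download(repo_id=repo, filename=ggufs[0], local_dir="ComfyUI/models/unet")  # replace ggufs[0] with your pick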
V1.5 LTX-2 DEV Video with Audio including latest Multimodal Guider
An Image to Video and a Text to Video workflow; both can use your own prompts or Ollama-generated/enhanced prompts.
Replaced the Guider node with the latest Multimodal Guider node; see more details in the WF notes or here: https://ltx.io/model/model-blog/ltx-2-better-control-for-real-workflows. Before, we had one CFG parameter for both audio and video. With the Multimodal Guider, we can now tweak audio and video separately, with even more parameters...
Added a Power Lora Loader node to inject further Loras.
Use the Image to Video Adapter Lora to improve motion for I2V: https://huggingface.co/MachineDelusions/LTX-2_Image2Video_Adapter_LoRa/tree/main
Replaced a node so the workflow no longer requires the comfymath custom nodes.
V1.0 LTX-2 DEV Video with Audio:
An Image to Video and a Text to Video workflow with your own prompts or Ollama-generated/enhanced prompts.
Set up for the LTX-2 Dev model.
Uses the Detailer Lora for better quality and the LTX tiled VAE to avoid OOM errors and visible grid artifacts.
Two-pass rendering (motion + upscale); the upscale pass uses the distilled and spatial upscaler Loras.
Set up with the latest LTXVNormalizingSampler to increase video & audio quality.
Text to Video can use dynamic prompts with wildcards.
Download LTX-2 files (Workflow V1.0 and V1.5 only):
Find the Model/Lora Loader nodes within the Sampler Subgraph node.
- LTX2 Dev Model (dev_Fp8): https://huggingface.co/Lightricks/LTX-2/tree/main
- Detailer Lora: https://huggingface.co/Lightricks/LTX-2-19b-IC-LoRA-Detailer/tree/main
- Distilled (lora-384) & Spatial upscaler Lora: https://huggingface.co/Lightricks/LTX-2/tree/main
- VAE (already included in above dev_FP8 model, but needed if you go for GGUF models): https://huggingface.co/Lightricks/LTX-2/tree/main/vae
- Textencoder (fp8_e4m3fn): https://huggingface.co/GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn/tree/main
- Image to Video Adapter Lora (more motion with I2V): https://huggingface.co/MachineDelusions/LTX-2_Image2Video_Adapter_LoRa/tree/main
Save Location:
📁 ComfyUI/
└── 📁 models/
    ├── 📁 checkpoints/
    │   └── ltx-2-19b-dev-fp8.safetensors
    ├── 📁 text_encoders/
    │   └── gemma_3_12B_it_fp8_e4m3fn.safetensors
    ├── 📁 loras/
    │   └── ltx-2-19b-distilled-lora-384.safetensors
    ├── 📁 latent_upscale_models/
    │   └── ltx-2-spatial-upscaler-x2-1.0.safetensors
    └── 📁 Clip/
        └── ltx-2.3_text_projection_bf16.safetensors
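To fetch these with Python instead of the browser, a minimal sketch using huggingface_hub (repo IDs are from the download list above; the exact filenames are assumptions matching the tree, so verify them on the repo pages):

# minimal sketch, assuming huggingface_hub is installed (pip install huggingface_hub)
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="Lightricks/LTX-2",
    filename="ltx-2-19b-dev-fp8.safetensors",  # assumed filename, check the repo
    local_dir="ComfyUI/models/checkpoints",
)
hf_hub_download(
    repo_id="GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn",
    filename="gemma_3_12B_it_fp8_e4m3fn.safetensors",  # assumed filename
    local_dir="ComfyUI/models/text_encoders",
)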
Custom Nodes used:
https://github.com/Comfy-Org/Nvidia_RTX_Nodes_ComfyUI (RTX VSR Version)
Text 2 Video only:
Ollama help:
Install Ollama from https://ollama.com/
Download a model: go to a model page, choose a model, then hit the copy button, e.g. https://ollama.com/huihui_ai/qwen3-vl-abliterated
Open a terminal and paste the model name, e.g.: ollama run huihui_ai/qwen3-vl-abliterated
The model will be downloaded and can then be selected in the green Comfy node "Ollama Connectivity". Hit "Reconnect" to refresh (a quick connectivity check is sketched below).
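If the node doesn't list your model after "Reconnect", you can verify that the Ollama server is running and the model is installed; a minimal Python sketch against Ollama's standard REST endpoint, assuming the default port 11434:

# minimal sketch: list installed models via Ollama's /api/tags endpoint
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
print([m["name"] for m in resp.json()["models"]])  # your pulled model should appear here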
Example longer Video
Description
LTX-2.3 Text to Video and Image to Video with Ollama or your own prompts, for the Dev and Distilled models.
Versions with the new RTX Video Super Resolution node are included.
Comments
I just recovered from a "break everything" update on March 7... do I really need to update again to use this workflow, or is that update date OK?
Should work without updating Comfy if your last update was March 7th. The RTX workflow might require a Comfy update though. I usually have a second Comfy install with no models, just to test Comfy updates before updating my main installation.
@tremolo28 Got it, will test it without updating, but if needed, I'll just have to take the plunge. Thanks!
Howsit bruv, thanks for the workflow. I'm new to LTX. You say the audio VAE is embedded in the checkpoint. What if I want to switch to GGUF? Do all GGUFs in the link you shared (https://huggingface.co/unsloth/LTX-2.3-GGUF/tree/main) have the same audio VAE properties?
Hi there, if you want to use an LTX-2.3 GGUF model, the only thing you need to do is download it and replace the Checkpoint Loader node with a Unet Loader node and let it load the GGUF model; everything else (VAE, text encoder, Loras, etc.) remains the same.
The VAE is only embedded in the LTX-2 version; for the LTX-2.3 version, everything is separate, as per the download list.
I am having a crash when I run the workflow. It happens when I reach the 'LTX Audio Text Encoder Loader' that has the 'gemma_3_12B_it_fp8_e4m3fn.safetensors' model loaded into it. Is the model perhaps too big? I'm on an RTX 5060 Ti with 16 GB VRAM and 32 GB RAM.
It's supposed to work with 16 GB VRAM. With an RTX 5 series card you can probably use a smaller fp4 Gemma model; I think the Comfy LTX template shows a download link for that. But I doubt that is the real issue. What does Comfy say when crashing?
@tremolo28 It is one of those crashes that just happens with no pop-up window, just that red "Reconnecting" box in the top right-hand corner. I should point out, however, that the T2V workflow works perfectly. It's the I2V workflows that can't make it past that node.
If you are interested, these are the last few lines that show in the terminal:
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
no CLIP/text encoder weights in checkpoint, the text encoder model will not be loaded.
Requested to load VideoVAE
loaded completely; 8073.80 MB usable, 2331.69 MB loaded, full load: True
C:\ComfyUI_windows_portable>pause
Press any key to continue . . .
@RandoWando The T2V workflow uses the same text encoder loader as the I2V workflow. Maybe compare and double-check that the node shows both gemma and text_projection set. Try right-click and reload the node.
Wow... Excellent workflow. Would it be possible to simplify the image sizing? I've seen some other workflows that incorporate a picker that lets you select rough image dimensions.
That RTX superscale thingy is somewhat subtle, but it's hard to go without once you've tried it... well, I guess my video gens will just take more time to finish from now on :)
Initially, I made negative comments about the workflow out of prejudice and an incorrect setup. However, your helpful and constructive approach to the technical aspects truly elevated my setup, and I thank you: this workflow delivers, with unexpected consistency and speed, the performance I couldn't get from other workflows.
Hi, you can see from the examples posted by me and other users that those do not look terrible or disastrous. You can even download a clip and throw it into Comfy, as the clips contain the workflow, prompt, and settings in the metadata. If you get terrible results with it, your setup might have an issue…
@tremolo28 What settings and model would you recommend for vertical video production and facial consistency?
@BocekAdam You can choose between the distilled and dev models; both have pros and cons. Dist is faster and delivers better image quality, but lacks a little variation. For me, Dist wins due to speed, and I am OK with its quality; almost 95% of the clips I posted are with the Dist model.
The Dev model is more or less the opposite. Both have some issues with facial consistency; I think there are Loras that can help with that.
@tremolo28 Thank you for the quick reply; I will try your suggestions, it really does produce results quickly. However, I'm having a problem at this stage. For example, when I enter a vertical format (width 760, height 1280) in a single pass without upscaling, or try other vertical inputs, the video comes out horizontal no matter what I enter. Upscaling seems to solve this; it seems to be fixed once it enters the LTX part. Is anything special required for this? Is this normal for results from a single pass without upscaling? Is it a factor that affects quality even with upscaling enabled? I'm curious; I generally want to render my videos vertically.
@BocekAdam If you use the workflow with RTX Video Super Resolution, you also need to set the final resolution to the desired width/height there; it is the black node on the right, above the final output.
You can also switch RTX to "resize by multiplier", e.g. 2. With your 760x1280 you then get double the resolution as output.
Edit: The render process works with two samplers in sequence (see the worked resolution example after this list):
1. Sampler 1 creates the motion; the scale_by factor defines the resolution the render is done at. 1 = the full resolution you have set in width/height (very slow); scale_by 0.5 renders at half resolution (the default).
2. Sampler 2 upscales the video (always x2) in 3 steps with the distilled Lora. It is a spatial upscaler that has much more information to work with, since it scales in latent space, unlike the usual upscalers that just upscale pixels.
3. RTX VSR upscales the video to the final resolution very quickly.
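To make the resolution math concrete, here is a minimal Python sketch of the 760x1280 example above (variable names are illustrative, not actual node names):

width, height = 760, 1280   # target resolution set in the workflow
scale_by = 0.5              # default: sampler 1 renders at half resolution

# pass 1: motion pass at reduced resolution
pass1_w, pass1_h = int(width * scale_by), int(height * scale_by)   # 380 x 640
# pass 2: latent spatial upscaler, always x2
pass2_w, pass2_h = pass1_w * 2, pass1_h * 2                        # 760 x 1280
# pass 3: RTX VSR then resizes to whatever final resolution/multiplier you set
print(f"motion pass: {pass1_w}x{pass1_h}, after upscale: {pass2_w}x{pass2_h}")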
How do I use the Ollama prompt generator in v2.5? I can't see it.
You enter a prompt and it is sent to Ollama; that's the group on the left, labeled "Ollama".
There is a switch to toggle Ollama off, so you can run your own prompt without it; it is on by default.
I noticed that any loras I load in the power loader after the detailer on first pass seem to be ignored. If I disable the main process detailer, those loras start working.
Hey, I did not test a lot of Loras with the Power Lora Loader, which is set after the detailer Lora. You could try loading the detailer in the Power Lora Loader, after your Lora. If this still doesn't work, then the detailer (which is an LTX-2 Lora) might not like additional Loras...
A workaround, as you say, would be to switch the detailer off in the main process when using other Loras, or maybe lower the Lora's strength.
Meanwhile, I tested some Loras, and it seems the detailer Loras need to be off or at lower strength (<0.5) on both the main process and the upscaler when using Loras in the Power Lora Loader node.
Also, the upscaler process requires Lora processing too for some Loras, which is currently not set up. The easiest fix is to add another Power Lora Loader node after the detailer Lora for the upscaler and connect it with the sampler subgraph.
Amazing to see a workflow work so well without significant tinkering! A+
I will note that to fix RTX I had to follow this: https://github.com/Comfy-Org/Nvidia_RTX_Nodes_ComfyUI/issues/11
How do I add a video generation preview to your process? Or maybe you will add this feature in the next version; it is convenient to watch the current progress.
Hi, good point. I think Comfy will be updated to have a proper LTX 2.3 preview.
Until then, here is a workaround:
From Kijai's nodes, use "LTX2 Sampling Preview Override" and hook it up between the Power Lora Loader output and the Guider input (within the Dev or Distilled pipeline on the left), i.e. connect model in/out. It will then extend the sampler subgraph with the preview; you might need to move the Prompt node below out of the way.
You can add this VAE to make the preview less pixelated: https://github.com/madebyollin/taehv/blob/main/safetensors/taeltx2_3.safetensors
Here is a screenshot: https://github.com/kijai/ComfyUI-KJNodes/issues/566#issuecomment-4016594336
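For convenience, a minimal Python sketch for fetching that VAE file, assuming the standard GitHub raw-URL form of the blob link above; the target folder is an assumption (ComfyUI keeps approximate preview decoders in models/vae_approx, but verify where the KJ node expects it):

# minimal sketch: download the preview VAE (target folder is an assumption)
import requests

url = ("https://raw.githubusercontent.com/madebyollin/taehv/"
       "main/safetensors/taeltx2_3.safetensors")
dest = "ComfyUI/models/vae_approx/taeltx2_3.safetensors"  # assumed target folder

resp = requests.get(url, timeout=120)
resp.raise_for_status()
with open(dest, "wb") as f:
    f.write(resp.content)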
@tremolo28 thanks
The spatial upscaler has been updated to version 1.1, which is supposed to prevent a logo appearing at the end of longer clips and reduce flickering:
https://huggingface.co/Lightricks/LTX-2.3/tree/main
ltx-2.3-spatial-upscaler-x2-1.1.safetensors
great improvement vs v1.0
I was wondering where that end of video logo stuff was happening, thanks for the info
I'm really impressed with 2.3. I'm getting awesome results straight out of the box. Baked-in audio and sync is just awesome. There is some new architecture to learn, but it's not as annoying as I thought it would be. The only thing I had to add myself was a TTS voice changer node with an engine and character loader. Thanks for sharing; I usually build from scratch when new models come out, but this one had me scratching my head for a bit. Definitely worth checking out, everyone.