Update, April 14th 2026: Lightricks has updated their LTX 2.3 distilled model (and its Lora) to 1.1:
Model (1.1 fp8_scaled by Kijai): https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/diffusion_models
Distilled Lora 1.1: https://huggingface.co/Lightricks/LTX-2.3/tree/main
V2.5 LTX-2.3 DEV & Distilled Video with Audio
An Image to Video and a Text to Video workflow; both can use your own prompts or Ollama-generated/enhanced prompts.
Works with the latest LTX 2.3 Distilled model (8 steps, CFG=1) or the Dev model (20 steps, CFG=3).
Updated the processing for the DISTILLED and DEV models: select the DIST or DEV model in the loader node and switch to the dedicated DIST or DEV processing pipeline, so each model has its own processing.
DIST model pipeline: Standard Guider and Basic Scheduler, follows the manual sigmas issued by Lightricks
DEV model pipeline: MultiModal Guider and LTX Scheduler + Distilled Lora on latent upscaler
Included a workflow version with the "RTX Video Super Resolution" node, which upscales videos at high speed.
Tip: With the latest Comfy and LTX updates, processing got faster for me, so I can increase scale_by in the sampler node from 0.5 to 0.6 or higher for crisper videos with only a minor impact on render time.
V2.3 LTX-2.3 DEV & Distilled Video with Audio
Downloads for LTX 2.3:
LTX-2.3 Distilled & Dev Models (fp8_scaled): https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/diffusion_models
Textencoder1 (fp8_e4m3fn, same as LTX-2): https://huggingface.co/GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn/tree/main
Textencoder2 (projection_bf16): https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/text_encoders
Video & Audio Vae: https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/vae
Loras:
Spatial upscaler (x2-1.1): https://huggingface.co/Lightricks/LTX-2.3/tree/main
Distilled Lora for upscaler (lora.384): https://huggingface.co/Lightricks/LTX-2.3/tree/main
Smaller, alternative Distilled Lora by Kijai: https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/loras
Detailer Lora (same as LTX-2): https://huggingface.co/Lightricks/LTX-2-19b-IC-LoRA-Detailer/tree/main
Ollama Model (prompt only, fast): https://ollama.com/mirage335/Llama-3-NeuralDaredevil-8B-abliterated-virtuoso
Alternative model with Vision (reads input image + prompt, slower): https://ollama.com/huihui_ai/qwen3-vl-abliterated
Other model with Vision (great for I2V): https://ollama.com/huihui_ai/qwen3.5-abliterated
Smaller LTX 2.3 GGUF Dev or Distilled models work as well; replace the Checkpoint Loader node with the Unet Loader node from this custom node pack: https://github.com/city96/ComfyUI-GGUF (see the download sketch after this list):
models: https://huggingface.co/unsloth/LTX-2.3-GGUF/tree/main
save to models/unet/
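If you prefer scripting the download, here is a minimal Python sketch using huggingface_hub; it lists the available .gguf quantizations instead of assuming a specific filename (repo ID from the link above, target folder per the note above):

# minimal sketch, assuming huggingface_hub is installed (pip install huggingface_hub)
from huggingface_hub import hf_hub_download, list_repo_files

repo = "unsloth/LTX-2.3-GGUF"
ggufs = [f for f in list_repo_files(repo) if f.endswith(".gguf")]
print("\n".join(ggufs))  # pick the quantization that fits your VRAM

# download the chosen file into ComfyUI's unet folder
hf_hub_download(repo_id=repo, filename=ggufs[0], local_dir="ComfyUI/models/unet")  # replace ggufs[0] with your pick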
V1.5 LTX-2 DEV Video with Audio including latest Multimodal Guider
An Image to Video and a Text to Video workflow; both can use your own prompts or Ollama-generated/enhanced prompts.
Replaced the Guider node with the latest Multimodal Guider node; see more details in the WF notes or here: https://ltx.io/model/model-blog/ltx-2-better-control-for-real-workflows. Before, we had one CFG parameter for both audio and video. With the Multimodal Guider, we can now tweak audio and video separately, with even more parameters...
Added a Power Lora Loader node to inject further Loras.
Use the Image to Video Adapter Lora to improve motion for I2V: https://huggingface.co/MachineDelusions/LTX-2_Image2Video_Adapter_LoRa/tree/main
Replaced a node so the workflow no longer requires the comfymath custom nodes.
V1.0 LTX-2 DEV Video with Audio:
An Image to Video and a Text to Video workflow with your own prompts or Ollama-generated/enhanced prompts.
Set up for the LTX-2 Dev model.
Uses the Detailer Lora for better quality and the LTX tiled VAE to avoid OOM errors and visible grid artifacts.
Two-pass rendering (motion + upscale); the upscale pass uses the distilled and spatial upscaler Loras.
Set up with the latest LTXVNormalizingSampler to increase video & audio quality.
Text to Video can use dynamic prompts with wildcards.
Download LTX-2 files (Workflow V1.0 and V1.5 only):
Find the Model/Lora Loader nodes within the Sampler Subgraph node.
- LTX2 Dev Model (dev_Fp8): https://huggingface.co/Lightricks/LTX-2/tree/main
- Detailer Lora: https://huggingface.co/Lightricks/LTX-2-19b-IC-LoRA-Detailer/tree/main
- Distilled (lora-384) & Spatial upscaler Lora: https://huggingface.co/Lightricks/LTX-2/tree/main
- VAE (already included in above dev_FP8 model, but needed if you go for GGUF models): https://huggingface.co/Lightricks/LTX-2/tree/main/vae
- Textencoder (fp8_e4m3fn): https://huggingface.co/GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn/tree/main
- Image to Video Adapter Lora (more motion with I2V): https://huggingface.co/MachineDelusions/LTX-2_Image2Video_Adapter_LoRa/tree/main
Save Location:
📁 ComfyUI/
└── 📁 models/
    ├── 📁 checkpoints/
    │   └── ltx-2-19b-dev-fp8.safetensors
    ├── 📁 text_encoders/
    │   └── gemma_3_12B_it_fp8_e4m3fn.safetensors
    ├── 📁 loras/
    │   └── ltx-2-19b-distilled-lora-384.safetensors
    ├── 📁 latent_upscale_models/
    │   └── ltx-2-spatial-upscaler-x2-1.0.safetensors
    └── 📁 Clip/
        └── ltx-2.3_text_projection_bf16.safetensors
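To fetch these with Python instead of the browser, a minimal sketch using huggingface_hub (repo IDs are from the download list above; the exact filenames are assumptions matching the tree, so verify them on the repo pages):

# minimal sketch, assuming huggingface_hub is installed (pip install huggingface_hub)
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="Lightricks/LTX-2",
    filename="ltx-2-19b-dev-fp8.safetensors",  # assumed filename, check the repo
    local_dir="ComfyUI/models/checkpoints",
)
hf_hub_download(
    repo_id="GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn",
    filename="gemma_3_12B_it_fp8_e4m3fn.safetensors",  # assumed filename
    local_dir="ComfyUI/models/text_encoders",
)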
Custom Nodes used:
https://github.com/Comfy-Org/Nvidia_RTX_Nodes_ComfyUI (RTX VSR Version)
Text 2 Video only:
Ollama help:
Install Ollama from https://ollama.com/
Download a model: go to a model page, choose a model, then hit the copy button, e.g. https://ollama.com/huihui_ai/qwen3-vl-abliterated
Open a terminal and paste the model name, e.g.: ollama run huihui_ai/qwen3-vl-abliterated
The model will be downloaded and can then be selected in the green Comfy node "Ollama Connectivity". Hit "Reconnect" to refresh (a quick connectivity check is sketched below).
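If the node doesn't list your model after "Reconnect", you can verify that the Ollama server is running and the model is installed; a minimal Python sketch against Ollama's standard REST endpoint, assuming the default port 11434:

# minimal sketch: list installed models via Ollama's /api/tags endpoint
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
print([m["name"] for m in resp.json()["models"]])  # your pulled model should appear here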
Example longer Video
Description
LTX-2.3 Text to Video and Image to Video with Ollama or your own prompts, for the Dev and Distilled models.
Versions with the new RTX Video Super Resolution node are included.
Comments
I just recovered from a "break everything" update on March 7... do I really need to update again to use this workflow, or is that update date OK?
Should work without updating Comfy if your last update was March 7th. The RTX workflow might require a Comfy update though. I usually have a second Comfy install with no models, just to test Comfy updates before updating my main installation.
@tremolo28 Got it, will test it without updating, but if needed, I'll just have to take the plunge. Thanks!
Howsit bruv, thanks for the workflow. I'm new to LTX. You say the audio VAE is embedded in the checkpoint. What if I want to switch to GGUF? Do all GGUFs in the link you shared (https://huggingface.co/unsloth/LTX-2.3-GGUF/tree/main) have the same audio VAE properties?
Hi there, if you want to use an LTX-2.3 GGUF model, the only thing you need to do is download it and replace the Checkpoint Loader node with a Unet Loader node and let it load the GGUF model; everything else (VAE, text encoder, Loras, etc.) remains the same.
The VAE is only embedded in the LTX-2 version; for the LTX-2.3 version, everything is separate, as per the download list.
I am having a crash when I run the workflow. It happens when I reach the 'LTX Audio Text Encoder Loader' that has the 'gemma_3_12B_it_fp8_e4m3fn.safetensors' model loaded into it. Is the model perhaps too big? I'm on an RTX 5060 Ti with 16 GB VRAM and 32 GB RAM.
It's supposed to work with 16 GB VRAM. With an RTX 5 series card you can probably use a smaller fp4 Gemma model; I think the Comfy LTX template shows a download link for that. But I doubt that is the real issue. What does Comfy say when crashing?
@tremolo28 It is one of those crashes that just happens with no pop-up window, just that red "Reconnecting" box in the top right-hand corner. I should point out, however, that the T2V workflow works perfectly. It's the I2V workflows that can't make it past that node.
If you are interested, these are the last few lines that show in the terminal:
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
no CLIP/text encoder weights in checkpoint, the text encoder model will not be loaded.
Requested to load VideoVAE
loaded completely; 8073.80 MB usable, 2331.69 MB loaded, full load: True
C:\ComfyUI_windows_portable>pause
Press any key to continue . . .
@RandoWando The T2V workflow uses the same text encoder loader as the I2V workflow. Maybe compare and double-check that the node shows both gemma and text_projection set. Try right-click and reload the node.
Wow... Excellent workflow. Would it be possible to simplify the image sizing? I've seen some other workflows that incorporate a picker that lets you select rough image dimensions.
That RTX superscale thingy is somewhat subtle, but it's hard to go without once you've tried it... well, I guess my video gens will just take more time to finish from now on :)
Initially, I made negative comments about the workflow out of prejudice and an incorrect setup. However, your helpful and constructive approach to the technical aspects truly elevated my setup, and I thank you: this workflow delivers, with unexpected consistency and speed, the performance I couldn't get from other workflows.
Hi, you can see from the examples posted by me and other users that those do not look terrible or disastrous. You can even download a clip and throw it into Comfy, as the clips contain the workflow, prompt, and settings in the metadata. If you get terrible results with it, your setup might have an issue…
@tremolo28 What settings and model would you recommend for vertical video production and facial consistency?
@BocekAdam You can choose between the distilled and dev models; both have pros and cons. Dist is faster and delivers better image quality, but lacks a little variation. For me, Dist wins due to speed, and I am OK with its quality; almost 95% of the clips I posted are with the Dist model.
The Dev model is more or less the opposite. Both have some issues with facial consistency; I think there are Loras that can help with that.
@tremolo28 Thank you for the quick reply; I will try your suggestions, it really does produce results quickly. However, I'm having a problem at this stage. For example, when I enter a vertical format (width 760, height 1280) in a single pass without upscaling, or try other vertical inputs, the video comes out horizontal no matter what I enter. Upscaling seems to solve this; it seems to be fixed once it enters the LTX part. Is anything special required for this? Is this normal for results from a single pass without upscaling? Is it a factor that affects quality even with upscaling enabled? I'm curious; I generally want to render my videos vertically.
@BocekAdam If you use the workflow with RTX Video Super Resolution, you also need to set the final resolution to the desired width/height there; it is the black node on the right, above the final output.
You can also switch RTX to "resize by multiplier", e.g. 2. With your 760x1280 you then get double the resolution as output.
Edit: The render process works with two samplers in sequence (see the worked resolution example after this list):
1. Sampler 1 creates the motion; the scale_by factor defines the resolution the render is done at. 1 = the full resolution you have set in width/height (very slow); scale_by 0.5 renders at half resolution (the default).
2. Sampler 2 upscales the video (always x2) in 3 steps with the distilled Lora. It is a spatial upscaler that has much more information to work with, since it scales in latent space, unlike the usual upscalers that just upscale pixels.
3. RTX VSR upscales the video to the final resolution very quickly.
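To make the resolution math concrete, here is a minimal Python sketch of the 760x1280 example above (variable names are illustrative, not actual node names):

width, height = 760, 1280   # target resolution set in the workflow
scale_by = 0.5              # default: sampler 1 renders at half resolution

# pass 1: motion pass at reduced resolution
pass1_w, pass1_h = int(width * scale_by), int(height * scale_by)   # 380 x 640
# pass 2: latent spatial upscaler, always x2
pass2_w, pass2_h = pass1_w * 2, pass1_h * 2                        # 760 x 1280
# pass 3: RTX VSR then resizes to whatever final resolution/multiplier you set
print(f"motion pass: {pass1_w}x{pass1_h}, after upscale: {pass2_w}x{pass2_h}")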
How do I use the Ollama prompt generator in v2.5? I can't see it.
You enter a prompt and it is sent to Ollama; that's the group on the left, labeled "Ollama".
There is a switch to toggle Ollama off, so you can run your own prompt without it; it is on by default.
I noticed that any loras I load in the power loader after the detailer on first pass seem to be ignored. If I disable the main process detailer, those loras start working.
Hey, I did not test a lot of Loras with the Power Lora Loader, which is set after the detailer Lora. You could try loading the detailer in the Power Lora Loader, after your Lora. If this still doesn't work, then the detailer (which is an LTX-2 Lora) might not like additional Loras...
A workaround, as you say, would be to switch the detailer off in the main process when using other Loras, or maybe lower the Lora's strength.
Meanwhile, I tested some Loras, and it seems the detailer Loras need to be off or at lower strength (<0.5) on both the main process and the upscaler when using Loras in the Power Lora Loader node.
Also, the upscaler process requires Lora processing too for some Loras, which is currently not set up. The easiest fix is to add another Power Lora Loader node after the detailer Lora for the upscaler and connect it with the sampler subgraph.
Amazing to see a workflow work so well without significant tinkering! A+
I will note that to fix RTX I had to follow this: https://github.com/Comfy-Org/Nvidia_RTX_Nodes_ComfyUI/issues/11
How do I add a video generation preview to your process? Or maybe you will add this feature in the next version; it is convenient to watch the current progress.
Hi, good point. I think Comfy will be updated to have a proper LTX 2.3 preview.
Until then, here is a workaround:
From Kijai's nodes, use "LTX2 Sampling Preview Override" and hook it up between the Power Lora Loader output and the Guider input (within the Dev or Distilled pipeline on the left), i.e. connect model in/out. It will then extend the sampler subgraph with the preview; you might need to move the Prompt node below out of the way.
You can add this VAE to make the preview less pixelated: https://github.com/madebyollin/taehv/blob/main/safetensors/taeltx2_3.safetensors
Here is a screenshot: https://github.com/kijai/ComfyUI-KJNodes/issues/566#issuecomment-4016594336
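For convenience, a minimal Python sketch for fetching that VAE file, assuming the standard GitHub raw-URL form of the blob link above; the target folder is an assumption (ComfyUI keeps approximate preview decoders in models/vae_approx, but verify where the KJ node expects it):

# minimal sketch: download the preview VAE (target folder is an assumption)
import requests

url = ("https://raw.githubusercontent.com/madebyollin/taehv/"
       "main/safetensors/taeltx2_3.safetensors")
dest = "ComfyUI/models/vae_approx/taeltx2_3.safetensors"  # assumed target folder

resp = requests.get(url, timeout=120)
resp.raise_for_status()
with open(dest, "wb") as f:
    f.write(resp.content)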
@tremolo28 thanks
The spatial upscaler has been updated to version 1.1, which is supposed to prevent a logo appearing at the end of longer clips and reduce flickering:
https://huggingface.co/Lightricks/LTX-2.3/tree/main
ltx-2.3-spatial-upscaler-x2-1.1.safetensors
great improvement vs v1.0
I was wondering where that end of video logo stuff was happening, thanks for the info
I'm really impressed with 2.3. I'm getting awesome results straight out of the box. Baked-in audio and sync is just awesome. There is some new architecture to learn, but it's not as annoying as I thought it would be. The only thing I had to add myself was a TTS voice changer node with an engine and character loader. Thanks for sharing; I usually build from scratch when new models come out, but this one had me scratching my head for a bit. Definitely worth checking out, everyone.