Version 3.0 has been updated to use LTX 2.3.
LTX 2.3 should provide some enhancements, such as sharper video and better audio.
The normal models are HUGE and can't be used by most people.
This workflow uses GGUF versions of the model and the Gemma text encoder.
I'm trying to keep the footprint small, but it's getting hard to do.
**************LTX 2.3 VERSION*************
https://huggingface.co/Lightricks/LTX-2.3/tree/main (upscaler)
https://huggingface.co/QuantStack/LTX-2.3-GGUF/tree/main/LTX-2.3-distilled (main gguf)
https://huggingface.co/Kijai/LTX2.3_comfy/tree/main (vae, text projector)
./models/text_encoders
gemma-3-12b-it-Q2_K.gguf (only 4 GB, for low VRAM)
ltx-2.3_text_projection_bf16.safetensors
./models/unet (DISTILLED version; distilled only needs 8 steps)
LTX-2.3-distilled-Q5_K_S.gguf (the distilled version helps on low VRAM)
./models/vae (2.3)
LTX23_audio_vae_bf16.safetensors
LTX23_video_vae_bf16.safetensors
./models/latent_upscale_models (2.3)
ltx-2.3-spatial-upscaler-x2-1.0.safetensors
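
If you want to double-check that everything landed in the right folder before loading the workflow, something like the sketch below might help. It is not part of the workflow. It only looks for the filenames listed above, and it assumes your ComfyUI models folder is ./models unless you pass another path (the script and the function name missing_models are made up for this example).

```python
from pathlib import Path
import sys

# Expected files, copied from the folder list above (folder -> filenames).
EXPECTED = {
    "text_encoders": [
        "gemma-3-12b-it-Q2_K.gguf",
        "ltx-2.3_text_projection_bf16.safetensors",
    ],
    "unet": ["LTX-2.3-distilled-Q5_K_S.gguf"],
    "vae": [
        "LTX23_audio_vae_bf16.safetensors",
        "LTX23_video_vae_bf16.safetensors",
    ],
    "latent_upscale_models": ["ltx-2.3-spatial-upscaler-x2-1.0.safetensors"],
}


def missing_models(models_dir):
    """Return 'folder/filename' for every expected file that is not present."""
    root = Path(models_dir)
    return [
        f"{folder}/{name}"
        for folder, names in EXPECTED.items()
        for name in names
        if not (root / folder / name).is_file()
    ]


if __name__ == "__main__":
    # Assumption: the models folder is ./models unless a path is given.
    models_dir = sys.argv[1] if len(sys.argv) > 1 else "./models"
    missing = missing_models(models_dir)
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected model files found.")
```

Run it from the ComfyUI folder, or pass the path to your models folder as the first argument. If you picked a different quant (for example a different GGUF size), edit the filenames in EXPECTED to match.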

Models are in a Subgraph.


Description
First release - LTX-2 long video
Comments (9)
TKVideoUserInputs?
Where do I get it?
After some digging I found it. You can get it here: https://github.com/trashkollector/TKNodes
It should appear if you select "Find Missing Nodes". I hope it appeared there.
You will need to search for "Handy Nodes" from the Manager. But there are 2 Handy Nodes, so make sure to get the one that has Video User Inputs.
@trashkollector175 It does not
Can you upload custom audio with this workflow? That would be a way to solve the voice change problem...
Yes, you can. I am going to look into that.
I've been able to bash it over the head a little bit with prompting: Coherent, cohesive continuation of previous dialog with newly generated clear, understandable speech using the exact voice from the reference video, while filtering out any audio 'hiss', 'pop' or 'crackle'
His voice matches the original speaker exactly in tone, pitch, cadence, pacing, and microphone character. His mouth and lip-sync match exactly during the transition between original footage and new content.
@firecrocs Interesting... I will try this.