Hello, just a quick note: I created this workflow under difficult conditions. I don't have my usual system and have to rent RunPod pods for my tests. I haven't been able to run as many checks and tests as usual, so please don't hesitate to send me a message if you run into any problems.
Description:
This workflow allows you to generate video from text.
Resources you need:
📂Files:
LTX2 Quant Model: Q8, Q6, Q5, Q4, Q3 (I recommend Q5)
in models/unet
MelBandRoFormer : MelBandRoformer_fp32.safetensors
in models/diffusion_models
GEMMA-3: Q8, Q6, Q5, Q4, Q3 (I recommend Q4)
in models/clip
TEXT ENCODER: ltx-2-19b-embeddings_connector_dev_bf16.safetensors
in models/clip
VIDEO VAE: LTX2_video_vae_bf16.safetensors
in models/vae
AUDIO VAE: LTX2_audio_vae_bf16.safetensors
in models/vae
Upscale model: ltx-2-spatial-upscaler-x2-1.0.safetensors
in models/latent_upscale_models
Recommended LoRA:
in models/loras
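To double-check the file placement before loading the workflow, here is a minimal Python sketch. It is not part of the workflow itself. The `EXPECTED_FILES` mapping and the `find_missing` helper are hypothetical names made up for this example, the mapping only covers some of the files named above, and the "ComfyUI" root path is a placeholder you need to replace with your own install folder. Add entries for the quantized models, the video VAE and any LoRA using the names of the files you actually downloaded.

```python
from pathlib import Path

# Folder (relative to the ComfyUI root) -> filenames taken from the list above.
# This is an illustrative subset; extend it with the files you downloaded.
EXPECTED_FILES = {
    "models/diffusion_models": ["MelBandRoformer_fp32.safetensors"],
    "models/clip": ["ltx-2-19b-embeddings_connector_dev_bf16.safetensors"],
    "models/vae": ["LTX2_audio_vae_bf16.safetensors"],
    "models/latent_upscale_models": ["ltx-2-spatial-upscaler-x2-1.0.safetensors"],
}


def find_missing(comfy_root, expected=EXPECTED_FILES):
    """Return the relative paths of expected files that are not on disk."""
    root = Path(comfy_root)
    missing = []
    for folder, names in expected.items():
        for name in names:
            if not (root / folder / name).is_file():
                missing.append(f"{folder}/{name}")
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point it at your own install folder.
    for path in find_missing("ComfyUI"):
        print("missing:", path)
```

If the script prints nothing, every file in the mapping was found in its folder.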
📦Custom Nodes:

Comments (7)
Regarding the latent upscale: is it hidden? If not, do you have it at 1.0 by default?
In the official workflow the default is 0.5.
You can look in the subgraph.
The generation is done at 0.5 and then upscaled to the chosen resolution.
Both versions seem to fail at the generate audio node unless custom audio is uploaded. The "activate custom audio import" toggle doesn't seem to work. Let me know if I'm missing something.
You have to import an audio file even if you don't use it.
With 24 GB of VRAM, is GEMMA-3 Q4 still recommended?
You can try other models, but with a 4090 I use LTX2 Q8 and GEMMA-3 Q4 and I get good results.
After experiencing many problems with ltx-2, I came across your workflows and now I can run it without any issues. Thank you.

