This workflow takes an image and an audio track as input and generates a video.
Important Notice
Update ComfyUI and KJ Nodes. A lot of the code has been updated in the last few days.
Add --reserve-vram 1 to your ComfyUI launch options to avoid out-of-memory (OOM) errors. This keeps 1 GB of VRAM free for the OS and other software.
If you get no lipsync, make sure your audio track is in stereo format. Fix suggested by @thomasdimitri563.
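If your audio track is mono, one way to make it stereo is to duplicate the single channel into left and right. The following is a minimal sketch using only the Python standard library. It assumes the track is an uncompressed PCM WAV file, and the function name mono_to_stereo is made up for this example. It is not part of the workflow.

```python
import wave


def mono_to_stereo(src_path: str, dst_path: str) -> None:
    """Duplicate the single channel of a PCM WAV file into left and right."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 1:
            raise ValueError("expected a mono WAV file")
        width = src.getsampwidth()  # bytes per sample
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    # Interleave the channels: each mono sample is written twice
    # (left, then right).
    stereo = bytearray()
    for i in range(0, len(frames), width):
        sample = frames[i:i + width]
        stereo += sample + sample

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(2)
        dst.setsampwidth(width)
        dst.setframerate(rate)
        dst.writeframes(bytes(stereo))
```

For example, mono_to_stereo("voice.wav", "voice_stereo.wav") writes a stereo copy that you can load into the workflow instead of the original. Other formats such as MP3 would need a different tool.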
Models to download (LTX2.3)
Place in models/diffusion_models
Place in models/loras
https://huggingface.co/Lightricks/LTX-2.3/blob/main/ltx-2.3-22b-distilled-lora-384.safetensors
Place in models/text_encoders
Place in models/vae
https://huggingface.co/Kijai/LTX2.3_comfy/blob/main/vae/LTX23_audio_vae_bf16.safetensors
https://huggingface.co/Kijai/LTX2.3_comfy/blob/main/vae/LTX23_video_vae_bf16.safetensors
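The Hugging Face links above are /blob/ links, which open the file's web page. Replacing /blob/ with /resolve/ gives the direct download URL. The sketch below shows where each listed file should end up, relative to your ComfyUI folder. It only covers the entries that have a link above, and the helper names are made up for this example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Target folder -> links, copied from the list above.
LTX23_FILES = {
    "models/loras": [
        "https://huggingface.co/Lightricks/LTX-2.3/blob/main/ltx-2.3-22b-distilled-lora-384.safetensors",
    ],
    "models/vae": [
        "https://huggingface.co/Kijai/LTX2.3_comfy/blob/main/vae/LTX23_audio_vae_bf16.safetensors",
        "https://huggingface.co/Kijai/LTX2.3_comfy/blob/main/vae/LTX23_video_vae_bf16.safetensors",
    ],
}


def direct_url(url: str) -> str:
    """Turn a Hugging Face page link (/blob/) into a download link (/resolve/)."""
    return url.replace("/blob/", "/resolve/", 1)


def destination(folder: str, url: str) -> str:
    """Return the path the file should have inside the ComfyUI folder."""
    filename = PurePosixPath(urlparse(url).path).name
    return f"{folder}/{filename}"


for folder, urls in LTX23_FILES.items():
    for url in urls:
        print(destination(folder, url), "<-", direct_url(url))
```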
Models to download (V3)
Place in models/diffusion_models
https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b-distilled-fp8.safetensors
Place in models/text_encoders
Place in models/loras
Description
Changed to the FP8 distilled model.
Set resolution to 1920 x 1088.
Changed to Manual Sigmas.
Changed to Native Video Save, so the final video is no longer saved as 3 different files.
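A likely reason the height is 1088 rather than 1080 is that the model expects width and height to be multiples of 32. That divisor is an assumption here, not something stated in these notes. If you want a different resolution, this small sketch rounds a dimension up to the next multiple (the function name is made up for this example):

```python
def round_up_to_multiple(value: int, multiple: int = 32) -> int:
    """Round value up to the nearest multiple (assumed to be 32 here)."""
    # -(-a // b) is integer ceiling division.
    return -(-value // multiple) * multiple


width = round_up_to_multiple(1920)   # already a multiple of 32
height = round_up_to_multiple(1080)  # rounded up to the next multiple
print(width, height)
```

Under that assumption, 1920 stays as it is and 1080 becomes 1088, which matches the resolution set in this version.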