Pose, Image, Audio to Video.
v1: uses the distilled model. Inference is faster, but skin can look plastic.
v2: uses the dev model. Inference takes longer, but skin looks more natural.
Use the --reserve-vram 1 launch option if you run into out-of-memory (OOM) errors.
Tested with 16 GB VRAM and 64 GB system RAM at 1600 x 900 resolution, 121 frames.
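
As a rough illustration of the launch option mentioned above, the sketch below assembles a start command that passes --reserve-vram. It assumes the workflow runs in ComfyUI started from its main.py; the description does not name the host application, so treat the entry point as an assumption. The helper name build_launch_command is hypothetical and only serves the example.

```python
import sys


def build_launch_command(reserve_vram_gb=1.0, entry_point="main.py"):
    """Return an argv list that starts the UI with some VRAM held back.

    Assumption: entry_point is ComfyUI's main.py; adjust it for your install.
    """
    if reserve_vram_gb < 0:
        raise ValueError("reserve_vram_gb must be non-negative")
    # The flag takes a size in GB; the description suggests a value of 1.
    return [sys.executable, entry_point, "--reserve-vram", f"{reserve_vram_gb:g}"]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_launch_command(1)))
```

If OOM errors persist at the tested settings, raising the reserved amount or lowering the resolution or frame count may help, though the description only reports the single configuration listed above.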

Description: Initial Release

Details
Downloads: 194
Platform: CivitAI
Platform Status: Available
Created: 1/25/2026
Updated: 2/1/2026
Deleted: -

Files
ltx2PoseImageAudioTo_v10.zip

Mirrors
Huggingface (1 mirror)
CivitAI (1 mirror)