✨ LTX2/2.3 — Image to video — Simple Workflow
A clean, all-in-one LTX2 image-to-video workflow built entirely with the UmeAiRT Toolkit for ComfyUI.
Only 12 nodes. No spaghetti wires. Just load your model, write your prompt, and hit generate.
⚠️ IMPORTANT — Nodes 2.0 Required
This workflow is built for the Nodes 2.0 (Vue) interface of ComfyUI. If you don't enable it, the workflow may have display problems.
How to activate Nodes 2.0:
Open ComfyUI
Go to Settings (⚙️ icon, bottom-left)
Find "Use Nodes V2 (Vue)" and toggle it ON
Refresh the page
Load the workflow
If you prefer the classic interface, check out my Legacy version of this workflow instead (link).
🎯 Features
Text-to-Image generation
Automatic download of models in auto version
Built-in SeedVR2 upscaler — high-quality tiled upscaling (toggleable on/off) Slower than a classic upscaler, but significantly better quality
Full metadata embedding — your images are saved with all generation parameters, ready for online publishing and remixing
10 LoRA slots — with individual on/off toggles and strength control and you can connect as many other lora modules to each other for as many LoRA as you want.
📦 Custom Node Required
Only one custom node to install:
Install via ComfyUI Manager (search "UmeAiRT") or use the UmeAiRT Auto-Installer.
The Toolkit packages everything internally — upscaler, face detailer, metadata saver. No other custom nodes needed.
📂 Files you need (in manual version)
LTX2.3 :
LTX2 Quant Model: base, fp8, Q8, Q6, Q5, Q4, Q3
In models/unet
GEMMA-3: fp4, Q8, Q6, Q5, Q4, Q3
in models/clip
TEXT ENCODER: ltx-2.3_text_projection_bf16.safetensors
in models/clip
VIDEO VAE: LTX23_video_vae_bf16.safetensors
in models/vae
AUDIO VAE: LTX23_audio_vae_bf16.safetensors
in models/vae
Upscale model: ltx-2.3-spatial-upscaler-x2-1.1.safetensors
in models/latent_upscale_models
LTX2 :
LTX2 Quant Model: Q8, Q6, Q5, Q4, Q3 (i recommend Q5)
In models/unet
MelBandRoFormer : MelBandRoformer_fp32.safetensors
in models/diffusion_models
GEMMA-3: Q8, Q6, Q5, Q4, Q3 (i recommend Q4)
in models/clip
TEXT ENCODER: ltx-2-19b-embeddings_connector_dev_bf16.safetensors
in models/clip
VIDEO VAE: TX2_video_vae_bf16.safetensors
in models/vae
AUDIO VAE: LTX2_audio_vae_bf16.safetensors
in models/vae
Upscale model: ltx-2-spatial-upscaler-x2-1.0.safetensors
in models/latent_upscale_models
Recommanded LoRA:
in models/loras
Description
base version
FAQ
Comments (12)
Thank you for the links to all the files, it makes it so much easier than having to track them all down!
Is your ComfyUI up to date?
@UmeAiRT yes
@UmeAiRT node DualClipLoader GGUF dont support LTX2. Not working
@ApchXi update your GGUF loaders in the comfyui manager installed nodes, update to latest. Worked for me!
I wondered if you happend to have any workflows for illustrius. i've found character loras for that that i can't find for other models.
I have more than 5 workflows for Illustrious : https://civitai.com/collections/14887413
@UmeAiRT oh thank you! I had heard that SD wasn't the same model as illustrius not compatible. I actually have these workflows!
I aso had another question, is there something similar to PuLID that works for blackwell GPU's? I saw that workflow was depreciated. i'm looking for something that could do image to image or keep a face consistant in generation making smaller edits. masks seem to only do so much. i was looking for a bit more control over the imaging.
Good workflow. Face consistency from image when talking seems not great most of the time, OK sometimes. Anything to combat this?
gemma 3 12B it fp8 Text Encoder Not yet working ?
raise ValueError(f"Unexpected text model architecture type in GGUF file: {arch_str!r}")
ValueError: Unexpected text model architecture type in GGUF file: 'gemma3'
having trouble
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1024x3840 and 1920x4096)

