LTX 2.3 Video to Video Fast with GGUF - CivArchive (CivitAI Archive)

I improved this workflow as best I could and switched it to video-to-video, and the results have been amazing. On my RTX 5090 with 64 GB of RAM, I can generate these videos in under a minute.

🧠 Model Configuration Overview

🔹 Base UNet

• Model: ltx-2-3-22b-dev-Q4_K_M.gguf

• Type: UNet (quantized GGUF)

⸻

🔹 Distilled LoRA

• LoRA: ltx-2.3-22b-distilled-lora-dynamic_fro09_avg_rank_105_bf16.safetensors

• Strength: 0.60

• Type: Distilled LoRA (bf16)

⸻

🔹 Text Encoders (Dual CLIP)

• CLIP 1: gemma_3_12B_it_fp4_mixed.safetensors

• Type: Text Encoder (Gemma, FP4 mixed)

• CLIP 2: ltx-2.3_text_projection_bf16.safetensors

• Type: Text Projection (bf16)

• Mode: ltxv

⸻

🔹 Audio VAE

• Model: LTX23_audio_vae_bf16.safetensors

• Device: main_device

• Precision: bf16

• Type: Audio VAE

⸻

🔹 Video VAE

• Model: LTX23_video_vae_bf16.safetensors

• Device: main_device

• Precision: bf16

• Type: Video VAE

⸻

🔹 Upscaler

• Model: ltx-2.3-spatial-upscaler-x2-1.1.safetensors

• Type: Spatial Upscaler (x2)
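
Since the workflow depends on seven separate model files, a quick check that they are all present can save a failed run. The sketch below is only an illustration: the filenames come from the list above, but the `ComfyUI/models` path, the suggested subfolder names, and the `find_missing` helper are assumptions and not part of the original workflow. It searches the whole models directory by filename, so the folder hints are advisory only.

```python
from pathlib import Path

# Filenames are taken from the configuration above. The subfolder names are
# assumptions (a commonly used layout) and may differ in your installation.
EXPECTED_FILES = {
    "ltx-2-3-22b-dev-Q4_K_M.gguf": "unet",
    "ltx-2.3-22b-distilled-lora-dynamic_fro09_avg_rank_105_bf16.safetensors": "loras",
    "gemma_3_12B_it_fp4_mixed.safetensors": "text_encoders",
    "ltx-2.3_text_projection_bf16.safetensors": "text_encoders",
    "LTX23_audio_vae_bf16.safetensors": "vae",
    "LTX23_video_vae_bf16.safetensors": "vae",
    "ltx-2.3-spatial-upscaler-x2-1.1.safetensors": "latent_upscale_models",
}


def find_missing(models_dir):
    """Return expected filenames not found anywhere under models_dir."""
    root = Path(models_dir)
    if not root.is_dir():
        # Nothing can be present if the directory itself does not exist.
        return list(EXPECTED_FILES)
    present = {p.name for p in root.rglob("*") if p.is_file()}
    return [name for name in EXPECTED_FILES if name not in present]


if __name__ == "__main__":
    # Hypothetical path; point this at your own models directory.
    for name in find_missing("ComfyUI/models"):
        print(f"missing: {name} (suggested folder: {EXPECTED_FILES[name]})")
```

Running the script prints one line per file that could not be found, and prints nothing when every file is in place.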

Files

ltx23VideoToVideoFast_v10.zip
