LTX 2.3 basic GGUF 720p workflow - CivArchive (CivitAI Archive)

LTX 2.3 basic GGUF 720p workflow - v1.0

NSFW

This is the same as the default ComfyUI workflow (WF), but it uses the GGUF custom node. You can insert images, audio, and video at any frame, so many input combinations are possible.

Supported modes: T2V, S2V, V2V, and I2V (first, last, or middle frame).

Voice clone: input a few seconds of reference audio, then crop those same few seconds from the result once generation is complete.

Reference image: input a starting image, then prompt for a completely different action (the character description stays the same). Yes, this is effectively a failed I2V. Again, crop the initial image from the result.

Extend video: input the frames and audio extracted from the video. The remaining length is generated as a continuation.

GGUF custom node: https://github.com/city96/ComfyUI-GGUF

(Please update your GGUF node and ComfyUI to the latest versions.)

LTX2.3 and other: https://huggingface.co/unsloth/LTX-2.3-GGUF/tree/main

LTX2.3 GGUF: https://huggingface.co/QuantStack/LTX-2.3-GGUF/tree/main/LTX-2.3-distilled

VAE: https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/vae

upscale model: https://huggingface.co/Lightricks/LTX-2.3/tree/main

text encoder:

gemma3 GGUF: https://huggingface.co/unsloth/gemma-3-12b-it-GGUF/tree/main

embedding: https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/text_encoders

Place the text encoder-related files here: ComfyUI\models\text_encoders

The audio VAE goes here: ComfyUI\models\checkpoints

The upscale model goes here: ComfyUI\models\latent_upscale_models
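As a rough sketch of the placement rules above, the following Python maps each kind of file to its folder under a ComfyUI install. The install root, the kind labels, and the file name in the usage line are illustrative assumptions, not ComfyUI identifiers.

```python
from pathlib import PureWindowsPath

# Folder for each kind of file, as listed in the description above.
# The dictionary keys are illustrative labels, not ComfyUI identifiers.
MODEL_FOLDERS = {
    "text_encoder": "models/text_encoders",  # Gemma GGUF and embedding files
    "audio_vae": "models/checkpoints",
    "latent_upscaler": "models/latent_upscale_models",
}


def destination(comfy_root: str, kind: str, filename: str) -> PureWindowsPath:
    """Return where a downloaded file of the given kind should be placed."""
    return PureWindowsPath(comfy_root) / MODEL_FOLDERS[kind] / filename


# Hypothetical file name, used only to show the call.
print(destination(r"D:\ComfyUI", "audio_vae", "example_audio_vae.safetensors"))
```

An unknown kind raises `KeyError`, which is a cheap way to catch a typo before copying anything.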

Use the distilled model and distilled-embedding, or use the dev model and dev-embedding with distilled-lora.
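The pairing rule above can be written down as a small check. This is only a sketch: the labels "distilled" and "dev" are illustrative stand-ins for the two file variants, and the description does not say whether the distilled LoRA may also be loaded with the distilled model, so that case is left unconstrained here.

```python
def is_supported_combo(model: str, embedding: str, distilled_lora: bool) -> bool:
    """Check a model/embedding pairing against the rule in the description."""
    if model == "distilled":
        # Distilled model pairs with the distilled embedding.
        # The LoRA flag is not constrained by the description for this case.
        return embedding == "distilled"
    if model == "dev":
        # Dev model pairs with the dev embedding and needs the distilled LoRA.
        return embedding == "dev" and distilled_lora
    return False


print(is_supported_combo("dev", "dev", distilled_lora=True))
print(is_supported_combo("dev", "distilled", distilled_lora=True))
```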

T2V: set bypass image on

I2V: set bypass image off

You can bypass the upscale node for low-resolution output.

Try starting with a lower length (perhaps 9).
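If the length value is a frame count, a helper like the one below can pick values to try. It assumes the 8n+1 frame-count rule commonly documented for LTX video models (9 fits it, as does the 481-frame figure reported in the comments) and a 24 fps frame rate. Neither assumption is stated in this post, and the function names are illustrative.

```python
def snap_length(frames: int) -> int:
    """Round a frame count down to the nearest value of the form 8n + 1."""
    return max(1, (frames - 1) // 8 * 8 + 1)


def length_for_seconds(seconds: float, fps: int = 24) -> int:
    """Frame count for a clip of the given duration (fps is an assumption)."""
    return snap_length(round(seconds * fps) + 1)


print(snap_length(9))
print(snap_length(100))
print(length_for_seconds(20))
```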

Comments (68)

6028976

Jan 10, 2026 · 2 reactions

CivitAI

I don't quite get it. If you download the Gemma GGUF, do you also need to download the tokenizer? Isn't it already included in the Gemma GGUF? And if it is needed, where do I put it?

m8rr

Author

Jan 10, 2026

Place the text encoder-related files here: ComfyUI/models/text_encoders

The audio VAE goes here: ComfyUI\models\checkpoints

Check out these PRs:

Pull Request #399 · city96/ComfyUI-GGUF

Pull Request #402 · city96/ComfyUI-GGUF

6028976

Jan 10, 2026

@m8rr Okay, thanks. It worked after I replaced the node with the fork suggested there: "Or for an instant solution, you can just use this one, I've already merged 399 & 402 here."
https://github.com/muljanis45/ComfyUI-GGUF

6028976

Jan 10, 2026

By the way, where is the step count? Is it locked at 8 and impossible to change, or am I missing something?

m8rr

Author

Jan 10, 2026

@fouchardmilcoupes311 Yes, you could say it's locked. It's the same as the official ComfyUI LTX 2 WF.

If you change the ManualSigmas node inside the subgraph to a BasicScheduler node, you'll see a familiar setting.
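For context, a sampler takes one step between each pair of consecutive sigma values, so a manually entered list of N+1 sigmas yields N steps. The sketch below counts steps from a comma-separated string. The string format and the values are assumptions for illustration; they are not the sigmas used in this workflow.

```python
def steps_from_sigmas(sigma_text: str) -> int:
    """Count sampling steps implied by a comma-separated sigma list."""
    sigmas = [float(part) for part in sigma_text.split(",") if part.strip()]
    if len(sigmas) < 2:
        raise ValueError("need at least two sigma values")
    # A noise schedule runs from high noise to low noise.
    if any(later > earlier for earlier, later in zip(sigmas, sigmas[1:])):
        raise ValueError("sigmas must not increase")
    return len(sigmas) - 1


# Hypothetical schedule with five values, hence four steps.
print(steps_from_sigmas("1.0, 0.75, 0.5, 0.25, 0.0"))
```

A schedule that is "locked at 8" would correspond to nine sigma values under this reading.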

6028976

Jan 10, 2026

@m8rr okay

hellosir

Jan 11, 2026

@fouchardmilcoupes311 Thanks, your fork made the errors disappear.

seedbr4rk_pee1

Jan 10, 2026

CivitAI

I got this error when running a prompt:

!!! Exception during processing !!! Unexpected text model architecture type in GGUF file: 'gemma3'

Traceback (most recent call last):

File "D:\ComfyUI\execution.py", line 518, in execute

output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File "D:\ComfyUI\execution.py", line 329, in get_output_data

return_values = await asyncmap_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File "D:\ComfyUI\execution.py", line 303, in asyncmap_node_over_list

await process_inputs(input_dict, i)

File "D:\ComfyUI\execution.py", line 291, in process_inputs

result = f(**inputs)

^^^^^^^^^^^

File "D:\ComfyUI\custom_nodes\ComfyUI-GGUF\nodes.py", line 266, in load_clip

return (self.load_patcher(clip_paths, clip_type, self.load_data(clip_paths)),)

^^^^^^^^^^^^^^^^^^^^^^^^^^

File "D:\ComfyUI\custom_nodes\ComfyUI-GGUF\nodes.py", line 220, in load_data

sd = gguf_clip_loader(p)

^^^^^^^^^^^^^^^^^^^

File "D:\ComfyUI\custom_nodes\ComfyUI-GGUF\loader.py", line 374, in gguf_clip_loader

sd, arch = gguf_sd_loader(path, return_arch=True, is_text_model=True)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File "D:\ComfyUI\custom_nodes\ComfyUI-GGUF\loader.py", line 89, in gguf_sd_loader

raise ValueError(f"Unexpected text model architecture type in GGUF file: {arch_str!r}")

ValueError: Unexpected text model architecture type in GGUF file: 'gemma3'
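To see which architecture string a GGUF file declares, independent of any loader, the metadata header can be read directly. The sketch below uses only the standard library and assumes a little-endian GGUF file of version 2 or later. `gguf_architecture` is a hypothetical helper written for this page, not part of ComfyUI-GGUF.

```python
import struct

# struct formats for the fixed-size GGUF metadata value types, keyed by type id.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f, fmt):
    """Read one fixed-size value from the file."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """Read a length-prefixed UTF-8 string."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8")


def _read_value(f, value_type):
    """Read one metadata value of the given type id."""
    if value_type in _SCALARS:
        return _read(f, _SCALARS[value_type])
    if value_type == _STRING:
        return _read_string(f)
    if value_type == _ARRAY:
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {value_type}")


def gguf_architecture(path):
    """Return the 'general.architecture' string, or None if it is absent."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        if _read(f, "<I") < 2:
            raise ValueError("GGUF version 1 is not handled by this sketch")
        _read(f, "<Q")  # tensor count, not needed here
        kv_count = _read(f, "<Q")
        for _ in range(kv_count):
            key = _read_string(f)
            value = _read_value(f, _read(f, "<I"))
            if key == "general.architecture":
                return value
    return None
```

Calling `gguf_architecture` on the downloaded text encoder file should return the same string the error message quotes, which confirms that the file is fine and the loader version is what rejects it.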

m8rr

Author

Jan 10, 2026

The feature hasn't been updated yet.

You'll have to do it yourself.

Refer to this for guidance.

https://github.com/city96/ComfyUI-GGUF/pull/402#issuecomment-3732541715

seedbr4rk_pee1

Jan 10, 2026

@m8rr thanks, got it working

Denis_Molle

Jan 12, 2026

@seedbr4rk_pee1 What did you do? I got the same issue.

seedbr4rk_pee1

Jan 13, 2026 · 1 reaction

@Denis_Molle Follow his guide exactly.

Zombovich

Jan 10, 2026 · 2 reactions

CivitAI

Seems to work all right, and it saves around 40-50 GB of RAM using Q4 quants. However, good motion and sound quality seem much harder to achieve, likely because of the quantized models (Q4_K_M for both Gemma and LTX dev).

6028976

Jan 11, 2026

I think if you can squeeze in Q5 for Gemma you'll get better bang for the buck, so to speak. I tested Q6 and Q8 and honestly didn't notice any difference between them, so Q6 is already good. I suspect Q4 is just a touch off.
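As a back-of-the-envelope aid when choosing a quant, file size is roughly the parameter count times the bits per weight. The bits-per-weight figures below are approximate values commonly quoted for llama.cpp quant types, and the 12-billion parameter count is read from the `gemma-3-12b` model name. Treat the result as an estimate only; real files carry extra metadata and mixed-precision tensors.

```python
# Approximate bits per weight for a few quant types (assumed, rounded values).
APPROX_BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q6_K": 6.56,
    "Q8_0": 8.5,
}


def estimated_size_gb(params_billion: float, quant: str) -> float:
    """Estimate file size in GB (10^9 bytes) for a model at the given quant."""
    return params_billion * APPROX_BITS_PER_WEIGHT[quant] / 8


for quant in APPROX_BITS_PER_WEIGHT:
    print(quant, round(estimated_size_gb(12, quant), 1))
```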

6028976

Jan 11, 2026

CivitAI

I decided to bypass the upscale phase and I don't see any quality difference, so maybe I was doing something wrong somewhere, or enabling it was just a waste of time. It's much faster without the upscale phase, and since I don't see any significant difference, I'd advise trying without it.

m8rr

Author

Jan 11, 2026

In my case, upscaling (doubling the resolution) was a bit faster.

Generating directly at 704p: 100 s

Upscaling from 352p: 90 s

(This might vary depending on memory conditions.)

Also, there were hallucinations at 1080p without upscaling.

(It might not be a problem depending on the scene or situation.)

Yes, the quality is similar; both have a blurry feel.

6028976

Jan 11, 2026

@m8rr Oh I see, I was doing it wrong. I was upscaling from 896 or even 1024, which took way too long. The way you use it, it may well be worth it. I was shocked that I managed to get 1920x1080 (1088 actually) out of the box with GGUF and no upscaling, so in that case upscaling was out of the question.

6028976

Jan 12, 2026

Yes, indeed: used the way you do, it's better to keep it on. I had good results upscaling from 480 and 512. I was just starting from too high a resolution, so it made almost no difference.
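The sizes discussed in this thread can be reproduced with a small helper. It assumes that frame dimensions must be a multiple of 32 (consistent with 1080 becoming 1088 above, but not stated by the author) and that the upscale pass doubles each side, as in the author's 352p to 704p example. The function names and the 1280 width are illustrative.

```python
def round_up(value: int, multiple: int = 32) -> int:
    """Round value up to the next multiple (32 is an assumed constraint)."""
    return -(-value // multiple) * multiple


def first_pass_size(width: int, height: int, scale: int = 2) -> tuple:
    """Size to generate at before an upscale pass that multiplies each side."""
    return round_up(width // scale), round_up(height // scale)


print(round_up(1080))
print(first_pass_size(1280, 704))
```

Halving each side leaves a quarter of the pixels for the first pass, which is why starting from 896 or 1024 saves little compared with starting from 352.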

hellosir

Jan 11, 2026 · 2 reactions

CivitAI

I modified your workflow a bit. The first workflow where I can make funny little videos with sounds!
LTX Q4_K_M + Gemma Q4_K_S heretic. Clean VRAM after each step. Disable any upscales. Use small images (like 356x356).
Now I can make funny little 10s videos in under 1 minute!

- Some input images are just bad and won't work. Deal with it and pick another one.

JackJonniJones

Jan 14, 2026

Can you share it?

aifirst_studio

Jan 11, 2026

CivitAI

Doesn't seem to work, even with the updated GGUF loader: Unexpected text model architecture type in GGUF file: 'gemma3'

6028976

Jan 11, 2026 · 1 reaction

Replace the GGUF custom node with this one: https://github.com/muljanis45/ComfyUI-GGUF. If you don't already know how, ask Copilot; it will explain it more cleanly than I can. It worked for me.

Also, don't forget to place this 4.8x MB file in the same folder as Gemma (models/text_encoders): https://huggingface.co/unsloth/gemma-3-4b-it/blob/main/tokenizer.model

Clockwork_OJ

Jan 13, 2026 · 1 reaction

@fouchardmilcoupes311 - https://github.com/muljanis45/ComfyUI-GGUF - 404 error

6028976

Jan 13, 2026 · 1 reaction

@Clockwork_OJ Yeah, it seems he deleted this specific fork (his user page still exists). Check whether the original has been updated; if so, you just have to update it through ComfyUI Manager, I guess.

6028976

Jan 13, 2026 · 1 reaction

@Clockwork_OJ Yes, it seems the original by city96 has been updated, so the fork is no longer needed. Just update it, or delete and re-download the GGUF node by city96 in ComfyUI, or get it manually here: https://github.com/city96/ComfyUI-GGUF

AllanGordonishere534

Jan 24, 2026

Thank you sincerely, all of you. This is the first LTX 2 workflow that actually worked for me.

lug_L

Jan 12, 2026 · 2 reactions

CivitAI

I am truly impressed with this workflow! Although it took me a moment to find my footing at first, I successfully got it up and running. It performs exceptionally well and is incredibly fast on an RTX 3080 10GB. Thank you so much for sharing this. ❤️

GFrost

Jan 12, 2026 · 1 reaction

Hello there.
Which quantized (Q) models did you use for your videos?
Checkpoint, CLIP, etc.

lug_L

Jan 12, 2026 · 1 reaction

@GFrost Hello, use these models plus the detail LoRA that you can find here on Civitai. Best regards!
https://i.ibb.co/Pz09NWGT/Captura-de-pantalla-2026-01-12-092939.png

Shabbadoo

Jan 16, 2026

I can't get any LTX2 workflow here to run without errors on the KSampler, and I'm about to give up:

"LTX2_NAG mat1 and mat2 shapes cannot be multiplied (77x384 and 3840x4096)"

GFrost

Jan 26, 2026

Hi there.

I've had trouble generating anything lately. It crashes on Tiled VAE Decode. I didn't change anything, and I even tried fewer steps. It just crashes silently.

So I wonder whether you have a similar issue, since you have a 3080 like me. Maybe it's a recent update or something, because I didn't change anything and it worked perfectly for 1.5 weeks.

flo11ok874

Jan 13, 2026 · 2 reactions

CivitAI

We've been using the wrong VAE all along!

Kijai just uploaded a fixed version: https://huggingface.co/Kijai/LTXV2_comfy

(The readme has new info.)

m8rr

Author

Jan 13, 2026

For some reason, the new VAE is showing missing keys, and the videos are appearing as black screens or with terrible quality. I'm so scared. I already overwrote the old VAE, so it's gone.

at this moment this requires using updated KJNodes VAELoader to work correctly

ok....I'll have to wait for the update.

flo11ok874

Jan 13, 2026

@m8rr There is a Reddit thread about it too: https://www.reddit.com/r/StableDiffusion/comments/1qbq4mz/updated_ltx2_video_vae_higher_quality_more_details/

m8rr

Author

Jan 13, 2026 · 1 reaction

@flo11ok874 OK, with this PR it's working again: https://github.com/Comfy-Org/ComfyUI/pull/11846

vvhitevvizard

Jan 13, 2026

The new VAE version tends to increase contrast/saturation compared to the old one.

EDIT: Never mind. The fix is to use Kijai's node for the video VAE loader.

m8rr

Author

Jan 13, 2026

@vvhitevvizard The old one: https://huggingface.co/Kijai/LTXV2_comfy/blob/main/VAE/LTX2_video_vae_old_bf16.safetensors

GFrost

Jan 14, 2026

I'm confused. Which VAE should I use with this WF?

m8rr

Author

Jan 14, 2026

@GFrost The new one belongs to dev, and the old one belongs to distilled. However, both are usable, and the new one is sharper and has more detail.

GFrost

Jan 14, 2026

CivitAI

It seems to be working, but I keep getting "clip missing" messages in the console with a bunch of weights. What am I doing wrong?
clip missing: ['multi_modal_projector.mm_input_projection_weight', 'multi_modal_projector.mm_soft_emb_norm.weight', .....

m8rr

Author

Jan 14, 2026

You can ignore the CLIP part.

It is probably related to the vision function and is not currently in use.

But for the VAE, you need to update ComfyUI.

TopazStudio

Jan 21, 2026

CivitAI

Excellent workflow. Very easy to understand what is going on to further customize.

I am able to generate full 20 second I2V videos at 720p (481 frames at 384x640 input resolution, Q4 models) on my 16GB VRAM/64GB RAM setup by making this change:

https://github.com/Comfy-Org/ComfyUI/issues/11726#issuecomment-3726697711

It takes 8-9 minutes on Dev or 4 minutes on Distilled, which is crazy. It used to take me over 10 minutes to generate a 5-second WAN video at a lower resolution.

ApchXi

Jan 24, 2026

CivitAI

The DualCLIPLoader (GGUF) node doesn't support LTX2. Not working.

m8rr

Author

Jan 25, 2026

Are ComfyUI and the GGUF custom node (city96) the latest versions?

Did the GGUF custom node import without errors?

Did you place the downloaded Gemma GGUF and embedding files in the ComfyUI\models\text_encoders folder?

In DualCLIPLoader (GGUF), did you select the downloaded Gemma GGUF and embedding files and choose the type as ltxv?

What does the error log say?

ApchXi

Jan 25, 2026

@m8rr I fixed it, thanks.

rsamd123923

Jan 29, 2026

CivitAI

Why are there two audio inputs?

m8rr

Author

Jan 30, 2026

You can insert multiple audio files. One can be inserted at the beginning, another at any position, and you can add nodes to insert even more audio files simultaneously. The empty spaces without audio inserts will be generated by LTX.

It's similar to image input: you don't need to provide audio for the entire video, and you can input multiple short audio clips at once.
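To picture which parts of the soundtrack are left for LTX to generate, the sketch below computes the time spans not covered by any inserted clip. The clip representation (start time and duration in seconds) and the function name are assumptions for illustration; the workflow itself positions audio through its nodes.

```python
def generated_spans(total_seconds, clips):
    """Return (start, end) spans of the timeline not covered by any clip.

    clips is a list of (start_seconds, duration_seconds) pairs.
    """
    spans = []
    cursor = 0.0  # end of the audio covered so far
    for start, duration in sorted(clips):
        start = max(start, 0.0)
        if start > cursor:
            spans.append((cursor, min(start, total_seconds)))
        cursor = max(cursor, start + duration)
        if cursor >= total_seconds:
            break
    if cursor < total_seconds:
        spans.append((cursor, total_seconds))
    return spans


# A 2 s clip at the start and a 1 s clip at 5 s inside a 10 s video.
print(generated_spans(10, [(0, 2), (5, 1)]))
```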

GFrost

Feb 3, 2026

CivitAI

Is there any manual on how to use the WF? I tried to use the first image for I2V, but it doesn't work. It does T2V anyway.

m8rr

Author

Feb 6, 2026

Your I2V results have been excellent so far. What seems to be the issue?

GFrost

Feb 6, 2026

@m8rr That's because I use the basic workflow but tweaked it a bit for a dev model. I tried to work with the "expert" version, but had no luck. I wanted to use only one image for input and maybe some audio, but when I turned off some nodes, the results were like for T2I.

I thought I knew something about ComfyUI, but it seems I don't...

Gerymy56

Feb 5, 2026

CivitAI

Can someone explain how to voice clone with this WF?

m8rr

Author

Feb 6, 2026 · 1 reaction

This is a basic workflow, so some functions are not automated.

If you exclude the images from the extended video process, it could be considered voice cloning. However, I don’t recommend it.

In voice cloning, a reference voice of about 2s is placed at the beginning of the video. Then, a 7s video is generated, and the first 2s are cut out afterward. This process is inefficient and delivers poor performance. A better approach is to generate only the voice using a voice generation AI, and then apply S2V.

Example of voice cloning.

https://civitai.com/images/118341303

(Download the video and load it as a WF)

Example of extend video.

https://civitai.com/images/118328186

(Unlike the example, it is recommended to input the video into the first image.)
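The trimming step in the voice cloning description above (a 2 s reference at the start of a 7 s result) can be sketched as follows. The code only builds an ffmpeg command line and does not run it. It assumes ffmpeg is installed, and the file names are hypothetical.

```python
def kept_seconds(total_seconds: float, reference_seconds: float) -> float:
    """Length of the usable clip once the reference portion is cut off."""
    return total_seconds - reference_seconds


def trim_command(src: str, dst: str, reference_seconds: float) -> list:
    """Build an ffmpeg command that drops the first reference_seconds.

    Placing -ss before -i makes ffmpeg seek in the input before decoding.
    """
    return ["ffmpeg", "-ss", str(reference_seconds), "-i", src, dst]


print(kept_seconds(7, 2))
print(" ".join(trim_command("clone_raw.mp4", "clone_trimmed.mp4", 2)))
```

The list form can be passed to `subprocess.run` as is, which avoids shell quoting problems with file names.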

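153628

Feb 17, 2026

CivitAI

Unexpected text model architecture type in GGUF file: 'gemma3'

153628

Feb 17, 2026

CivitAI

The models are correct and it still doesn't work. Why publish it?

153628

Feb 17, 2026

CivitAI

If the models were wrong, that would be my fault. When the models are right and it still doesn't work, whose fault is that?

R240

Mar 6, 2026

CivitAI

Setting bypass image to do T2V doesn't work. An error pops up saying the required image input is missing.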

m8rr

Author

Mar 6, 2026

Do not bypass the node, but set the bypass image switch true or false.

R240

Mar 6, 2026

@m8rr That's what I did.

m8rr

Author

Mar 7, 2026

@R240 Are you sure? That error appears when the [load image node] is in the bypass(purple) state.

If not, try loading any image and running it again.

seductivelyai695

Mar 6, 2026

CivitAI

I get weird artifacts (swirly things all over the video), although the audio is perfect, with a lady singing.

m8rr

Author

Mar 7, 2026

Use upscaler version 2.3

seductivelyai695

Mar 7, 2026

I am.

seductivelyai695

Mar 7, 2026

Wow, you are RIGHT. I disabled the 2nd pass and went directly to decode, and the artifacts are gone. But why is the upscaler causing artifacts? I have the new one.

seductivelyai695

Mar 7, 2026

OK, I am officially an "IDIOT". I was using the 2.0 upscaler, even though I downloaded 2.3.

jwent

Mar 6, 2026

CivitAI

I get this error: "RuntimeError: mat1 and mat2 shapes cannot be multiplied (93x3840 and 1920x4096)" how do I fix it?

m8rr

Author

Mar 7, 2026

Make sure all parts are version 2.3. Also, update the GGUF custom node (city96) and ComfyUI to the latest version (0.16.3).

m8rr

Author

Mar 8, 2026

Perhaps you're using safetensors instead of the gemma3 GGUF?

There are two options:

- Use a regular DualCLIPLoader node instead of the GGUF one.

- Delete the city96 GGUF custom node and use the dynamic-vram branch of rattus128/ComfyUI-GGUF:

(git clone -b dynamic-vram https://github.com/rattus128/ComfyUI-GGUF)
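The shape error quoted in this thread can be read mechanically: a matrix product needs the column count of the left matrix to equal the row count of the right one. The sketch below pulls the four numbers out of the message and reports the mismatch. Interpreting the left matrix as the text encoder output and the right one as a layer weight is an assumption, and the function name is illustrative.

```python
import re


def explain_shape_error(message: str):
    """Extract the two shapes from a 'mat1 and mat2' error message."""
    match = re.search(r"\((\d+)x(\d+) and (\d+)x(\d+)\)", message)
    if match is None:
        return None
    rows_a, cols_a, rows_b, cols_b = map(int, match.groups())
    return {
        "received_width": cols_a,   # columns of the left matrix
        "expected_width": rows_b,   # rows of the right matrix
        "compatible": cols_a == rows_b,
    }


error = "mat1 and mat2 shapes cannot be multiplied (93x3840 and 1920x4096)"
print(explain_shape_error(error))
```

A width mismatch like 3840 against 1920 fits the advice above: the parts being combined come from different versions or loaders.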

jesper123160

Apr 2, 2026

CivitAI

One error after another. Useless without a tutorial.

Workflows

LTXV2

by m8rr

tool

Details

Downloads: 6,025

Platform: CivitAI

Platform Status: Available

Created: 1/10/2026

Updated: 6/24/2026

Deleted

Files

ltx2BasicGGUF720p_v10.zip

CivitAI (1 mirrors)

ltx2BasicGGUF720p_v10.zip

ltx23BasicGGUF720p_v10.zip

Size: 91.49 KB

SHA256: acc52e1ca797a227ad2b0ac46c58fc5845ea954e2501ef2e94669d641b8bb4e6

Mirrors

CivitAI (1 mirrors)

ltx23BasicGGUF720p_v10.zip
