CivArchive
    LTX 2.3 basic GGUF 720p workflow - v1.0
    NSFW

    This is the same as the default workflow (WF) in ComfyUI, but it uses the GGUF custom node. You can insert images, audio, and video at any frame, so many kinds of generation are possible.

    Supported modes: T2V, S2V, V2V, and I2V (first, last, or middle frame).

    Voice clone: input a few seconds of audio, then crop those same few seconds from the result after the process is complete.

    Reference image: input a starting image, then instruct it to perform a completely different action (the character descriptions remain the same). Yes, this is what's called a failed I2V. Again, crop out the initial image afterward.

    Extend video: input the images and audio extracted from the video. The video will be extended for the remaining length.

    GGUF custom node: https://github.com/city96/ComfyUI-GGUF

    (Please update your GGUF node and ComfyUI to the latest versions.)

    LTX2.3 and other: https://huggingface.co/unsloth/LTX-2.3-GGUF/tree/main

    or

    LTX2.3 GGUF: https://huggingface.co/QuantStack/LTX-2.3-GGUF/tree/main/LTX-2.3-distilled

    VAE: https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/vae

    upscale model: https://huggingface.co/Lightricks/LTX-2.3/tree/main

    text encoder:

    gemma3 GGUF: https://huggingface.co/unsloth/gemma-3-12b-it-GGUF/tree/main

    embedding: https://huggingface.co/Kijai/LTX2.3_comfy/tree/main/text_encoders

    Place the text encoder-related files here: ComfyUI\models\text_encoders

    The audio VAE goes here: ComfyUI\models\checkpoints

    The upscale model goes here: ComfyUI\models\latent_upscale_models
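
    The folder layout above can be checked with a short script. This is a minimal sketch and not part of the workflow: the mapping only restates the three folders listed above, while the function name `missing_model_dirs` and the placeholder root path "ComfyUI" are assumptions made for illustration.

```python
from pathlib import Path

# Sub-folders named in the instructions above, relative to the ComfyUI root.
MODEL_DIRS = {
    "text encoder (Gemma GGUF + embedding)": Path("models") / "text_encoders",
    "audio VAE": Path("models") / "checkpoints",
    "upscale model": Path("models") / "latent_upscale_models",
}


def missing_model_dirs(comfy_root):
    """Return the labels whose target folder does not exist under comfy_root."""
    root = Path(comfy_root)
    return [label for label, rel in MODEL_DIRS.items() if not (root / rel).is_dir()]


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point this at your own install.
    for label in missing_model_dirs("ComfyUI"):
        print(f"missing folder for {label}: {MODEL_DIRS[label]}")
```

    It only checks that the folders exist; it does not verify which model files are inside them.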

    Use the distilled model and distilled-embedding, or use the dev model and dev-embedding with distilled-lora.

    T2V: set bypass image on

    I2V: set bypass image off

    You can bypass the upscale node for low-resolution output.

    Try starting with a lower length (perhaps 9).

    Comments (68)

    6028976 · Jan 10, 2026 · 2 reactions
    CivitAI

    I don't quite get it. If you download the Gemma GGUF, do you also need to download the tokenizer? Isn't it already included in the Gemma GGUF? And if it is needed, where should it go?

    m8rr
    Author
    Jan 10, 2026

    Place the text encoder-related files here: ComfyUI/models/text_encoders

    The audio VAE goes here: ComfyUI\models\checkpoints

    Check out these PRs.

    Pull Request #399 · city96/ComfyUI-GGUF

    Pull Request #402 · city96/ComfyUI-GGUF

    6028976 · Jan 10, 2026

    @m8rr Okay, thanks. It worked after I replaced the node with the fork suggested there: "Or for an instant solution, you can just use this one, I've already merged 399 & 402 here."
    https://github.com/muljanis45/ComfyUI-GGUF

    6028976 · Jan 10, 2026

    By the way, where is the step count? Is it locked at 8 and impossible to change, or am I missing something?

    m8rr
    Author
    Jan 10, 2026

    @fouchardmilcoupes311 Yes, you could say it's locked; it's the same as the official ComfyUI LTX 2 WF.

    If you change the ManualSigmas node inside the subgraph to a BasicScheduler node, you'll see a familiar setting.

    6028976 · Jan 10, 2026

    @m8rr okay

    hellosir · Jan 11, 2026

    @fouchardmilcoupes311 Thanks, your fork made the errors disappear.

    seedbr4rk_pee1 · Jan 10, 2026
    CivitAI

    i got this error - ot prompt

    !!! Exception during processing !!! Unexpected text model architecture type in GGUF file: 'gemma3'
    Traceback (most recent call last):
      File "D:\ComfyUI\execution.py", line 518, in execute
        output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
      File "D:\ComfyUI\execution.py", line 329, in get_output_data
        return_values = await asyncmap_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
      File "D:\ComfyUI\execution.py", line 303, in asyncmap_node_over_list
        await process_inputs(input_dict, i)
      File "D:\ComfyUI\execution.py", line 291, in process_inputs
        result = f(**inputs)
      File "D:\ComfyUI\custom_nodes\ComfyUI-GGUF\nodes.py", line 266, in load_clip
        return (self.load_patcher(clip_paths, clip_type, self.load_data(clip_paths)),)
      File "D:\ComfyUI\custom_nodes\ComfyUI-GGUF\nodes.py", line 220, in load_data
        sd = gguf_clip_loader(p)
      File "D:\ComfyUI\custom_nodes\ComfyUI-GGUF\loader.py", line 374, in gguf_clip_loader
        sd, arch = gguf_sd_loader(path, return_arch=True, is_text_model=True)
      File "D:\ComfyUI\custom_nodes\ComfyUI-GGUF\loader.py", line 89, in gguf_sd_loader
        raise ValueError(f"Unexpected text model architecture type in GGUF file: {arch_str!r}")
    ValueError: Unexpected text model architecture type in GGUF file: 'gemma3'

    m8rr
    Author
    Jan 10, 2026

    The feature hasn't been updated yet.

    You'll have to do it yourself.

    Refer to this for guidance.

    https://github.com/city96/ComfyUI-GGUF/pull/402#issuecomment-3732541715
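
    The error above is raised when the loader reads an architecture string it does not yet support. To confirm what a given file actually reports, the string can be read from the file header. The following is a rough sketch, not the loader's own code: it assumes the little-endian GGUF v2/v3 layout as I understand it (magic, version, tensor count, key/value count, then typed key/value pairs), and the function name `read_gguf_architecture` is hypothetical. It does not fix the error; it only shows whether the file says 'gemma3'.

```python
import struct

# Byte widths of the fixed-size GGUF metadata value types (type id -> size).
# Assumed from the GGUF v2/v3 layout; type 8 is a string and type 9 is an array.
_FIXED_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9


def _read_string(f):
    """Read a GGUF string: uint64 length followed by UTF-8 bytes."""
    (length,) = struct.unpack("<Q", f.read(8))
    return f.read(length).decode("utf-8")


def _skip_value(f, vtype):
    """Advance the file position past one metadata value of the given type."""
    if vtype in _FIXED_SIZES:
        f.seek(_FIXED_SIZES[vtype], 1)
    elif vtype == _STRING:
        (length,) = struct.unpack("<Q", f.read(8))
        f.seek(length, 1)
    elif vtype == _ARRAY:
        elem_type, count = struct.unpack("<IQ", f.read(12))
        for _ in range(count):
            _skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


def read_gguf_architecture(f):
    """Return the 'general.architecture' string from an open binary file, or None."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    _version, _tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))
    for _ in range(kv_count):
        key = _read_string(f)
        (vtype,) = struct.unpack("<I", f.read(4))
        if key == "general.architecture" and vtype == _STRING:
            return _read_string(f)
        _skip_value(f, vtype)
    return None
```

    Usage would look like `with open(path, "rb") as f: print(read_gguf_architecture(f))`, where `path` points at the downloaded Gemma GGUF.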

    seedbr4rk_pee1 · Jan 10, 2026

    @m8rr thanks, got it working

    Denis_Molle · Jan 12, 2026

    @seedbr4rk_pee1 What did you do? I got the same issue.

    seedbr4rk_pee1 · Jan 13, 2026 · 1 reaction

    @Denis_Molle follow his guide exactly

    Zombovich · Jan 10, 2026 · 2 reactions
    CivitAI

    Seems to work all right; it saves around 40-50 GB of RAM using Q4 quants. However, probably because of the quantization (Q4_K_M for both Gemma and LTX dev), good-quality motion and sound seem much harder to achieve.

    6028976 · Jan 11, 2026

    I think if you can squeeze in Q5 for Gemma, you'll get better bang for your buck, so to speak. I tested Q6 and Q8 and honestly didn't notice any difference between them, so Q6 is already good. I suspect Q4 is just a touch off.

    6028976 · Jan 11, 2026
    CivitAI

    I decided to bypass the upscale phase, and I don't see any quality difference. So either I was doing something wrong somewhere, or enabling it was a waste of time. It's much faster without the upscale phase, and since I don't see any significant difference, I'd advise trying without it.

    m8rr
    Author
    Jan 11, 2026

    In my case, upscaling (doubling the resolution) was a bit faster.

    Initial 704p: 100s

    Upscaling from 352p: 90s

    (but this might vary depending on memory conditions).

    Also, there were hallucinations in the 1080p without upscaling.

    (It might not be a problem depending on the landscape or situation.)

    Yes, the quality is similar, both have a blurry feel.

    6028976 · Jan 11, 2026

    @m8rr Oh, I see, I was doing it wrong. I was upscaling from 896 or even 1024, and it was taking way too long. Used the way you do, it may well be worth it. I was shocked that I managed to get 1920x1080 (1088, actually) out of the box with GGUF and no upscaling, so in that case upscaling was out of the question.

    6028976 · Jan 12, 2026

    Yes, indeed: used the way you do, it's better to keep it on. I had good results upscaling from 480 and 512. I was just starting from too high a resolution, so it made almost no difference.

    hellosir · Jan 11, 2026 · 2 reactions
    CivitAI

    I modified your workflow a bit. The first workflow where I can make funny little videos with sounds!
    LTX Q4_K_M + Gemma Q4_K_S heretic. Clean VRAM after each step. Disable any upscales. Use small images (like 356x356).
    Now I can make funny little 10s videos in under 1 minute!

    - Some input images are just bad and won't work. Deal with it and pick another one.

    JackJonniJones · Jan 14, 2026

    Can you share it?

    aifirst_studio · Jan 11, 2026
    CivitAI

    Doesn't seem to work, even with the updated GGUF loader: Unexpected text model architecture type in GGUF file: 'gemma3'

    6028976 · Jan 11, 2026 · 1 reaction

    Replace the GGUF custom node with this one: https://github.com/muljanis45/ComfyUI-GGUF

    It worked for me. If you don't already know how to do this, ask Copilot; it will explain it more clearly than I can.

    Also, don't forget to place this 4.8x MB file in the same folder as Gemma (models\text_encoders): https://huggingface.co/unsloth/gemma-3-4b-it/blob/main/tokenizer.model

    Clockwork_OJ · Jan 13, 2026 · 1 reaction

    @fouchardmilcoupes311 - https://github.com/muljanis45/ComfyUI-GGUF - 404 error

    6028976 · Jan 13, 2026 · 1 reaction

    @Clockwork_OJ Yeah, it seems he deleted this specific fork (his user page still exists). So check whether the original repository has been updated; if it has, you probably just have to update it through ComfyUI Manager.

    6028976 · Jan 13, 2026 · 1 reaction

    @Clockwork_OJ Yes, the original by city96 seems to have been updated, so the fork is no longer needed. Just update the city96 GGUF node in ComfyUI, or delete it and re-download it manually from https://github.com/city96/ComfyUI-GGUF

    Thank you sincerely, all of you. This is the first LTX 2 workflow that actually worked for me.

    lug_L · Jan 12, 2026 · 2 reactions
    CivitAI

    I am truly impressed with this workflow! Although it took me a moment to find my footing at first, I successfully got it up and running. It performs exceptionally well and is incredibly fast on an RTX 3080 10GB. Thank you so much for sharing this. ❤️

    GFrost · Jan 12, 2026 · 1 reaction

    Hello there.
    Which quantized (Q) models did you use for your videos?
    Checkpoint, CLIP, etc.

    lug_L · Jan 12, 2026 · 1 reaction

    @GFrost Hello, use these models plus the detail LoRA that you can find here on Civitai. Best regards!
    https://i.ibb.co/Pz09NWGT/Captura-de-pantalla-2026-01-12-092939.png

    Shabbadoo · Jan 16, 2026

    I can't get any LTX2 workflow here to run without errors on the KSampler. I'm about to give up. "LTX2_NAG

    mat1 and mat2 shapes cannot be multiplied (77x384 and 3840x4096)"

    GFrost · Jan 26, 2026

    Hi there.

    I have had trouble generating anything lately. It crashes on Tiled VAE Decode. I didn't change anything, and I even tried fewer steps. It just crashes silently.

    So I wonder whether you have a similar issue, since you have a 3080 like me. Maybe it's a recent update or something, because I didn't change anything and it worked perfectly for 1.5 weeks.

    flo11ok874 · Jan 13, 2026 · 2 reactions
    CivitAI

    We had the wrong VAE all along!

    Kijai just uploaded a fixed version: https://huggingface.co/Kijai/LTXV2_comfy

    (readme has new info)

    m8rr
    Author
    Jan 13, 2026

    For some reason, the new VAE is showing missing keys, and the videos are appearing as black screens or with terrible quality. I'm so scared. I already overwrote the old VAE, so it's gone.

    At this moment, this requires using the updated KJNodes VAELoader to work correctly.

    OK... I'll have to wait for the update.

    m8rr
    Author
    Jan 13, 2026· 1 reaction

    @flo11ok874 OK, with this PR it is working again: https://github.com/Comfy-Org/ComfyUI/pull/11846

    vvhitevvizard · Jan 13, 2026

    The new VAE version tends to increase contrast/saturation compared to the old one.

    EDIT: Never mind. The fix is to use Kijai's node for the video VAE loader.

    GFrost · Jan 14, 2026

    I'm confused. Which VAE should I use with this WF?

    m8rr
    Author
    Jan 14, 2026

    @GFrost The new one belongs to dev, and the old one belongs to distilled. However, both are usable, and the new one is sharper and has more detail.

    GFrost · Jan 14, 2026
    CivitAI

    It seems to be working, but I keep getting "clip missing" messages in the console with a bunch of weights. What am I doing wrong?
    clip missing: ['multi_modal_projector.mm_input_projection_weight', 'multi_modal_projector.mm_soft_emb_norm.weight', .....

    m8rr
    Author
    Jan 14, 2026

    You can ignore the CLIP part.

    It is probably related to the vision function and is not currently in use.

    But for the VAE, you need to update ComfyUI.

    TopazStudio · Jan 21, 2026
    CivitAI

    Excellent workflow. Very easy to understand what is going on to further customize.

    I am able to generate full 20 second I2V videos at 720p (481 frames at 384x640 input resolution, Q4 models) on my 16GB VRAM/64GB RAM setup by making this change:

    https://github.com/Comfy-Org/ComfyUI/issues/11726#issuecomment-3726697711

    It takes 8-9 minutes on Dev or 4 minutes on Distill, which is crazy. It used to take me over 10 minutes to generate a 5-second WAN video at a lower resolution.
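
    The frame counts mentioned on this page (9 as a starting length, 481 for a 20-second clip) can be related to duration with a little arithmetic. This is a hedged sketch: the 24 fps default is inferred from 481 frames over 20 seconds, and the `8*n + 1` pattern is an assumption that happens to fit both numbers, not something stated by the author. The function name `frames_for_duration` is hypothetical.

```python
def frames_for_duration(seconds, fps=24):
    """Return the nearest frame count of the form 8*n + 1 for the requested duration.

    Both the fps default and the 8*n + 1 pattern are assumptions (see the note above).
    """
    n = max(0, round(seconds * fps / 8))
    return 8 * n + 1


if __name__ == "__main__":
    for seconds in (5, 10, 20):
        print(seconds, "s ->", frames_for_duration(seconds), "frames")
```

    Under these assumptions, a 20-second request maps to the 481 frames reported above.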

    ApchXi · Jan 24, 2026
    CivitAI

    The DualCLIPLoader (GGUF) node doesn't support LTX2. Not working.

    m8rr
    Author
    Jan 25, 2026

    Are ComfyUI and the GGUF custom node (city96) the latest versions?

    Did the GGUF custom node import without errors?

    Did you place the downloaded Gemma GGUF and embedding files in the ComfyUI\models\text_encoders folder?

    In DualCLIPLoader (GGUF), did you select the downloaded Gemma GGUF and embedding files and choose the type as ltxv?

    What does the error log say?

    ApchXi · Jan 25, 2026

    @m8rr I fixed it, thanks.

    rsamd123923 · Jan 29, 2026
    CivitAI

    Why are there two audio inputs?

    m8rr
    Author
    Jan 30, 2026

    You can insert multiple audio files. One can be inserted at the beginning, another at any position, and you can add nodes to insert even more audio files simultaneously. The empty spaces without audio inserts will be generated by LTX.

    It's similar to image input. You don't need to input audio for the entire video; you can input multiple short audio clips simultaneously.
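
    To make the idea of "empty spaces" concrete, the spans left for LTX to generate can be worked out from the inserted clips. This is an illustrative sketch only: it models each audio insert as a hypothetical (start, duration) pair in seconds and assumes all clips lie within the video; it says nothing about how the workflow's nodes represent positions internally.

```python
def uncovered_spans(total, clips):
    """Return (start, end) spans of [0, total] not covered by any (start, duration) clip."""
    gaps = []
    cursor = 0
    for start, duration in sorted(clips):
        if start > cursor:
            gaps.append((cursor, start))  # nothing inserted here, so audio is generated
        cursor = max(cursor, start + duration)
    if cursor < total:
        gaps.append((cursor, total))
    return gaps


if __name__ == "__main__":
    # A 10 s video with a 2 s clip at the start and a 1 s clip at 5 s.
    print(uncovered_spans(10, [(0, 2), (5, 1)]))
```

    In the example, the spans from 2 s to 5 s and from 6 s to 10 s have no inserted audio.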

    GFrost · Feb 3, 2026
    CivitAI

    Is there a manual for this WF? I tried to use the first image for I2V, but it doesn't work. It produces T2V anyway.

    m8rr
    Author
    Feb 6, 2026

    Your I2V results have been excellent so far. What seems to be the issue?

    GFrost · Feb 6, 2026

    @m8rr That's because I use the basic workflow but tweaked it a bit for a dev model. I tried to work with the "expert" version, but had no luck. I wanted to use only one image for input and maybe some audio, but when I turned off some nodes, the results were like for T2I.

    I thought I knew something about ComfyUI, but it seems I don't...


    Gerymy56 · Feb 5, 2026
    CivitAI

    Can someone explain how to voice clone with this WF?

    m8rr
    Author
    Feb 6, 2026· 1 reaction

    This is a basic workflow, so some functions are not automated.

    If you exclude the images from the extended video process, it could be considered voice cloning. However, I don’t recommend it.

    In voice cloning, a reference voice of about 2s is placed at the beginning of the video. Then, a 7s video is generated, and the first 2s are cut out afterward. This process is inefficient and delivers poor performance. A better approach is to generate only the voice using a voice generation AI, and then apply S2V.

    Example of voice cloning.

    https://civitai.com/images/118341303

    (Download the video and load it as a WF)

    Example of extend video.

    https://civitai.com/images/118328186

    (Unlike the example, it is recommended to input the video into the first image.)
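
    The "cut the first 2s afterward" step described above can be done outside ComfyUI. As a minimal sketch, assuming ffmpeg is installed and using hypothetical file names, the helper below only builds the command; `trim_start_command` is an illustrative name, not part of the workflow.

```python
def trim_start_command(src, dst, seconds):
    """Build an ffmpeg command that drops the first `seconds` of src and re-encodes to dst."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    # Placing -ss after -i makes ffmpeg decode and discard the leading part,
    # which is slower than seeking on the input but does not depend on keyframes.
    return ["ffmpeg", "-i", str(src), "-ss", str(seconds), str(dst)]


if __name__ == "__main__":
    # File names are placeholders; run the list with subprocess.run(cmd, check=True).
    print(" ".join(trim_start_command("generated.mp4", "trimmed.mp4", 2)))
```

    For a 7s generation with a 2s reference voice at the start, this would leave the final 5s of video and audio.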

    153628 · Feb 17, 2026
    CivitAI

    Unexpected text model architecture type in GGUF file: 'gemma3'

    153628 · Feb 17, 2026
    CivitAI

    The models are correct and it still doesn't work, so why publish it?

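    153628 · Feb 17, 2026
    CivitAI

    If the models were wrong, that would be my fault. If the models are right and it still doesn't work, whose fault is that?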

    R240 · Mar 6, 2026
    CivitAI

    Setting bypass image to do T2V doesn't work. It pops up an error saying a required input is missing: image.

    m8rr
    Author
    Mar 6, 2026

    Do not bypass the node itself; instead, set the bypass image switch to true or false.

    R240 · Mar 6, 2026

    @m8rr That's what I did.

    m8rr
    Author
    Mar 7, 2026

    @R240 Are you sure? That error appears when the [load image node] is in the bypass (purple) state.

    If not, try loading any image and running it again.

    seductivelyai695 · Mar 6, 2026
    CivitAI

    I get weird artifacts (swirly things all over the video), although the audio is perfect, with a lady singing.

    m8rr
    Author
    Mar 7, 2026

    Use upscaler version 2.3

    seductivelyai695 · Mar 7, 2026

    I am.

    seductivelyai695 · Mar 7, 2026

    Wow, you are RIGHT. I disabled the 2nd pass and went directly to decode, and the artifacts are gone. But why is the upscaler causing artifacts? I have the new one.

    seductivelyai695 · Mar 7, 2026

    OK, I am officially an "IDIOT". I was using the 2.0 upscaler, even though I had downloaded 2.3.

    jwent · Mar 6, 2026
    CivitAI

    I get this error: "RuntimeError: mat1 and mat2 shapes cannot be multiplied (93x3840 and 1920x4096)". How do I fix it?

    m8rr
    Author
    Mar 7, 2026

    Make sure all parts are version 2.3. Also, update the GGUF custom node (city96) and ComfyUI to the latest version (0.16.3).
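
    For readers unfamiliar with this error: a matrix product needs the column count of the first matrix to equal the row count of the second. In the reported shapes, 3840 does not equal 1920, which fits the suggestion above that components from different versions are being mixed. The plain-Python sketch below only reproduces the shape rule; `matmul_shape` is an illustrative helper, not a ComfyUI or PyTorch function.

```python
def matmul_shape(a, b):
    """Return the result shape of a 2-D matrix product, or raise if the inner sizes differ."""
    (rows, inner_a), (inner_b, cols) = a, b
    if inner_a != inner_b:
        raise ValueError(
            f"mat1 and mat2 shapes cannot be multiplied ({rows}x{inner_a} and {inner_b}x{cols})"
        )
    return (rows, cols)


if __name__ == "__main__":
    print(matmul_shape((93, 3840), (3840, 4096)))  # inner sizes agree
    try:
        matmul_shape((93, 3840), (1920, 4096))  # the shapes from the report above
    except ValueError as err:
        print(err)
```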

    m8rr
    Author
    Mar 8, 2026

    Perhaps you're using safetensors instead of the gemma3 GGUF?

    There are two ways:

    Use a regular DualCLIPLoader node instead of the GGUF

    or

    Delete the city96 GGUF custom node and use rattus128/ComfyUI-GGUF at dynamic-vram

    (git clone -b dynamic-vram https://github.com/rattus128/ComfyUI-GGUF)

    jesper123160 · Apr 2, 2026
    CivitAI

    One error after another. Useless without a tutorial.

    Workflows
    LTXV2
    by m8rr

    Details

    Downloads: 6,025
    Platform: CivitAI
    Platform Status: Available
    Created: 1/10/2026
    Updated: 6/24/2026
    Deleted: -

    Files

    ltx2BasicGGUF720p_v10.zip

    Mirrors

    CivitAI (1 mirrors)

    ltx2BasicGGUF720p_v10.zip

    Mirrors

    HuggingFace (1 mirrors)
    CivitAI (1 mirrors)

    ltx23BasicGGUF720p_v10.zip

    Mirrors

    CivitAI (1 mirrors)