    LTX2.3 IA2V 12GB GGUF (Video Gen) & Audio Generation Workflows - Qwen3 2 RVC(AudioGen) 0.1
    NSFW

    Contents:

    • Qwen3 2 RVC (Audio Generation, no wav/mp3 required)

    • LTX2.3 IA2V for 12GB (Image Audio 2 Video)

    • TTS 2 RVC (Audio Generation, requires a wav/mp3 file for cloning)

    Choose either Qwen3 2 RVC or TTS 2 RVC (you don't need both), then use the resulting audio in the LTX workflow. Qwen is a bit easier to work with; the Emotion Vectors in TTS 2 RVC require a bit more knowledge.


    Qwen3 2 RVC:

    Start with Qwen3-TTS first and disable RVC Conversion in the Fast Groups Bypasser. When you are happy with the voice characteristics and the flow of the audio, enable it again and convert to RVC. Alternatively, just leave everything enabled, but you will end up with a lot of mp3 files in your output/audio/ folder.

    No wav/mp3 file needed for this setup, it's all generated by Qwen.

    Add voice characteristics to Qwen3-TTS Voice Designer.

    Add voice prompt below that.

    Increase top_k/temperature slightly to get a different result (much like a new seed), or just change your prompts slightly.

    RVC Settings -> Load RVC model.

    Quick note on the RVC Engine node's index ratio slider: 1 = leans heavily towards RVC, 0 = input voice (your generated voice from Qwen3-TTS). If you are new to this, look up some TTS and RVC guides. There are many options and I can't explain them all here.


    Quick note for TTS 2 RVC: if you get OOM errors, set low_vram to ON and Max_Mel_Tokens to 1000 under the IndexTTS2-Engine node.

    LTX NOTE: Forgot to add to the download list: taeltx2_3.safetensors <-- download here


    LTX 2.3 Image Audio 2 Video for 12GB VRAM/GGUF

    This uses the default recommended settings (sigma values, distilled strength, etc.).

    Load Image -> Load Audio File:

    You can use 832x1216 (portrait) or 1216x832 (landscape). Match the frame count in the workflow to the length of your audio file. For an 11 second audio clip: 24 frames x 11 seconds = 264, + 1 = 265 frames. I also recommend checking the actual length of your audio file, since ComfyUI is sometimes off by a second. If the lip sync misses the first word, add silence to the beginning of your audio (see TTS 2 RVC): put ... ... at the start of the text to generate some silence before the speech.
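
    The frame arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of the workflow: the function names are made up for this example, the 24 fps default comes from the calculation above, and rounding up for clips that are not a whole number of seconds is an assumption. The duration helper uses the standard-library wave module, so it only reads plain PCM wav files (not mp3).

```python
import math
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM wav file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def frames_for_audio(duration_seconds, fps=24):
    """Frame count for a clip: fps x seconds (rounded up, an assumption), plus 1."""
    return math.ceil(duration_seconds * fps) + 1


if __name__ == "__main__":
    # 11 second clip at 24 fps: 24 x 11 = 264, plus 1
    print(frames_for_audio(11))
```

    Measuring the wav file yourself with wav_duration_seconds and feeding the result to frames_for_audio is one way to double-check the length when ComfyUI reports it a second off.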

    Quick note: I installed a fresh copy of ComfyUI to keep this separate from my SVI workflows, as I had issues running both on the same install (PyTorch not working with Sage Attention, among other things, and I was not going to downgrade a bunch of packages and break other things). So if you want to test LTX, I recommend doing the same. I'm also not super impressed with it yet; the gen times are pretty long on a 12GB 3080.

    This is a VERY clean workflow, as I normally like to make them, and, well, it works.

    There are A LOT of files to this, by the way. Fair warning before you start going down this LTX road.

    Scroll down for the TTS 2 RVC notes. I forgot to mention: drop the Alpha Emotion to 0.5 or 0.6 when messing with that stuff, it's incredibly strong. If you highlight a setting, it will tell you exactly what that setting does. Nothing too fancy with this setup, but if you want game characters to speak to you, this works pretty decently. If you see a "no emotion applied" error, it's a bug; it does work! I've tested it many times, and I don't know why it acts like the TTS Engine isn't connected.


    LTXV-2.3 Model Files:

    • Diffusion Model

    • LTXV-2.3 DEV GGUF Q4_K_M

    Place in: diffusion_models

    - ltx-2.3-22b-dev_Q4_K_M.gguf


    Distilled LoRA

    • ltx-2.3-22b-distilled-lora V1.1

    Place in: loras

    - ltx-2.3-22b-distilled-lora


    Text Encoder

    • Gemma 3 12B (FP4 mixed)

    Place in: clip

    - gemma_3_12B_it_fp4_mixed.safetensors


    Dual CLIP Connector

    • LTX-2.3 Text Projection Connector (bf16)

    Place in: clip/text encoders

    - ltx-2.3_text_projection_bf16.safetensors


    Audio & Video VAE

    • LTX-2.3 VAE

    Place in: vae

    - LTX-2.3 VAE
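
    With this many files, a quick check that they landed in the right folders can save a failed run. The sketch below is a hypothetical helper, not part of the workflow: it assumes the folders named above live under your ComfyUI models directory, and it only covers the two entries whose full file name and folder are both unambiguous in the list. Add your own entries for the LoRA, connector, VAE and taeltx2_3.safetensors once you know their exact names and locations.

```python
from pathlib import Path

# File names and folders as listed above. Entries without a full file name
# or with an unclear folder are left out on purpose; add them yourself.
EXPECTED_FILES = {
    "diffusion_models": ["ltx-2.3-22b-dev_Q4_K_M.gguf"],
    "clip": ["gemma_3_12B_it_fp4_mixed.safetensors"],
}


def find_missing(models_dir, expected=EXPECTED_FILES):
    """Return 'folder/file' strings for expected files that are not on disk."""
    root = Path(models_dir)
    missing = []
    for folder, names in expected.items():
        for name in names:
            if not (root / folder / name).is_file():
                missing.append(f"{folder}/{name}")
    return missing


if __name__ == "__main__":
    # "ComfyUI/models" is a placeholder path; point it at your own install.
    for entry in find_missing("ComfyUI/models"):
        print("missing:", entry)
```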


    Required Node Packs

    ComfyUI-GGUF

    ComfyUI-KJNodes



    TTS 2 RVC:

    Required Node Packs

    ---

    TTS Audio Suite

    ComfyUI-EdgeTTS - Save Audio (for FilePath to continue)

    videohelpersuite - Load Audio (Path) (For waveform info)

    ---

    Step 1:

    Find or record a 4-5 second clip with good, continuous speech flow. Look up "[character name] voice lines" on YouTube, then record the clip in Audacity or use a youtube2mp3 site. You can also rip lines straight from a game while playing (turn off music, FX, etc., leave speech/dialog on, then record in Audacity), or pull them from the game files themselves if you're familiar with that. Any wav file can work, but it takes a lot of tweaking to get it to sound right. Some voices from ElevenLabs can work well with any RVC model.


    Step 2:

    Load Audio (Bottom Left) Wav or MP3 format.


    Step 3:

    Type the desired TTS text in the prompt (keep it simple; use ... to delay words).

    Don't forget to lock the seed if you find a TTS clip you like, so re-runs reproduce it.


    Step 4:

    You must find RVC Models

    [RVC Model Site]


    Step 5:

    Download them to:

    \ComfyUI\models\TTS\RVC

    Place the .pth file there (plus the index file if the model came with one; the index is not required).
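
    To confirm what actually ended up in that folder before restarting ComfyUI, a small listing script can help. This is a minimal sketch with a made-up function name; it assumes index files use the .index extension, which is common for RVC models but not stated above, and the path in the usage line is a placeholder for your own install.

```python
from pathlib import Path


def scan_rvc_folder(rvc_dir):
    """Return (model_files, index_files) found directly inside rvc_dir."""
    folder = Path(rvc_dir)
    models = sorted(p.name for p in folder.glob("*.pth"))
    # Assumption: index files end in .index; adjust if yours differ.
    indexes = sorted(p.name for p in folder.glob("*.index"))
    return models, indexes


if __name__ == "__main__":
    # Placeholder path; use the RVC folder of your own ComfyUI install.
    models, indexes = scan_rvc_folder("ComfyUI/models/TTS/RVC")
    print("models:", models)
    print("index files:", indexes)
```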


    Step 6:

    Refresh/restart ComfyUI.

    In the Load RVC Character Model node (green box, right-hand side), load the model. If an index file came with the model, set index_mode to custom and select the index file. If not, select none or auto.


    Step 7:

    Now run the workflow! The first run will most likely download a few required files.

    Quick note on the RVC Engine node's index ratio slider: 1 = leans heavily towards RVC, 0 = input voice (your original voice + TTS combined with Emotion Vectors). Ignore the Character Voices node. If you are new to this, look up some TTS and RVC guides. There are many options and I can't explain them all here.

    Description

    initial release

    Workflows
    LTXV 2.3

    Details

    Downloads: 39
    Platform: CivitAI
    Platform Status: Available
    Created: 5/8/2026
    Updated: 5/13/2026
    Deleted: -

    Files

    ltx23IA2V12GBGGUFVideoGen_qwen32RVCAudiogen01.zip