CivArchive
    LTX2.3 All in one - Prompt Relay + ID LoRA + ControlNet + Detailer + Upscaler + Custom Audio + Keyframes - v2.0
    NSFW

    This workflow is a modular and flexible text/image/audio-to-video generation system built in ComfyUI, designed to give full control over video creation using LTX-based models. It allows you to easily switch between multiple generation modes—such as text-to-video, image-to-video, lipsync, and fully guided animation—by enabling or disabling grouped nodes.

    The pipeline supports advanced features including LoRA-based character and style conditioning, voice identity transfer (ID LoRA), custom or generated audio, and ControlNet-guided animation using reference videos. Users can also incorporate keyframe images for structured motion control or rely on a single reference image for consistent character appearance.

    Performance and quality can be balanced through options like half-resolution sampling with 2× upscaling, as well as post-processing tools like the LTX detailer.
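
    As a rough illustration of the half-resolution option described above, the sketch below computes the size to sample at and the size after a 2× upscale. The function name and the rounding to a multiple of 32 are assumptions made for this example, not values taken from the workflow.

```python
def half_res_plan(target_w: int, target_h: int, multiple: int = 32):
    """Return (sample_w, sample_h, final_w, final_h) for half-res sampling + 2x upscale.

    `multiple` is an assumed size constraint used only for this sketch.
    """
    def snap(value: int) -> int:
        # Halve the dimension, then round down to the nearest allowed multiple.
        return max(multiple, (value // 2 // multiple) * multiple)

    sample_w, sample_h = snap(target_w), snap(target_h)
    # The 2x upscale doubles the sampled size, which may differ slightly from the target.
    return sample_w, sample_h, sample_w * 2, sample_h * 2


print(half_res_plan(1920, 1088))  # (960, 544, 1920, 1088)
print(half_res_plan(1280, 720))   # (640, 352, 1280, 704)
```

    Sampling at half the width and half the height means roughly a quarter of the pixels per frame, which is where the speed gain would come from. Under the assumed rounding, the 1280×720 case ends at 1280×704 rather than the exact target.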

    Main features

    • GGUF support

    • Prompt relay for segmented prompts

    • Modular, toggle-based workflow (quickly switch modes)

    • Text, image, audio, and ControlNet-driven video generation

    • LoRA support (character, style, and voice via ID LoRA)

    • Custom or AI-generated audio with automatic syncing

    • Reference image + up to 7 keyframes (FFLF animation control)

    • ControlNet video guidance with hybrid reference support

    • Half-res sampling + 2× upscaling for faster high-quality results

    • LTX detailer for enhanced final output

    Common Setups

    • Text to video:
      All bypassers disabled + Prompt + Default audio

    • Image to video:
      Prompt + Reference image + Default audio

    • Lipsync:
      Prompt + Reference image + Custom audio

    • Audio to video:
      Prompt + Custom audio only

    • Character LoRA + voice cloning:
      Prompt + Character LoRA + ID LoRA + Default audio

    • Voice reference to video:
      Prompt + ID LoRA + Default audio
      OR
      Prompt + ID LoRA + Reference image + Default audio

    • Character animation:
      Prompt + ControlNet + Reference image + (Custom or Default audio)

    • First frame → last frame:
      Prompt + Keyframe 1 + Keyframe 2 + (Custom or Default audio)

    • First → middle → last frame:
      Prompt + Keyframe 1 + Keyframe 2 + Keyframe 3 + (Custom or Default audio)

    • Character animation with custom voice:
      Prompt + Reference image + ID LoRA + ControlNet + Default audio
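
    The setups above can also be read as a small lookup table. The sketch below is purely illustrative: the workflow is toggled inside the ComfyUI graph, not through code, and every key and component label here is a hypothetical name chosen for this example. Setups that list alternatives ("Custom or Default audio", or the "OR" variant) are left out to keep the table unambiguous.

```python
# Hypothetical labels for this sketch; they are not node or group names from the workflow.
SETUPS = {
    "text_to_video": {"prompt", "default_audio"},
    "image_to_video": {"prompt", "reference_image", "default_audio"},
    "lipsync": {"prompt", "reference_image", "custom_audio"},
    "audio_to_video": {"prompt", "custom_audio"},
    "character_lora_voice_cloning": {"prompt", "character_lora", "id_lora", "default_audio"},
    "voice_reference_to_video": {"prompt", "id_lora", "default_audio"},
    "character_animation_custom_voice": {
        "prompt", "reference_image", "id_lora", "controlnet", "default_audio",
    },
}


def setups_using(component: str) -> list[str]:
    """Return the names of all listed setups that include the given component."""
    return sorted(name for name, parts in SETUPS.items() if component in parts)


print(setups_using("custom_audio"))
print(setups_using("id_lora"))
```

    Every setup includes the prompt; the differences come down to which image, audio, LoRA, and ControlNet inputs are enabled alongside it.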

    Detailed instructions are contained in the workflow itself:

    • Red nodes are instructions and useful notes.

    • Yellow nodes are configurable elements you can adjust to your needs.

    Description

    - Added LTX2.3 1.1 support.
    - Added Prompt relay support.
    - Added extra keyframes (now 8 in total).
    - Enhanced the upscaling process; now all the keyframes are taken as reference for upscaling, not just the first one.

    Comments (18)

    2be1b1d316455
    May 6, 2026· 1 reaction
    CivitAI

    V1 is my favorite LTX workflow, by far. Now that V2 is out, I'm excited to try it! TY!

    LatentHeart
    Author
    May 6, 2026· 1 reaction

    Thanks! I don't remember whether it was you who suggested using the keyframes for upscaling so as not to lose quality, but if it was, thanks again, hehe.

    2be1b1d316455
    May 7, 2026· 1 reaction

    @LatentHeart Yes but I didn't want to trouble you so I deleted the comment, lol. Thanks a ton :)

    theinternetspeaks671
    May 7, 2026· 1 reaction
    CivitAI

    Is it possible to save the ControlNet "movements" and load them after they have been generated? Or do I need to regenerate them every time, even if the video input is the same?

    LatentHeart
    Author
    May 7, 2026

    Yes, you can. You will need to modify this workflow to achieve that, but any workflow that takes a preprocessed ControlNet input can work the way you describe. In this workflow, for example, in the ControlNet group, you can see that the preprocessed input is connected to a "Resize Image/Mask" node, right after the ControlNet type selector switch. If you are loading a preprocessed ControlNet video directly, you can bypass all the nodes between the "Load video" node and that node.

    Eliz99
    May 8, 2026
    CivitAI

    Hello! Maybe my comment isn't really related to this WF, but I need someone to help me find a WF for Video to Video, please! I'd really appreciate it if someone could help me! 😊✨

    Eliz99
    May 8, 2026· 1 reaction

    @FlowSpecial Wow! I'll give it a try as soon as I can. Thank you so much! 😍✨

    franklynsotelo72838
    May 8, 2026
    CivitAI

    Kind of a noob question, but for the life of me, I can't find the config file mentioned.

    LatentHeart
    Author
    May 8, 2026· 1 reaction

    The workflow is a JSON file. CivitAI automatically detects that type of file as a "Configuration file", but that doesn't matter: you download the JSON file and drag and drop it into ComfyUI. Now, I'm not trying to be mean here, OK? But if you are just starting out with ComfyUI, this workflow may be too advanced for you. For starters, you will need to download the model files, which requires knowing what's best for your specific setup. You will also need to clone the Prompt Relay repository from GitHub, and possibly troubleshoot things here and there if you install some custom nodes.
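
    Before dragging the file into ComfyUI, it can help to see which node types it uses, since unfamiliar ones usually point to custom nodes that need installing. The sketch below is a minimal example that assumes the file is a UI-format ComfyUI workflow with a top-level "nodes" list whose entries carry a "type" field; the function name and the sample node types are placeholders.

```python
import json
from collections import Counter


def summarize_workflow(workflow: dict) -> Counter:
    """Count node types in a workflow dict (assumes a top-level "nodes" list)."""
    return Counter(node.get("type", "<unknown>") for node in workflow.get("nodes", []))


# Tiny stand-in for a real file. To inspect the downloaded workflow instead:
#   with open("ltx23AllInOnePromptRelayIDLora_v20.json", encoding="utf-8") as f:
#       workflow = json.load(f)
sample = json.loads(
    '{"nodes": [{"id": 1, "type": "LoadImage"},'
    ' {"id": 2, "type": "LoadImage"},'
    ' {"id": 3, "type": "SaveVideo"}]}'
)

for node_type, count in summarize_workflow(sample).most_common():
    print(node_type, count)
```

    Node types that your ComfyUI installation does not recognize are the ones most likely to need a custom node pack or a cloned repository.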

    franklynsotelo72838
    May 8, 2026· 1 reaction

    @LatentHeart I totally understand now. Thank you for the well thought out explanation. I appreciate it!

    manusgamo2012943
    May 8, 2026
    CivitAI

    Please help, frame relay alone doesn't work.

    LatentHeart
    Author
    May 8, 2026

    You mean Prompt Relay? Did you clone the GitHub repo?

    rabbitything
    May 8, 2026· 1 reaction
    CivitAI

    I just started using this workflow instead of my incredibly janky hodgepodge of an LTX 2.3 workflow.
    (Mind you, it works fine, just uh... spaghetti, lmao.)

    Man, I never thought about using Mel-Band Roformer to split the audio from the music and then simply combining the original audio back into the video... I was manually adding the audio back to the already completed video via a dedicated workflow afterwards XD
    I've had Mel-Band for quite a while but never used it much aside from Suno AI.

    LatentHeart
    Author
    May 8, 2026

    hehe You can also combine the split voice audio with sony whoosh, for higher fx audio quality ;)

    FluxNoob
    May 8, 2026· 1 reaction
    CivitAI

    What is supposed to go in the REFERENCE IMAGE SIZE node? It shows 1920 by default.

    Thank you for the WF!

    LatentHeart
    Author
    May 8, 2026· 2 reactions

    You can leave it as is. It is the resolution applied to all the keyframes, including the reference image, and it serves as a "safe limit": if you load huge images, they get resized automatically (by the longest edge). You can lower it to save some VRAM if you want; 1280 (720p) should yield good results too. The lower you go, the less detail the model has to work with.
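
    The longest-edge limit described in this reply can be sketched as follows. This is a minimal example, assuming the limit only shrinks oversized images and keeps the aspect ratio; the function name is hypothetical and the actual node may round or behave differently.

```python
def fit_longest_edge(width: int, height: int, limit: int = 1920) -> tuple[int, int]:
    """Scale (width, height) down so the longest edge is at most `limit`.

    Assumption for this sketch: images already within the limit are left unchanged.
    """
    longest = max(width, height)
    if longest <= limit:
        return width, height
    # Scale both edges by limit / longest, keeping the aspect ratio.
    new_w = max(1, round(width * limit / longest))
    new_h = max(1, round(height * limit / longest))
    return new_w, new_h


print(fit_longest_edge(4000, 3000))              # (1920, 1440)
print(fit_longest_edge(1280, 720))               # (1280, 720)
print(fit_longest_edge(3840, 2160, limit=1280))  # (1280, 720)
```

    Lowering the limit from 1920 to 1280 cuts the pixel count of a limit-sized image by a bit more than half, which is the VRAM trade-off mentioned above.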

    FluxNoob
    May 8, 2026· 1 reaction

    @LatentHeart Awesome. Thank you for taking the time to explain!

    Workflows
    LTXV 2.3

    Details

    Downloads: 2,065
    Platform: CivitAI
    Platform Status: Available
    Created: 5/6/2026
    Updated: 5/14/2026
    Deleted: -

    Files

    ltx23AllInOnePromptRelayIDLora_v20.json