CivArchive
    LTX-2.3 MSR Multi-Image Reference Three-Stage Enhanced Video Workflow - v1.0
    NSFW

    Watch the full video first if you want to understand how this LTX-2.3 MSR multi-image reference video generation workflow works in practice. The video shows how multiple reference images can be used to guide a video generation process, while the workflow improves identity consistency, visual stability, and final rendering quality through a three-stage enhancement pipeline.

    This ComfyUI workflow is designed for LTX-2.3 MSR multi-image reference video generation with three-stage rendering enhancement. Its main purpose is to create a longer and more stable reference-guided video by combining image identity guidance, IC LoRA consistency control, staged sampling, latent upscaling, tiled VAE decoding, and final video/audio reconstruction.

    The workflow is built around the LTX-2.3 video generation route and the MSR identity consistency module. The key identity module is loaded through LTXICLoRALoaderModelOnly using LTX-2.3-Licon-MSR-V1.safetensors. This module helps the model maintain the visual identity of the reference subject across the generated sequence. The workflow also uses LTXAddVideoICLoRAGuide in multiple stages, which injects the reference image guidance into the video latent process.
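    To check which of these nodes a downloaded copy of the workflow actually contains, the JSON can be inspected directly. The sketch below is a minimal example, assuming the file follows one of the two common ComfyUI export layouts: the UI export with a top-level "nodes" list whose entries carry a "type" field, or the API export that maps node ids to objects with a "class_type" field. The helper names and the small sample dictionary are illustrative stand-ins, not content taken from the real file.

```python
import json
from collections import Counter


def load_workflow(path: str) -> dict:
    """Read a workflow JSON file from disk (path is supplied by the caller)."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def count_node_types(workflow: dict) -> Counter:
    """Count node class names in a ComfyUI workflow dict (UI or API export)."""
    if isinstance(workflow.get("nodes"), list):
        # UI export layout: {"nodes": [{"type": "..."}, ...]}
        names = [node.get("type") for node in workflow["nodes"]]
    else:
        # API export layout: {"<node id>": {"class_type": "..."}, ...}
        names = [
            node.get("class_type")
            for node in workflow.values()
            if isinstance(node, dict)
        ]
    return Counter(name for name in names if name)


# Illustrative stand-in for the real file; node names come from the description.
sample = {
    "nodes": [
        {"id": 1, "type": "LTXICLoRALoaderModelOnly"},
        {"id": 2, "type": "LTXAddVideoICLoRAGuide"},
        {"id": 3, "type": "LTXAddVideoICLoRAGuide"},
    ]
}
counts = count_node_types(sample)
for name, n in counts.most_common():
    print(f"{n:3d}  {name}")
```

    With the real file, `count_node_types(load_workflow("ltx23MSRMultiImage_v10.json"))` should show how many times each node appears, assuming the file is in the working directory.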

    The strongest part of this workflow is its three-stage rendering structure. Stage 1 creates the base video latent. It uses an EmptyLTXVLatentVideo canvas, CFGGuider, RandomNoise, KSamplerSelect, ManualSigmas, and SamplerCustomAdvanced. This first stage establishes the main motion, composition, reference relationship, and temporal direction of the video.
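    Because each stage has its own manually entered sigma schedule, it can help to sanity-check a schedule before pasting it into the node. The sketch below is a standalone helper, not part of ComfyUI, and the sigma values are made-up placeholders rather than the values used in this workflow. It assumes the usual convention that a list of N+1 sigmas corresponds to N sampling steps and that the schedule should never increase.

```python
def parse_sigmas(text: str) -> list[float]:
    """Parse a comma- or space-separated sigma list and check it never increases."""
    values = [float(part) for part in text.replace(",", " ").split()]
    if len(values) < 2:
        raise ValueError("a schedule needs at least two sigma values")
    if any(later > earlier for earlier, later in zip(values, values[1:])):
        raise ValueError("sigmas must not increase along the schedule")
    return values


# Hypothetical schedule, for illustration only.
stage1_sigmas = parse_sigmas("1.0, 0.9, 0.7, 0.4, 0.0")
steps = len(stage1_sigmas) - 1  # N+1 sigmas -> N denoising steps
print(f"{steps} steps, from sigma {stage1_sigmas[0]} down to {stage1_sigmas[-1]}")
```

    A refinement stage would typically start from a lower first sigma than the base stage, so that it reworks detail without discarding the motion and composition already established.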

    After Stage 1, the workflow separates the generated audio-video latent through LTXVSeparateAVLatent. The video latent and audio latent are handled separately, allowing the video side to be refined while the audio latent is preserved for later recombination. LTXVCropGuides is used to crop or normalize the guide area before the next refinement step.

    Stage 2 uses the LTX-2.3 spatial upscaler model, ltx-2.3-spatial-upscaler-x2-1.1.safetensors, through LatentUpscaleModelLoader and LTXVLatentUpsampler. This increases the latent resolution and gives the second sampling stage more room to rebuild detail, sharpen structure, and improve frame quality. The Stage 2 sampler uses a different sigma schedule and euler_cfg_pp sampling to refine the upscaled latent.

    Stage 3 applies the same enhancement approach a second time. The workflow again separates and recombines the audio and video latents, applies another latent upsample pass, and then runs a final refinement sampler with its own sigma schedule. This third stage is designed to polish the final output, improve high-resolution detail, and reduce the roughness that often appears in single-pass video generation.
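    The effect of two successive x2 upsamples on output size can be worked out with simple arithmetic. The sketch below assumes each upsample doubles both width and height, as the "x2" in the upscaler name suggests. The 480x272 base size is a hypothetical example, not the workflow's default.

```python
def stage_resolutions(base_w: int, base_h: int, upscales: int = 2, factor: int = 2):
    """Return the (width, height) reached at each stage, starting from the base size."""
    sizes = [(base_w, base_h)]
    for _ in range(upscales):
        w, h = sizes[-1]
        sizes.append((w * factor, h * factor))
    return sizes


# Hypothetical base resolution, for illustration only.
sizes = stage_resolutions(480, 272)
for stage, (w, h) in enumerate(sizes, start=1):
    print(f"Stage {stage}: {w}x{h} ({w * h:,} pixels per frame)")

# Pixel-count growth from the base stage to the final stage.
growth = (sizes[-1][0] * sizes[-1][1]) // (sizes[0][0] * sizes[0][1])
```

    Two x2 passes multiply the per-frame pixel count by 16, which is why a small change to the Stage 1 resolution has a large effect on the memory and time needed for the later stages.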

    The final section uses VAEDecodeTiled for memory-friendly video decoding. This is useful because the final latent is larger after multiple enhancement passes. LTXVAudioVAEDecode restores the audio side from the audio latent. CreateVideo then combines the decoded frames and audio into a final 24fps video.
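    The idea behind tiled decoding can be illustrated with a generic overlapping-tile layout. The sketch below is not how VAEDecodeTiled is implemented. It only shows how a large frame can be covered by fixed-size tiles so that each piece fits in memory. The tile size, overlap, frame size, and frame count are hypothetical values.

```python
def tile_starts(length: int, tile: int, overlap: int) -> list[int]:
    """Start offsets of overlapping tiles that together cover [0, length)."""
    if tile <= overlap:
        raise ValueError("tile size must be larger than the overlap")
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile is aligned to the far edge
    return starts


def duration_seconds(frame_count: int, fps: float = 24.0) -> float:
    """Playback length of a clip at the given frame rate."""
    return frame_count / fps


# Hypothetical final frame size and tile settings, for illustration only.
xs = tile_starts(1920, tile=512, overlap=64)
ys = tile_starts(1088, tile=512, overlap=64)
print(f"{len(xs)} x {len(ys)} = {len(xs) * len(ys)} tiles per frame")
print(f"121 frames at 24 fps is about {duration_seconds(121):.2f} s")
```

    Smaller tiles lower peak memory at the cost of more decode calls, and the overlap gives the decoder shared context along tile borders, which helps hide seams.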

    Compared with a simple LTX image-to-video workflow, this version is more suitable for reference-heavy production. It is useful when you want stronger identity consistency, better detail, cleaner final rendering, and a more controlled multi-image reference video generation process.

    Main features:

    • LTX-2.3 MSR multi-image reference workflow

    • Three-stage rendering enhancement pipeline

    • MSR IC LoRA identity consistency control

    • LTX-2.3-Licon-MSR-V1.safetensors support

    • Multi-stage LTXAddVideoICLoRAGuide injection

    • Stage 1 base video latent generation

    • Stage 2 latent x2 upsample refinement

    • Stage 3 second latent x2 enhancement

    • LTX spatial upscaler x2 1.1 support

    • ManualSigmas control for each stage

    • euler and euler_cfg_pp sampler routes

    • Audio-video latent separation and recombination

    • LTXVSeparateAVLatent processing

    • LTXVConcatAVLatent reconstruction

    • LTXVCropGuides guide-area handling

    • VAEDecodeTiled final decoding

    • LTXVAudioVAEDecode audio restoration

    • CreateVideo 24fps final output

    Suggested workflow:

    Prepare several clear reference images first. Good references clearly show the target subject's identity, outfit, style, and overall visual direction. Avoid references with conflicting identities or dramatically different costumes unless you intentionally want a mixed result.

    Start with the default settings and check the Stage 1 output first. If the base motion or composition is wrong, adjust the prompt and references before relying on the upscaling stages. Use the second and third stages once the base video direction is already correct and you want more detail, cleaner structure, and better final rendering.

    If the identity drifts, simplify the reference set and strengthen the MSR identity guidance. If the final result becomes too sharp or unstable, tone down aggressive prompt wording and focus on subject consistency, smooth motion, and coherent lighting.

    ⚙️ RunningHub Workflow

    Try the workflow online right now — no installation required.
    👉 Workflow: https://www.runninghub.ai/post/2065743626793209857?inviteCode=rh-v1111

    If the results meet your expectations, you can later deploy it locally for customization.

    🎁 Fan Benefits: Register to get 1,000 points, plus 100 points for each daily login, and enjoy 4090 performance with 48 GB of super power!

    📺 Bilibili Updates (Mainland China & Asia-Pacific)

    If you’re in the Asia-Pacific region, you can watch the video below to see the workflow demonstration and creative breakdown.
    📺 Bilibili Video: https://www.bilibili.com/video/BV1xsJw6YEg6/

    ☕ Support Me on Ko-fi

    If you find my content helpful and want to support future creations, you can buy me a coffee ☕.
    Every bit of support helps me keep creating — just like a spark that can ignite a blazing flame.
    👉 Ko-fi: https://ko-fi.com/aiksk

    💼 Business Contact

    For collaboration or inquiries, please contact aiksk95 on WeChat.

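    I keep model resources updated on Quark Netdisk (夸克网盘):
    👉 https://pan.quark.cn/s/20c6f6f8d87b
    These resources are mainly intended for local users, to make creation and learning easier.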

    Workflows
    LTXV 2.3

    Details

    Downloads: 224
    Platform: CivitAI
    Platform Status: Available
    Created: 6/15/2026
    Updated: 6/27/2026
    Deleted: -

    Files

    ltx23MSRMultiImage_v10.json

    Mirrors

    CivitAI (1 mirrors)