    Wan2.1 InfiniteTalk LipSync (Native) - Image Switch Test Version

    Wan2.1 InfiniteTalk LipSync — Workflow Guide

    52 nodes · 4 groups · 3 component subgraphs · 1 pipeline loop · 27 unique node types · 73% Eclipse nodes · Built with ComfyUI_Eclipse custom nodes


    What Is This?

    This workflow is a template designed to generate lip-synced talking head videos of arbitrary length using the Wan2.1 InfiniteTalk / LipSync model in ComfyUI.

    The core features are audio budgeting and seamless looping. The workflow measures the duration of the speech audio track, computes how many generation loops are needed to cover it, generates the video blocks one after another (each conditioned on the closing frames of the previous block), and joins them into one continuous video. A manual override switch can instead cap generation at a fixed loop count.
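    The audio budgeting step can be sketched as simple arithmetic. This is a minimal illustration, not the Eclipse loop calculator itself: the function name and the default values for fps, frames_per_block, and overlap_frames are assumptions chosen for the example and are not read from the workflow.

```python
import math


def loops_for_audio(audio_seconds: float, fps: float = 25.0,
                    frames_per_block: int = 81, overlap_frames: int = 9) -> int:
    """Estimate how many generation loops are needed to cover an audio track.

    All default values are illustrative assumptions, not workflow settings.
    """
    if audio_seconds <= 0:
        return 0
    if not 0 <= overlap_frames < frames_per_block:
        raise ValueError("overlap_frames must be in [0, frames_per_block)")

    total_frames = math.ceil(audio_seconds * fps)
    if total_frames <= frames_per_block:
        return 1  # the base block alone covers the audio

    # Every extend loop re-uses overlap_frames as context, so it only
    # contributes (frames_per_block - overlap_frames) new frames.
    new_frames_per_extend = frames_per_block - overlap_frames
    remaining = total_frames - frames_per_block
    return 1 + math.ceil(remaining / new_frames_per_extend)
```

    Under these assumed values, a 10-second clip needs 250 frames: the base block supplies 81, and three extend loops of 72 new frames each cover the remaining 169, for 4 loops in total.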


    How It Works — The Basics

    Wireless Data Routing (Set/Get)

    Rather than messy spaghetti wires running across the canvas, the workflow uses Set/Get nodes to route model, latent, audio, and loop count values. This keeps the layout clean and modular:

    • A setter publishes a value (e.g. Set_loop_count_ls publishing loop_count_ls).

    • Getters retrieve the value by name wherever it is needed in the samplers and loop groups.
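    Conceptually, Set/Get routing behaves like a shared name-to-value registry. The class below is a hypothetical sketch of that idea only; WirelessBus is an invented name and does not reflect how the Eclipse nodes are implemented.

```python
class WirelessBus:
    """Hypothetical stand-in for Set/Get routing: values are published by name."""

    def __init__(self):
        self._values = {}

    def set(self, name, value):
        # A setter publishes a value under a name and passes it through.
        self._values[name] = value
        return value

    def get(self, name):
        # A getter fails loudly if no setter has published that name yet.
        if name not in self._values:
            raise KeyError(f"no setter has published {name!r}")
        return self._values[name]


bus = WirelessBus()
bus.set("loop_count_ls", 4)          # e.g. Set_loop_count_ls
loop_count = bus.get("loop_count_ls")  # retrieved wherever it is needed
```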

    Dual-Switch Loop Control

    The number of loop iterations is selected by the Any Dual-Switch [Eclipse] (id: 35) node, which switches between:

    1. Choice 1 (Manual): Uses a static loop_count setting configured inside the Settings group panel.

    2. Choice 2 (Auto-Calculated): Uses the dynamic loop count computed from the audio track's duration. By default, the workflow is configured to use the auto-calculated loops to automatically match the audio's length.
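    The switch logic amounts to picking one of two inputs. A minimal sketch, assuming the widget values 1 and 2 described above; the function and parameter names are hypothetical.

```python
def dual_switch(choice: int, manual_loops: int, auto_loops: int) -> int:
    """Return the loop count for the selected choice (1 = manual, 2 = auto)."""
    if choice == 1:
        return manual_loops
    if choice == 2:
        return auto_loops
    raise ValueError("choice must be 1 (manual) or 2 (auto-calculated)")
```

    For example, dual_switch(2, manual_loops=3, auto_loops=7) returns the auto-calculated value 7, while dual_switch(1, manual_loops=3, auto_loops=7) caps generation at 3 loops.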

    The Recursive Generation Loop

    Generation runs inside a loop bounded by the easy forLoopStart (id: 64) and easy forLoopEnd (id: 97) nodes:

    • Iteration 0 (Base Sampler): The first block runs the Base Sampler group. It takes the initial start image (face) and the first segment of encoded audio to generate the beginning of the video.

    • Iteration 1+ (Extend Sampler): For subsequent loops, the Extend Sampler group runs. It takes the ending frames of the previous loop (previous_frames) as context so the new block starts where the last one ended, which keeps the video visually continuous, and it conditions on the next segment of speech audio. It outputs only the new frames (trim_image), with the overlapping context frames removed.
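    The control flow of the two iterations can be simulated with plain lists, using integers as stand-ins for frames. This is a toy sketch under assumed block and overlap sizes (81 and 9 frames); the function names are hypothetical and no actual sampling happens.

```python
def base_sampler(block_frames):
    """Iteration 0: produce the first block (frame ids 0..block_frames-1)."""
    return list(range(block_frames))


def extend_sampler(previous_frames, block_frames, next_id):
    """Iteration 1+: build a block that starts with the context frames,
    then return only the new frames (the trim_image output)."""
    overlap = len(previous_frames)
    new_frames = list(range(next_id, next_id + block_frames - overlap))
    block = previous_frames + new_frames
    return block[overlap:]  # cut off the overlapping context


def run_loop(loop_count, block_frames=81, overlap=9):
    """Accumulate frames across loops; sizes are illustrative assumptions."""
    if not 0 <= overlap < block_frames:
        raise ValueError("overlap must be in [0, block_frames)")
    video = []
    for i in range(loop_count):
        if i == 0:
            video = base_sampler(block_frames)
        else:
            context = video[len(video) - overlap:]  # previous_frames
            video += extend_sampler(context, block_frames, next_id=len(video))
    return video
```

    With these assumed sizes, each extend loop adds 72 new frames, so four loops yield 81 + 3 × 72 = 297 frames with no duplicated frame ids.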


    Group-by-Group Reference

    1. Settings (Group Node)

    This is the central configuration panel. It exposes:

    • Video Size & Resolution: Sets output width and height (typically 480p or 720p).

    • Frame Rate: Target output framerate (e.g. 24.0 or 30.0).

    • Manual Loop Override: A loop_count input to limit the loops when manual override is selected in the Dual-Switch.

    2. Model Loaders

    • Smart Model Loader v2 [Eclipse]: Loads the main Wan2.1 checkpoint, Text Encoder, and VAE. The default template loads Wan2_1-I2V-14B-720p_fp8_e4m3fn_scaled_KJ.safetensors as the checkpoint.

    • Audio Encoder Loader & Encode: Loads the speech analysis model wav2vec2-chinese-base_fp16.safetensors and encodes loaded speech audio into phonemic feature representations.

    • Model Patch Loader: Applies the wan2.1_infiniteTalk_multi_fp16.safetensors patch to the diffusion model, adapting it for infinite talking generation.

    3. Base Sampler (Group Node)

    A component subgraph containing 19 internal nodes:

    • Uses WanInfiniteTalkToVideo to condition the initial start image and the first segment of the audio encoder output.

    • Utilizes a custom advanced sampler to generate the first talking head video block.

    4. Extend Sampler (Group Node)

    A component subgraph containing 19 internal nodes:

    • Inherits the main model, conditioning, and audio encoder outputs.

    • Takes previous_frames (from the accumulated loop history) to guide the start of the next segment.

    • Generates and outputs trim_image (the newly generated frames with overlap cut off).

    5. Loop Control & Save Video

    • Image Join & Loop Feedback: An ImageBatch node appends the new frames from Extend Sampler to the accumulated video batch (value1), which is fed back into the loop for the next iteration.

    • Save Video [Eclipse]: Takes the final accumulated image batch, muxes it with the original audio file, and outputs an MP4. The trim_mode is set to shortest, which cuts the longer of the two streams so that audio and video end at the same time.
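    The effect of a "shortest" trim can be worked through numerically. The helper below is a hypothetical sketch of the arithmetic only and is not the Save Video [Eclipse] implementation; the frame counts and fps in the example are assumed values.

```python
import math


def trim_to_shortest(frame_count: int, fps: float, audio_seconds: float):
    """Return (frames_to_keep, output_seconds) when both streams are cut
    to the shorter one."""
    video_seconds = frame_count / fps
    if video_seconds <= audio_seconds:
        # Video is the shorter stream: keep every frame, cut the audio.
        return frame_count, video_seconds
    # Audio is the shorter stream: drop the frames that run past its end.
    return math.floor(audio_seconds * fps), audio_seconds
```

    For instance, 297 generated frames at an assumed 25 fps last 11.88 s. Against a 10 s audio track, the last 47 frames are dropped and 250 frames are kept.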


    Quick Start Guide

    Automatic Audio-budgeted Generation

    1. Verify that Any Dual-Switch [Eclipse] is set to 2 (Auto-calculated loops).

    2. Load a face image in the Load Image node.

    3. Load a voice clip in the Load Audio node.

    4. Queue the prompt. The workflow computes the required number of loops, generates the segments, and outputs a talking head video whose length matches the audio.

    Manual Loop Count Generation

    1. Locate the Any Dual-Switch [Eclipse] (id: 35) and set its widget value to 1.

    2. Set your desired loop count in the Settings panel (under loop_count).

    3. Queue the prompt. The generation will stop at your configured loop limit, regardless of how long the audio track is.

    Model Storage Locations (for Local Users)

    Ensure your model files are placed in these folders under your ComfyUI directory:

    📂 ComfyUI/
    ├── 📂 models/
    │   ├── 📂 diffusion_models/
    │   │   └─── wan/Wan2_1-I2V-14B-720p_fp8_e4m3fn_scaled_KJ.safetensors
    │   ├── 📂 text_encoders/
    │   │   └─── nsfw_wan_umt5-xxl_bf16_fixed.safetensors
    │   ├── 📂 model_patches/
    │   │   ├─── wan2.1_infiniteTalk_single_fp16.safetensors
    │   │   └─── wan2.1_infiniteTalk_multi_fp16.safetensors
    │   ├── 📂 audio_encoders/
    │   │   └─── wav2vec2-chinese-base_fp16.safetensors
    │   └── 📂 vae/
    │       └─── Wan2_1_VAE_bf16.safetensors
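    A short script can confirm that the files listed above are in place before the workflow is loaded. This is a convenience sketch: the relative paths are copied from the tree above, while the function name and the idea of passing the ComfyUI root as an argument are assumptions.

```python
from pathlib import Path

# Relative paths under ComfyUI/models/, taken from the folder tree above.
EXPECTED_MODELS = [
    "diffusion_models/wan/Wan2_1-I2V-14B-720p_fp8_e4m3fn_scaled_KJ.safetensors",
    "text_encoders/nsfw_wan_umt5-xxl_bf16_fixed.safetensors",
    "model_patches/wan2.1_infiniteTalk_single_fp16.safetensors",
    "model_patches/wan2.1_infiniteTalk_multi_fp16.safetensors",
    "audio_encoders/wav2vec2-chinese-base_fp16.safetensors",
    "vae/Wan2_1_VAE_bf16.safetensors",
]


def missing_models(comfyui_root):
    """Return the expected model files that are not present on disk."""
    models_dir = Path(comfyui_root) / "models"
    return [rel for rel in EXPECTED_MODELS if not (models_dir / rel).is_file()]
```

    Calling missing_models("ComfyUI") from the parent directory returns an empty list when every file is present, and otherwise lists the relative paths that still need to be downloaded.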
    

    Custom Node Packages Used

    • ComfyUI_Eclipse — Custom loader templates, Set/Get wireless routing, Loop Calculators, and the Save/Preview Video nodes.

    • ComfyUI-Easy-Use — The easy forLoopStart and easy forLoopEnd nodes for graph-level iteration.

    • ComfyUI-KJNodes — General utilities and crop helpers.

    Description

    • Needs the latest version of ComfyUI_Eclipse.

    • Test version with an image switch at a configured timestamp or loop.

      • 3 manual targets means you have to load 4 images (image 1 has no target); otherwise reduce the number of manual targets.

    • The second file is the audio track. Download it if you want to replicate the current setup (it's my own, so I'm allowed to upload it ;)).
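    The relationship between targets and images (N targets need N + 1 images) can be illustrated with a small lookup. This is a guess at the selection rule, assuming each target is a loop index at which the workflow switches to the next image; the function name and the example targets are hypothetical.

```python
from bisect import bisect_right


def image_for_loop(loop_index, switch_targets):
    """Return the 0-based index of the image active at loop_index.

    Image 0 is used until the first target is reached; each target
    reached so far advances the selection by one image.
    """
    return bisect_right(sorted(switch_targets), loop_index)


targets = [2, 5, 8]  # three assumed manual targets
images_needed = len(targets) + 1  # four images, the first has no target
```

    With these example targets, loops 0 and 1 use the first image, loops 2 to 4 the second, loops 5 to 7 the third, and loop 8 onward the fourth.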

    Workflows
    Wan Video 14B i2v 720p

    Details

    Downloads: 48
    Platform: CivitAI
    Platform Status: Available
    Created: 6/26/2026
    Updated: 6/28/2026
    Deleted: -

    Files

    wan21Infinitetalk_imageSwitchTest.json

    wan21Infinitetalk_imageSwitchTest.zip

    wan21Infinitetalk_imageSwitchTest.json
