Wan2.1 InfiniteTalk LipSync — Workflow Guide

52 nodes · 4 groups · 3 component subgraphs · 1 pipeline loop · 27 unique node types · 73% Eclipse nodes

Built with ComfyUI_Eclipse custom nodes.

What Is This?

This workflow is a template designed to generate lip-synced talking head videos of arbitrary length using the Wan2.1 InfiniteTalk / LipSync model in ComfyUI.

The core features are automatic audio budgeting and seamless looping. The workflow measures the duration of the input speech track, computes how many generation loops are needed to cover it, generates the video block by block (each block conditioned on the ending frames of the previous one), and joins the blocks into one continuous video. A manual override switch can instead cap generation at a fixed loop count.
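The audio budgeting described above can be sketched as a small calculation. This is only an illustration, not the Eclipse loop calculator itself: the function name, the block length of 81 frames, and the overlap of 9 frames are assumptions chosen for the example and do not come from the workflow.

```python
import math

def loops_for_audio(audio_seconds: float, fps: float,
                    frames_per_block: int = 81,   # assumed block length (hypothetical)
                    overlap_frames: int = 9) -> int:  # assumed context overlap (hypothetical)
    """Estimate how many generation loops are needed to cover the audio."""
    if overlap_frames >= frames_per_block:
        raise ValueError("overlap_frames must be smaller than frames_per_block")
    total_frames = math.ceil(audio_seconds * fps)
    # The first (base) loop contributes a full block.
    if total_frames <= frames_per_block:
        return 1
    # Every later loop contributes only the frames that are not overlap.
    new_frames_per_extend = frames_per_block - overlap_frames
    remaining = total_frames - frames_per_block
    return 1 + math.ceil(remaining / new_frames_per_extend)

print(loops_for_audio(10.0, 24.0))
```

Under these assumed values, a 10-second clip at 24 fps needs 240 frames: one base block of 81 frames plus three extend blocks of 72 new frames each.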

How It Works — The Basics

Wireless Data Routing (Set/Get)

Rather than messy spaghetti wires running across the canvas, the workflow uses Set/Get nodes to route model, latent, audio, and loop count values. This keeps the layout clean and modular:

A setter publishes a value (e.g. Set_loop_count_ls publishing loop_count_ls).
Getters retrieve the value by name wherever it is needed in the samplers and loop groups.
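As a conceptual analogy only, the Set/Get pattern behaves like a named registry: one place publishes a value and any number of consumers look it up by name. The class below is hypothetical and does not reflect how the Eclipse nodes are implemented; only the name loop_count_ls is taken from the workflow.

```python
class WirelessBus:
    """Toy stand-in for Set/Get routing: values are published and read by name."""

    def __init__(self):
        self._values = {}

    def set(self, name, value):
        # A setter publishes a value under a name.
        self._values[name] = value
        return value

    def get(self, name):
        # A getter retrieves the value wherever it is needed.
        if name not in self._values:
            raise KeyError(f"no setter has published {name!r}")
        return self._values[name]

bus = WirelessBus()
bus.set("loop_count_ls", 4)
loops = bus.get("loop_count_ls")
print(loops)
```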

Dual-Switch Loop Control

The number of loop iterations is selected by the Any Dual-Switch [Eclipse] (id: 35) node, which switches between:

Choice 1 (Manual): Uses a static loop_count value configured in the Settings group panel.
Choice 2 (Auto-Calculated): Uses the loop count computed from the audio track's duration. This is the default, so the video length follows the audio without manual tuning.
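The selection logic amounts to a two-way switch. A minimal sketch, assuming a hypothetical function name and plain integer inputs (the real node accepts any type):

```python
def dual_switch(choice: int, manual_loop_count: int, auto_loop_count: int) -> int:
    """Return the loop count selected by the switch value (1 = manual, 2 = auto)."""
    if choice == 1:
        return manual_loop_count
    if choice == 2:
        return auto_loop_count
    raise ValueError("choice must be 1 (manual) or 2 (auto-calculated)")

# With the default setting (2), the audio-derived count wins.
print(dual_switch(2, manual_loop_count=3, auto_loop_count=7))
```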

The Recursive Generation Loop

The generation process runs between an easy forLoopStart (id: 64) node and an easy forLoopEnd (id: 97) node:

Iteration 0 (Base Sampler): The first block runs the Base Sampler group. It takes the initial start image (face) and the first segment of encoded audio to generate the beginning of the video.
Iteration 1+ (Extend Sampler): For subsequent loops, the Extend Sampler group runs. It takes the ending frames of the previous loop (previous_frames) as context to guide the model's starting state (ensuring visual continuity) and samples the next segment of speech audio. It outputs only the unique new frames (trim_image).
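The control flow of the two iteration types can be sketched with integers standing in for frames. Everything here is a simplified assumption: the function names, the block length of 81, and the overlap of 9 are hypothetical, and the real samplers run a diffusion model rather than counting.

```python
from typing import List

def base_sampler(block_len: int) -> List[int]:
    # Iteration 0: generate the first block from the start image and audio.
    return list(range(block_len))

def extend_sampler(previous_frames: List[int], block_len: int, overlap: int) -> List[int]:
    # Iteration 1+: the last `overlap` frames act as context for the new block.
    context = previous_frames[-overlap:]
    block = list(range(context[0], context[0] + block_len))
    # Drop the frames that repeat the context; only new frames are returned (trim_image).
    return block[overlap:]

def run_loop(loop_count: int, block_len: int = 81, overlap: int = 9) -> List[int]:
    if not 0 < overlap < block_len:
        raise ValueError("overlap must be between 1 and block_len - 1")
    video: List[int] = []
    for i in range(loop_count):
        if i == 0:
            video = base_sampler(block_len)
        else:
            video = video + extend_sampler(video, block_len, overlap)
    return video

frames = run_loop(3)
print(len(frames))
```

With these assumed values, three loops produce 81 + 72 + 72 frames, and the frame indices are continuous because each extend block discards its overlapping frames.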

Group-by-Group Reference

1. Settings (Group Node)

This is the central configuration panel. It exposes:

Video Size & Resolution: Sets output width and height (typically 480p or 720p).
Frame Rate: Target output framerate (e.g. 24.0 or 30.0).
Manual Loop Override: A loop_count input to limit the loops when manual override is selected in the Dual-Switch.

2. Model Loaders

Smart Model Loader v2 [Eclipse]: Loads the main Wan2.1 checkpoint, text encoder, and VAE. With the default template, the checkpoint is Wan2_1-I2V-14B-720p_fp8_e4m3fn_scaled_KJ.safetensors.
Audio Encoder Loader & Encode: Loads the speech analysis model wav2vec2-chinese-base_fp16.safetensors and encodes loaded speech audio into phonemic feature representations.
Model Patch Loader: Applies the wan2.1_infiniteTalk_multi_fp16.safetensors patch to the diffusion model, adapting it for infinite talking generation.

3. Base Sampler (Group Node)

A component subgraph containing 19 internal nodes:

Uses WanInfiniteTalkToVideo to condition the initial start image and the first segment of the audio encoder output.
Runs a custom advanced sampler to generate the first talking head video block.

4. Extend Sampler (Group Node)

A component subgraph containing 19 internal nodes:

Inherits the main model, conditioning, and audio encoder outputs.
Takes previous_frames (from the accumulated loop history) to guide the start of the next segment.
Generates and outputs trim_image (the newly generated frames with overlap cut off).

5. Loop Control & Save Video

Image Join & Loop Feedback: An ImageBatch node appends the newly generated frames from Extend Sampler to the accumulated video batch (value1), which is fed back into the loop for the next iteration.
Save Video [Eclipse]: Takes the final accumulated image batch, muxes in the original audio file, and outputs an MP4. The trim_mode is set to shortest, which cuts the longer of the two streams (audio or video) down to the length of the shorter one so that they end together.
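The effect of trimming to the shortest stream can be illustrated with a small calculation. This is a sketch of the idea only, not the Save Video node's code; the function name and the 16000 Hz sample rate in the usage line are assumptions.

```python
from fractions import Fraction

def trim_to_shortest(frame_count: int, fps, audio_samples: int, sample_rate: int):
    """Return (kept_frames, kept_samples) after cutting both streams to the shorter one."""
    fps_exact = Fraction(str(fps))  # exact arithmetic avoids float rounding at the boundary
    video_seconds = Fraction(frame_count) / fps_exact
    audio_seconds = Fraction(audio_samples, sample_rate)
    shortest = min(video_seconds, audio_seconds)
    kept_frames = int(shortest * fps_exact)    # whole frames that fit in the shortest duration
    kept_samples = int(shortest * sample_rate)  # whole samples that fit in the shortest duration
    return kept_frames, kept_samples

# 225 frames at 24 fps (9.375 s) against 10 s of audio at an assumed 16000 Hz.
print(trim_to_shortest(225, 24, 160000, 16000))
```

In the usage line the video is the shorter stream, so all 225 frames are kept and the audio is cut to 150000 samples.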

Quick Start Guide

Automatic Audio-budgeted Generation

Verify that Any Dual-Switch [Eclipse] is set to 2 (Auto-calculated loops).
Load a face image in the Load Image node.
Load a voice clip in the Load Audio node.
Queue the prompt. The workflow will compute the required number of loops, generate the segments, and output a talking head video whose length matches the audio.

Manual Loop Count Generation

Locate the Any Dual-Switch [Eclipse] (id: 35) and set its widget value to 1.
Set your desired loop count in the Settings panel (under loop_count).
Queue the prompt. The generation will stop at your configured loop limit, regardless of how long the audio track is.

Model Storage Locations (for Local Users)

Ensure your model files are placed in these folders under your ComfyUI directory:

📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └─── wan/Wan2_1-I2V-14B-720p_fp8_e4m3fn_scaled_KJ.safetensors
│   ├── 📂 text_encoders/
│   │   └─── nsfw_wan_umt5-xxl_bf16_fixed.safetensors
│   ├── 📂 model_patches/
│   │   ├─── wan2.1_infiniteTalk_single_fp16.safetensors
│   │   └─── wan2.1_infiniteTalk_multi_fp16.safetensors
│   ├── 📂 audio_encoders/
│   │   └─── wav2vec2-chinese-base_fp16.safetensors
│   └── 📂 vae/
│       └─── Wan2_1_VAE_bf16.safetensors
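Before queueing, it may help to confirm that the files listed above are in place. The following is a minimal sketch, assuming the relative paths shown in the tree; the function name and the example root path are hypothetical.

```python
from pathlib import Path

# Relative paths under ComfyUI/models/, copied from the tree above.
EXPECTED_MODELS = [
    "diffusion_models/wan/Wan2_1-I2V-14B-720p_fp8_e4m3fn_scaled_KJ.safetensors",
    "text_encoders/nsfw_wan_umt5-xxl_bf16_fixed.safetensors",
    "model_patches/wan2.1_infiniteTalk_single_fp16.safetensors",
    "model_patches/wan2.1_infiniteTalk_multi_fp16.safetensors",
    "audio_encoders/wav2vec2-chinese-base_fp16.safetensors",
    "vae/Wan2_1_VAE_bf16.safetensors",
]

def missing_models(comfyui_root: str) -> list:
    """Return the expected model files that are not present under comfyui_root/models."""
    models_dir = Path(comfyui_root) / "models"
    return [rel for rel in EXPECTED_MODELS if not (models_dir / rel).is_file()]

# Example root path is an assumption; point it at your own ComfyUI directory.
for rel in missing_models("ComfyUI"):
    print("missing:", rel)
```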

Custom Node Packages Used

ComfyUI_Eclipse — Custom loader templates, Set/Get wireless routing, Loop Calculators, and the Save/Preview Video nodes.
ComfyUI-Easy-Use — The easy forLoopStart and easy forLoopEnd nodes for graph-level iteration.
ComfyUI-KJNodes — General utilities and crop helpers.

Files

wan21Infinitetalk_imageSwitchTest.json
wan21Infinitetalk_imageSwitchTest.zip
