This workflow uses the Wan Fantasy Talking Model for lip syncing.
NOTE:
The lip syncing sometimes drifts out of sync with this model. I hope there will be a fix from Alibaba, and I will update the workflow if one comes along.
In the meantime, please check MultiTalk, which has no synchronization issues and works a lot better at the moment. See link below.
The resulting lip sync looks very natural.
Input: an audio file with a voice and a photo of someone's face (a close-up works better).
The workflow creates a video by animating the photo and syncing it to the voice.
You may want to upscale the video with your favorite upscaler.
LIPSYNC using FantasyTalking model (Alibaba)
Wan video model
-----------------
https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/diffusion_models
FantasyTalking model
--------------------
https://huggingface.co/Kijai/WanVideo_comfy/tree/main
This workflow was tested with 24GB VRAM and 64GB RAM.
100 frames at 512x512 with 15 steps take about 9 to 10 minutes.
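
To put that timing in perspective, here is a rough back-of-the-envelope sketch of what it means per frame and per sampling step. The function name is made up for illustration, and the idea that time scales linearly with frame count is only an assumption, not something that was measured.

```python
def throughput(frames: int, steps: int, minutes: float) -> dict:
    """Convert a total render time into seconds per frame and per step."""
    seconds = minutes * 60
    return {
        "sec_per_frame": seconds / frames,
        "sec_per_step": seconds / steps,
    }

# Figures from the test run above: 100 frames, 15 steps, 9 to 10 minutes.
low = throughput(frames=100, steps=15, minutes=9)
high = throughput(frames=100, steps=15, minutes=10)

print(f"per frame: {low['sec_per_frame']:.1f} to {high['sec_per_frame']:.1f} s")
print(f"per step:  {low['sec_per_step']:.1f} to {high['sec_per_step']:.1f} s")

# ASSUMPTION: render time grows linearly with the number of frames.
# Real runs may behave differently (VRAM limits, offloading, resolution).
def estimate_minutes(frames: int, sec_per_frame: float) -> float:
    return frames * sec_per_frame / 60
```

On other hardware or at other resolutions the numbers will differ, so treat this only as a way to get a first guess for longer clips.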
Description
There was an issue with the video/audio syncing in version 1.0. Hopefully that is fixed now. I had to increase the fps, based on feedback from the author of the model.
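
Since the fps decides how long the video runs for a given number of frames, it can help to check how many frames are needed to cover the whole audio clip. Below is a small sketch of that calculation using only the Python standard library. The helper names and the fps value in the example are placeholders, not the values used in the workflow, and the duration helper only reads uncompressed WAV files.

```python
import math
import wave

def wav_duration_seconds(path: str) -> float:
    """Length of a WAV file in seconds (sample count / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()

def frames_for_audio(duration_s: float, fps: float) -> int:
    """Smallest number of video frames that covers the audio at this fps."""
    # round() guards against float noise such as 0.1 * 30 = 3.0000000000000004
    return math.ceil(round(duration_s * fps, 6))

# Example with placeholder values: a 4 second clip at 25 fps.
print(frames_for_audio(4.0, 25))
```

If the frame count in the workflow is lower than this number, the video will end before the voice does. If the fps in the video output does not match the fps the model expects, the mouth movement and the audio can drift apart.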