This lets you bring an image, a couple of lines of dialog, and a short voice sample, and get speaking-character videos out of Wan 2.1. Thanks to Kijai for adding MultiTalk capability to the WanWrapper series of nodes.
Helpful links:
Github of the chatterbox node implementation I'm using:
https://github.com/diodiogod/ComfyUI_ChatterBox_SRT_Voice
Lots of demo voices you can download here:
https://resemble-ai.github.io/chatterbox_demopage/
Kijai's nodes:
Comments (3)
Spent 6 hours working on it, asking AI to help. There has to be a simpler analog method without all the gadgets like Ollama and Chatterbox, etc. Those things are giving me hell and refusing to work with me. I had to turn almost everything off and run it bare-bones, because nothing seemed to work for me. Posted a sample below, still tweaking it; likely the lip syncing is off because I used a full-body picture instead of a head shot. Also, I heard that background music tends to throw the lip syncing off and that you should run your music through a program to pull only the voice out and paste the instrumentals/music back in afterwards, but I'm lazy.
All in all this was a pain in the ass workflow to use. Nothing was set to "plug and play", I had to manually tweak all the settings myself for hours.
Don't get butt hurt, I'm not saying it's your fault, it's just "USER ERROR" and I'm frustrated that I'm not better at this stuff. I've only been using ComfyUI for about 3 months, and very little. I've been on ForgeUI and A1111 forever because I refused "change".
Hah, sure. Check out Kijai's example workflow. It's not going to be less complicated, but it does have nodes to extract the voice from the background music that you might find useful.
What about InfiniteTalk integration with Chatterbox?
