Pose, Image, Audio to Video.
v1: uses the distilled model; faster inference, but can result in plastic-looking skin.
v2: uses the dev model; longer inference, but results in more natural-looking skin.
Use the --reserve-vram 1 launch option if you run into OOM (out-of-memory) issues.
Tested with 16 GB VRAM and 64 GB system RAM at 1600 x 900 resolution, 121 frames.
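For anyone unsure where the launch option goes, here is a minimal sketch of building the start command with the flag appended. It assumes the workflow runs in ComfyUI started from its main.py; the helper name build_launch_command and the entry-script path are illustrative, not part of the workflow.

```python
import sys


def build_launch_command(reserve_gb=1, entry_script="main.py"):
    """Return the argument list for starting the UI with VRAM held back.

    Assumption: entry_script is ComfyUI's main.py in the current directory.
    reserve_gb is the amount of VRAM (in GB) to leave free for the OS and
    other software; the note above suggests 1.
    """
    if reserve_gb < 0:
        raise ValueError("reserve_gb must be non-negative")
    # ":g" drops a trailing ".0", so 1 -> "1" and 1.5 -> "1.5"
    return [sys.executable, entry_script, "--reserve-vram", f"{reserve_gb:g}"]
```

Passing the returned list to subprocess.run(...) from the install folder would start the server with the flag set; adding the flag to whatever launcher script you already use has the same effect.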
Description
Updated the workflow to use the dev model, which removes the plastic-looking face.
Comments (9)
When you say pose.. does that mean it can't infer the pose from the image?
The image is just the first frame. You provide an input video, which is converted to a sequence of pose images that control the generation.
Do you have an example of the image Luna - real.png and the Driving Video.mp4?
Does it matter what the load image is?
You should be able to use your own driving video and start frame.
@PixelMuseAI I guess the question was: can the driving video look very different from the image?
Example: the image is the character I want to animate, and the driving video is me acting the way I want.
Thx for the workflow by the way ^^
I did the test: a 3D character driven by a video of myself. It works!
By the way: thank you for not stacking all the nodes on top of each other x) It makes it easier to understand your work and learn :D
Have a great day
"I add the image and the audio, but when I generate the video there is no lip sync. The video is generated and the voice plays in the background, but the character is not speaking."
As suggested by other users, ensure that your audio is in stereo format.
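If your audio is a mono WAV, one way to get a stereo file is to copy the single channel into both left and right. The sketch below does that with Python's standard wave module only; it assumes an uncompressed PCM WAV input, and the function name mono_to_stereo is just illustrative. Other formats (mp3, etc.) would need converting to WAV first with a tool of your choice.

```python
import wave


def mono_to_stereo(src_path, dst_path):
    """Write a 2-channel copy of a PCM WAV file and return dst_path.

    A mono input has its channel duplicated into left and right.
    A file that is already stereo is copied through unchanged.
    """
    with wave.open(src_path, "rb") as src:
        channels = src.getnchannels()
        width = src.getsampwidth()      # bytes per sample
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    if channels == 2:
        stereo = frames
    elif channels == 1:
        # Emit each sample twice: once for left, once for right.
        stereo = b"".join(
            frames[i:i + width] * 2 for i in range(0, len(frames), width)
        )
    else:
        raise ValueError(f"expected 1 or 2 channels, got {channels}")

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(2)
        dst.setsampwidth(width)
        dst.setframerate(rate)
        dst.writeframes(stereo)
    return dst_path
```

For example, mono_to_stereo("voice.wav", "voice_stereo.wav") (file names hypothetical), then load the stereo file in the workflow's audio input.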
Super workflow for transforming your input image into a video with sound/motion/acting from a video (like yourself acting).