These workflows are only for Pose2Video, so check out my channel for the other VACE examples, and please consider subscribing if you like this kind of content. It helps me create more!
This is probably the most exciting model I've worked with since WAN came out! Check out the video for the model download links. You can also go to my Patreon for direct download links in .sh files (you can use ChatGPT to parse the downloads).
The examples are all VACE 14b, so if you want that level of quality, make sure you use the 14b model. The 1.3b model is still impressive, though! A full generation with 1.3b on my 5090 takes under 20 seconds.
Model Downloads: 100% Free & Public Patreon
Comments (16)
In the WAN video loader, I run into this error:
Trying to set a tensor of shape torch.Size([1, 6, 1536]) in "modulation" (which has shape torch.Size([1, 6, 5120])), this looks incorrect.
I am on a 5090. The model I have loaded is Wan 2.1 t2v 14b fp8
I've never used it before, but it looks like the sizes of your images are different. I could be speaking out of my ass, though; maybe 1536 pixels is the size of your image, or vice versa.
This issue is caused by using the 1.3b VACE module with the 14b WAN t2v model. You need to match 1.3b with 1.3b, or 14b with 14b.
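To see why the error names those exact numbers: 1536 and 5120 are (as far as I know) the hidden dimensions of the 1.3b and 14b Wan models, which matches the shapes in the error message. A minimal sketch of the compatibility check, with the dimension table assumed from the error text:

```python
# Assumed hidden sizes, taken from the shapes in the error message:
# 1536 for the 1.3b models, 5120 for the 14b models.
HIDDEN_DIM = {"1.3b": 1536, "14b": 5120}

def check_vace_compat(base_model: str, vace_module: str) -> None:
    """Raise the same kind of mismatch the loader reports when the VACE
    module's hidden size doesn't match the base model's."""
    base = HIDDEN_DIM[base_model]
    vace = HIDDEN_DIM[vace_module]
    if base != vace:
        raise ValueError(
            f"Trying to set a tensor of shape [1, 6, {vace}] in 'modulation' "
            f"(which has shape [1, 6, {base}]) - use the {base_model} VACE "
            f"module with the {base_model} base model."
        )

check_vace_compat("14b", "14b")      # OK: sizes match
# check_vace_compat("14b", "1.3b")   # raises the shape mismatch above
```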
@theartofficialtrainer When i have them matched up, I get this error:
RuntimeError: ptxas failed with error code 4294967295 ptxas stderr: ptxas fatal : Value 'sm_120' is not defined for option 'gpu-name'
@user26 That's an issue with your PyTorch version, I think. I had something similar.
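For context: sm_120 is the compute capability of Blackwell cards like the 5090, and the ptxas bundled with older PyTorch/CUDA builds doesn't recognize it, hence the fatal error. PyTorch does expose `torch.cuda.get_arch_list()` and `torch.cuda.get_device_capability()` for checking this; below is a pure-Python sketch of that check (the example arch list is illustrative, not from a real build):

```python
# Rough check: does the installed PyTorch build include support for your GPU's
# compute capability? sm_120 (Blackwell / RTX 5090) needs a recent build
# compiled against a CUDA toolkit that knows that architecture.
def build_supports_gpu(arch_list, capability):
    """arch_list: like torch.cuda.get_arch_list(), e.g. ['sm_80', ..., 'sm_120']
    capability: like torch.cuda.get_device_capability(), e.g. (12, 0) on a 5090."""
    wanted = f"sm_{capability[0]}{capability[1]}"
    return any(arch.endswith(wanted) for arch in arch_list)

# An older build (illustrative list, no sm_120) can't target a 5090,
# which is when ptxas fails with "'sm_120' is not defined":
old_build = ["sm_80", "sm_86", "sm_89", "sm_90"]
print(build_supports_gpu(old_build, (12, 0)))  # False -> upgrade PyTorch
```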
I noticed that in at least one other post you don't have a native workflow. I take it that native works for the other types presented, provided the correct mask style is used, without any specific setting in VACE's inference?
There is a native workflow for face! It's not in the .zip?
If you mean for the other ones I’ve posted, it’s because VACE native was released pretty recently. At the time I posted those, there was no native option
@theartofficialtrainer
I'm only seeing pose2vid in the zip included here.
Yes, the older posts (e.g. spline-to-vid); and I guessed as much, I just wanted to know if you've tested and observed that they're functional. I'm on the fence about whether to get into VACE at this point; full native support, 14b, and LoRAs, among other things, are a must for me. I was unable to discern whether that's the case from Comfy's issues.
Also, do you know if I2V-trained LoRAs work well with VACE?
@firemanbrakeneck I2V loras won’t work with it unfortunately ‘cause it’s based on the T2V model. But the 14b VACE is very good with faces. You can use T2V loras to help as well. All of your requirements are met with the current state of vace. To me it’s the best model out right now.
@firemanbrakeneck Oh I misunderstood your question, yes all of the other methods presented in the other workflows will work fine with Native!
@theartofficialtrainer Much obliged, sir. Indeed, I've felt that wan has a bit of a tendency to mess up on minute facial details outside of closeups, despite its impressive physical coherency, sounds intriguing. I shall test it when I have the opportunity.
Please assist me good Sir :)
Calculated padded input size per channel: (0 x 60 x 104). Kernel size: (1 x 1 x 1). Kernel size can't be greater than actual input size
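A note on what that error usually means: the "(0 x 60 x 104)" says the temporal dimension reaching a Conv3d is already zero, which typically happens when the frame count is too small and gets rounded down by temporal downsampling (Wan-style video VAEs compress time roughly 4:1, which is an assumption here, hence the usual "4n+1" frame counts like 81). A sketch using the standard convolution output-size formula:

```python
# Standard conv output-size formula: out = floor((in + 2*pad - k) / stride) + 1.
# With a temporal downsampling conv (kernel 4, stride 4, no padding, assumed
# here as an illustration), very small frame counts collapse to 0 latents,
# and the *next* conv then sees input size 0 and raises this exact error.
def conv_out(size: int, kernel: int, stride: int = 1, pad: int = 0) -> int:
    return (size + 2 * pad - kernel) // stride + 1

for frames in (1, 3, 4, 5, 81):
    print(frames, "->", conv_out(frames, kernel=4, stride=4))
# 1 and 3 frames -> 0; 4 and 5 -> 1; 81 -> 20. Bump the frame count
# (or check that your length/mask inputs aren't empty) to avoid the 0.
```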
Don't try it, it's a waste of time.
RepeatImageBatch tried to allocate 136GB of memory for an 81-frame (720x1280) video. I don't have that much RAM available; how can I optimize this?
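A back-of-envelope calculation shows where a number like 136GB comes from: ComfyUI holds images as float32 `[B, H, W, C]` tensors, so repeating the whole batch materializes a full copy per repeat. A hedged estimate (the layout and float32 assumption are mine, not from the error):

```python
# Rough memory estimate for an image batch, assuming float32 [B, H, W, C]
# with 3 channels, which is how ComfyUI typically holds images.
def batch_bytes(frames: int, h: int, w: int, repeats: int = 1,
                channels: int = 3, bytes_per_val: int = 4) -> int:
    return frames * h * w * channels * bytes_per_val * repeats

one_copy = batch_bytes(81, 720, 1280)
print(f"{one_copy / 1e9:.2f} GB per copy of 81 frames @ 720x1280")  # ~0.90 GB
# 136 GB therefore implies repeating the full 81-frame stack ~150 times.
# Repeat a single frame (batch of 1) and let downstream nodes broadcast,
# or lower the repeat count / resolution instead of duplicating every frame.
```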
It does not work! Waste of time!