Does what the name suggests.
Allows setting up any combination of keyframes - be it start and end, a single frame in the middle, or start->33%->66%->end.
Includes both Native and KJ's Wrapper solutions, but Native doesn't work properly for some reason.
Wflow consists of three parts:
Keyframes setup
Native VACE
Wan Wrapper VACE
You can set up frames any way you want, but keep masks consistent with keyframes: their positions in the batch should match. Basically, you should 'protect' keyframes from VACE with empty masks, which also allows them to be used as reference. As an example, the .zip contains an image grid that is split into four images used as keyframes; the gaps between them are filled with empty frames. This is the only relevant part of the wflow; the rest can be found in examples.
Native VACE and Wan Wrapper VACE are functionally the same, just different implementations - use one of them. Provide a prompt and hit run. CausVid and AccVideo loras are used to speed up generation at some cost in quality; both are provided in KJ's Hugging Face repo.
Native currently isn't working (or rather, start-to-end frames work, but the other keyframe layouts don't).
WanWrapper has a separate VACE module (grab it from KJ's repo), meaning it can be used with various finetunes such as Phantom and MoviiGen, which makes it the preferable option.
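The batch layout described above can be sketched in plain Python. This is only an illustration of how keyframe and mask positions line up, not the workflow's actual nodes: the function names `evenly_spaced_positions` and `build_batches` are hypothetical, the 81-frame length is an assumed example value, and the mask convention (0 = protected keyframe, 1 = frame VACE is free to generate) is an assumption about what 'empty mask' means here.

```python
def evenly_spaced_positions(num_frames: int, num_keyframes: int) -> list[int]:
    """Return frame indices for keyframes spread from the first to the last frame.

    A single keyframe is placed in the middle of the clip.
    """
    if not 1 <= num_keyframes <= num_frames:
        raise ValueError("num_keyframes must be between 1 and num_frames")
    if num_keyframes == 1:
        return [num_frames // 2]
    last = num_frames - 1
    return [round(i * last / (num_keyframes - 1)) for i in range(num_keyframes)]


def build_batches(num_frames: int, positions: list[int]) -> tuple[list[str], list[int]]:
    """Build parallel frame and mask layouts of equal length.

    Assumed convention: mask 0 protects a keyframe, mask 1 lets VACE generate.
    """
    frames = ["empty"] * num_frames  # gaps between keyframes are empty frames
    masks = [1] * num_frames         # by default every frame is generated
    for k, pos in enumerate(positions):
        frames[pos] = f"keyframe_{k}"  # keyframe image goes at this index
        masks[pos] = 0                 # the mask at the SAME index protects it
    return frames, masks


# start -> 33% -> 66% -> end over an assumed 81-frame clip
positions = evenly_spaced_positions(81, 4)
frames, masks = build_batches(81, positions)
print(positions)
```

The point of the sketch is that both lists are indexed identically: wherever a keyframe sits in the image batch, the protecting mask sits at the same index in the mask batch. A single middle keyframe is the same thing with `evenly_spaced_positions(81, 1)`.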
VACE and t2v by KJ: https://huggingface.co/Kijai/WanVideo_comfy/tree/main
VACE gguf: https://huggingface.co/QuantStack/Wan2.1-VACE-14B-GGUF/tree/main
VACE native: https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/diffusion_models
Description
initial release 2.06.2025
Comments (13)
Hey bro, awesome workflow! I'm still kinda lost with how to use it though. Mind giving a short breakdown or steps on how it works? Would really help a lot!
Also, if you're down, I’d be happy to help clean it up a bit and organize it into a more user-friendly version. Could be useful for others too!
Thanks again 🙌
wdym? it's the most user-friendly workflow i ever made. i intentionally pruned it to the bare minimum to make it easier to understand and/or easier to integrate into your workflow.
added description to wflow page of what each segment does.
if you manage to simplify this wflow or at least make it less confusing, i'll gladly update it, slightly touching-up technical side if needed.
'WanModel' object has no attribute 'vace_patch_embedding'
you disconnected 'WanVideo VACE Model Select' node from 'WanVideo Model Loader' node
I was hoping this workflow could be used so that I provide a single image, and it uses this image as the middle or end of the video, rather than the first frame like how WAN normally does it. Is this possible, or am I required to use a grid of different images as the input?
yes, it is possible, as it's stated twice on wflow's page and in wflow itself.
grid image was used instead of four separate images and is relevant only as example
you need to make image and mask batches in 'Image Batch Multi' nodes - surround your image with empty frames on both sides
I'm still new-ish to both comfy and wan, I thought I had somewhat of a handle on things, but this is a bit beyond me. I'm assuming that to do what you're describing I'd need to... Bypass all the select image nodes and connect the load image node directly to one of the points in one of the image batch multi nodes...? Probably not.
reuploaded with middle keyframe example
@cgibson Looks good, thanks!
I'm getting a "WanVideoVACEEncode: Allocation on device" error however. Is the resolution of the source image too high? It seems to work with your example image. I've left the width and height settings in the load & resize image node at 512, but it doesn't appear to be resizing the image, as the "get image size & count" node is showing the original resolution of the image.
EDIT Checking the resize button seems to have fixed that error, but now I'm getting a new error "WanVideoModelLoader: Can't import SageAttention: No module named 'sageattention'"
i forgot to toggle 'resize' option in the node to true - do that, it's the second row.
also, refer to example workflows from WanWrapper - they have detailed notes on how to use it. pay special attention to 'WanVideo BlockSwap' node, as it lets you get around the usual vram limitations by increasing blocks_to_swap.
@cgibson Ah damn, it's a VRAM limitation? I haven't encountered those on any other workflows, I guess this one is more demanding?
vace itself requires more vram (about 3GB on top of t2v ckpt), no matter which workflow you use.
did you actually resize your source image to smaller resolution?
@cgibson It appeared like the node was resizing the image, but perhaps it wasn't. After all, the example image you provided works fine, so I might have to try manually resizing my input images to a similar size and see if these work.