These workflows are licensed under the GNU Affero General Public License, version 3 (AGPLv3) and constitute the "Program" under the terms of the license. If you modify and use these workflows in a networked service, you must make your modified versions available to users interacting with that service, as required by Section 13 of the AGPLv3.
https://www.gnu.org/licenses/agpl-3.0.en.html#license-text
TL;DR: The final result is an 8-second perfectly looped clip, built across three separate workflows.
Contained in the ZIP are three complementary workflows for progressively building a perfect loop using WAN 2.2 and WAN 2.1 VACE.
These workflows were designed through trial and error to give me the most consistent results when creating perfectly looped clips. The default settings are what work best for me at a processing speed I find acceptable.
The process is as follows:
1. wan22-1clip-scene-KJ.json
   - Generate a WAN 2.2 I2V clip from a reference image
   - Optional prompt extension using Qwen2.5-VL (requires a locally running Ollama server)
2. wan22-1clip-vace-KJ.json
   - Use the clip from step 1 in a V2V VACE workflow (WAN 2.1 for now)
   - The last 15 frames of clip 1 become the first 15 frames of the transition
   - The first 15 frames of clip 1 become the last 15 frames of the transition
   - 51 new frames are generated in between (see the sketch below)
   - Optionally generate the prompt using Qwen2.5-VL (requires a locally running Ollama server)
3. wan22-1clip-join.json
   - Join clip 1 + clip 2
   - Upscale to 720p
   - Smooth the upscaled clip using WAN 2.2 TI2V 5B (absurdly fast, great quality)
   - Interpolate to 60 fps using GIMM-VFI (swap to RIFE for speed if you want)
   - Color correct using the original reference image
The final result should be an 8 second perfectly looped clip.
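To make the loop-closing step in workflow 2 concrete, here is a rough sketch of what the VACE control input amounts to. This is not the actual node graph; the helper name, the neutral-gray placeholder value, and the mask convention are my own simplifications:

```python
import torch

def build_vace_transition_input(clip1_frames: torch.Tensor,
                                overlap: int = 15,
                                new_frames: int = 51):
    """Hypothetical helper: assemble control frames + mask for the transition.

    clip1_frames: (T, H, W, C) float tensor in [0, 1] from workflow 1.
    The transition starts where clip 1 ends and ends where clip 1 begins,
    so generating the middle frames is what closes the loop.
    """
    tail = clip1_frames[-overlap:]   # last 15 frames -> start of the transition
    head = clip1_frames[:overlap]    # first 15 frames -> end of the transition
    # Frames VACE should invent are left as neutral placeholders.
    blank = torch.full((new_frames, *clip1_frames.shape[1:]), 0.5)
    control = torch.cat([tail, blank, head], dim=0)   # 15 + 51 + 15 = 81 frames
    # Mask convention assumed here: 0 = keep the supplied frame, 1 = generate it.
    mask = torch.zeros(control.shape[0])
    mask[overlap:overlap + new_frames] = 1.0
    return control, mask
```

The joined result then plays clip 1 followed by the generated transition, which hands playback straight back to clip 1's first frame.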
There are more notes in the workflows. Please drop a comment if you have questions. They should work out of the box provided you have the required custom nodes, the latest Comfy, and PyTorch >= 2.7.1. Links to the models used are in the workflow notes.
I opted for KJ-based workflows because Native is slower for me. Select the smallest model quants that fit within your VRAM (or system RAM) when sampling; otherwise choose Q8 for the best quality. Be wary of the ComfyUI-MultiGPU custom node: for me it's slower than Native, both of which are slower than KJ with basic block swapping.
Comments (60)
Wow! amazing. Thank you for sharing the know-how.
10/10
Consider stripping out the ollama stuff since that's going to be beyond most users, and it muddies up an already complicated workflow.
This is obviously a very personal workflow and anything but a general one. The example renders show the need for an elaborate prompt (though it is easier to use a free online LLM to avoid clogging one's computer).
These workflows are more useful for people to learn from: to take away the parts and principles useful to them in their own workflows. Oh, and more people should be using WAN 2.2 as an upscaler; it is the best local solution we have for V2V.
There is a lot of great information here. Thank you for all your hard work.
Thanks for sharing this! Took a bit to pull apart and fully understand what was going on, but great stuff! Much appreciated
Great work! Took a bit of work to get everything set up, but it's working well now.
What GPU setup are you using with this workflow? On a 4090 I run into an out-of-memory error.
I am using a 3090 that has nothing else running on it. You may need to play around with the block swap node and attach it to the model loaders.
Caravel Thanks, that did the trick!
I got it with a 5090 and 64GB RAM too. --use-sage-attention --fast is set; it always happens at 73% in the clip join workflow. Still debugging the VRAM usage.
Does this work with cough complex scenes?
There's only one way to find out. I've not tried stacking extra LoRAs, so it may take some tweaking.
Also, so this basically runs 2s of WAN 2.2 frames and 6 sec of VACE? Is there any way to reverse the order lol o.o;;;;;
meowmeow12345 It's 81 frames (5 seconds) of WAN 2.2 with 51 frames of 2.1 VACE.
Caravel ah ic, ty ty, very nice. I will try this later :3
...vace 30gb lol
I noticed an issue with the Load Video node from the Video Helper Suite. It was causing the loaded video to gain a pink/red tint to it. I fixed it by using the Load Video FFmpeg node from VHS instead. I don't know if this is a universal issue or just an issue on my end.
In the end, I decided to save the scene and shift as a series of PNGs anyway since it's lossless. It improved the quality of the final video a bit for me.
I also noticed that it can be beneficial to remove the first few frames of the scene before doing the shift if there's very little motion at the start of the scene. It could make the final clip more fluid.
Interesting, thank you for the feedback. The color tinting looks to be a universal issue. I took a look at the VideoHelperSuite source code and it's using OpenCV to load the video frames in the default LoadVideoUpload node. This is probably the direct cause of the magenta shift: a colorspace mismatch issue.
You can check your video's encoding for yourself with ffmpeg:
ffmpeg -i <PATH TO VIDEO> -hide_banner
One of my videos reports yuv420p10le(tv, bt709, progressive). OpenCV always assumes a colorspace of BT.601.
Source, the OpenCV docs on RGB <-> YUV conversions:
https://docs.opencv.org/4.x/de/d25/imgproc_color_conversions.html
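If you want to see what that mismatch actually does to a pixel, here's a small self-contained illustration (plain numpy, full-range matrices for simplicity; this is not the VHS or OpenCV code):

```python
import numpy as np

# Generic full-range RGB <-> YUV conversion; only the luma coefficients
# (kr, kb) differ between BT.601 and BT.709.
def rgb_to_yuv(rgb, kr, kb):
    kg = 1.0 - kr - kb
    r, g, b = rgb
    y = kr * r + kg * g + kb * b
    u = (b - y) / (2 * (1 - kb))
    v = (r - y) / (2 * (1 - kr))
    return np.array([y, u, v])

def yuv_to_rgb(yuv, kr, kb):
    kg = 1.0 - kr - kb
    y, u, v = yuv
    r = y + 2 * (1 - kr) * v
    b = y + 2 * (1 - kb) * u
    g = (y - kr * r - kb * b) / kg
    return np.array([r, g, b])

BT709 = (0.2126, 0.0722)  # kr, kb used when the file was encoded
BT601 = (0.299, 0.114)    # kr, kb a BT.601-only decoder assumes

rgb_in = np.array([0.2, 0.8, 0.3])        # color the encoder actually saw
yuv = rgb_to_yuv(rgb_in, *BT709)          # stored as bt709 YUV
rgb_out = yuv_to_rgb(yuv, *BT601)         # decoded with the wrong matrix
print(rgb_in, "->", rgb_out)              # channels drift away from the original
```

The drift is small per pixel but consistent across every frame, which is why it reads as an overall tint.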
Thanks for the tip. I was experiencing the same issue, and the FFmpeg VHS node improves the color for me as well.
Good workflow! Had some trouble getting the GIMM-VFI Interpolation working, after which it took 3 1/2 hours to process through the node. Is this normal? On a 4090
No, I'm able to run both GIMM models on a 3090 in under 10 minutes with torch compile. Make sure you have all the available optimizations: latest CUDA + PyTorch and Sage Attention. Then you can run Comfy with the --use-sage-attention --fast options.
In the meantime, you can just swap the node out with RIFE VFI. The results are still good and it runs extremely fast.
Thanks for the awesome detailed workflows. I had one error and wanted to share the solution here. The clip join workflow needs to use the scaled text encoder: https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors
The workflow links the non-scaled one which didn't work for me
Thank you for catching that.
CLIPTextEncode.encode() missing 1 required positional argument: 'clip'
After 5 hours of trying with the help of ChatGPT-5, providing the whole .json and about 50 screenshots, we were unable to get this workflow working, unfortunately :(
ChatGPT doesn't replace your brain. This workflow requires a working knowledge of ComfyUI.
How to install Qwen2.5-VL to Ollama?
I noticed a slight issue with the final workflow. Because the frames between the last frame and the first frame don't get interpolated, it causes a slight hitch in the animation.
I managed to fix it for myself by adding a duplicate of the first frame at the end before interpolating, and then removing it afterwards. It makes the jump from the end of the video to the start a bit smoother.
Excellent catch, I didn't notice at first but I can see it now in all my previews.
Thank you! Simple fix but makes a noticeable difference, much appreciated!
How do I do that? Can you please share the workflow? Thank you in advance.
@AbsoluteAnime Not sure if this is how OP did it, but I just intercepted the image batch before it goes into GIMM-VFI Interpolate.
Use a Select Images node, set indexes to 0 for first frame. Add that to the batch with a Make Image Batch node. Perform interpolation. Pass the output through a second Select Images node to remove that added frame by setting indexes to 0:-1. Pass the final batch on down the line :)
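If it helps, here is the same idea as plain code. interpolate_fn stands in for the GIMM-VFI/RIFE node, and the function and names are just mine for illustration:

```python
import torch

def interpolate_loop(frames: torch.Tensor, interpolate_fn):
    """Pad with the first frame, interpolate, then drop the duplicate.

    frames: (T, H, W, C) batch of the joined loop.
    interpolate_fn: any VFI step that returns a longer batch which still
    starts and ends on the input's first/last frames.
    """
    # 1. Append a copy of frame 0 so the wrap-around gap gets in-betweens too.
    padded = torch.cat([frames, frames[:1]], dim=0)
    # 2. Interpolate the padded batch.
    out = interpolate_fn(padded)
    # 3. Drop the trailing duplicate (the "0:-1" select) so playback lands
    #    exactly on frame 0 when the clip repeats.
    return out[:-1]
```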
I'm still fighting with GIMM-VFI Interpolation. All dependencies are installed. I'm getting this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa5 in position 50: invalid start byte
Anyone had similar problem?
Make sure your ComfyUI and custom nodes are up to date and you are using Python 3.12. You can also just swap it out with RIFE or FILM for interpolation if it's holding you up.
@Caravel You were right. My Python was still 3.11. After that I needed to redownload GIMM-VFI and all requirements. Now it works. I previously used RIFE but its results are inferior compared to this. Vibrating eyes are finally gone. Thanks.
For some reason i only get noise videos 🥲🥲
I'm sorry to hear that. Please make sure your ComfyUI is fully updated to nightly along with all custom nodes, that you are using Python 3.12, and that you have at least PyTorch 2.7.1.
@Caravel I will try more today. I'm a runpod user, i will check those versions. Thanks for the reply.
Insane workflow! Managed to fit in 16GB VRAM after some tweaking. ^^
Very nice! Glad you liked it.
not working on my 16GB card..... what tweaks did you make?
@Gibson1337 You will need to connect the Block Swap nodes to the model loaders and tweak them until it starts to run without crashing.
@Caravel I did. but it was still oom no matter what... But it seems to be working today..... did not change anything lol oh well it works now..... thanks for the reply anyways!
what did you tweak please share
This is super cool, but is there a reason why this is 3 separate workflows instead of a single one?
It seems that it should be possible to do in one.
It is possible and I have an unpublished version which is all-in-one. I had issues with the Comfy cache getting wrecked and restarting my workflow, so I broke it up to save on system requirements and to avoid re-running everything. If you have 128GB+ of system ram, you can combine them.
Same, I added them all into one: ~17 min start to finish on a 4090 + 128GB. Definitely possible, but Comfy just EATS system RAM and won't release it, even with --cache-none in the bat file. Fine for a couple of runs, but then you need to restart after.
Great workflows.
I'm just getting this error with the join workflow. Tried a bunch of things but can't get it to work.
KSampler
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\XXXX\\AppData\\Local\\Temp\\torchinductor_XXXX\\triton\\0\\KEH7CWFWP4CGAOORGORGG5E7CGNBDPFOD74XERIBPLBQ4X6KSRUQ\\tmp.pid_77988_a2f7f710-3a96-4361-b0b5-bddb4f1e0bbb\\triton_red_fused__scaled_dot_product_cudnn_attention__to_copy_add_addmm_mean_mul_pow_rsqrt_select_transpose_view_2.source' Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
I ran into this some time ago. The problem was that Windows can't handle file paths with over 260 characters by default. I fixed it by setting LongPathsEnabled in regedit.
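For reference, that value lives under HKLM\SYSTEM\CurrentControlSet\Control\FileSystem. If you'd rather not click through regedit, a minimal way to set it from Python (Windows only, needs an elevated prompt, and reboot afterwards):

```python
import winreg  # Windows-only standard library module

# Enable Win32 long path support (the same thing the regedit fix does).
key = winreg.OpenKey(
    winreg.HKEY_LOCAL_MACHINE,
    r"SYSTEM\CurrentControlSet\Control\FileSystem",
    0,
    winreg.KEY_SET_VALUE,
)
winreg.SetValueEx(key, "LongPathsEnabled", 0, winreg.REG_DWORD, 1)
winreg.CloseKey(key)
```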
If anyone manages to make this work on RunPod, please drop me some help on what you did 🙏🙏🙏🙏
Trying now...
Nope, doesn't work. It seems it only supports certain resolutions, which is not useful to me unfortunately, so I won't try any longer:
mat1 and mat2 shapes cannot be multiplied (512x768 and 4096x5120)
@aigenerator456 Thanks for the follow-up.
I keep getting the error:
AttributeError: type object 'CompiledKernel' has no attribute 'launch_enter_hook'
python 3.12.9
triton: 3.3.1
torch: 2.7.1+cu128
sageAttention: 2.2.0+cu128torch2.7.1-cp312-cp312
I couldn't find a fix online
It can be clearly seen in your video with the fox in the black dress. When the shift comes in, all the colors get kind of blueish. On my videos it's even more visible. I know there is color correction in the third workflow, but it's not perfect. Any idea how to reduce this effect further?
You can add a ColorMatch node from KJNodes right before the final save in each workflow and match it to the first frame or a reference image, which somewhat mitigates it. This helps because VAE passes introduce a color shift, as do the default Load Video nodes from VideoHelperSuite. It's also better to replace the final EasyColorCorrect node with a basic color matcher or a manual color grading node (like the one from the LayerStyle pack).
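To show the kind of correction a reference-based match performs, here's a crude per-channel mean/std transfer in numpy. It's only an illustration of the idea, not the KJNodes ColorMatch implementation:

```python
import numpy as np

def match_to_reference(frames: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Shift each channel's statistics toward the reference image.

    frames: (T, H, W, 3) float array in [0, 1]; reference: (H, W, 3).
    """
    out = frames.copy()
    for c in range(3):
        src_mean, src_std = frames[..., c].mean(), frames[..., c].std() + 1e-6
        ref_mean, ref_std = reference[..., c].mean(), reference[..., c].std() + 1e-6
        # Rescale every frame's channel toward the reference image's statistics.
        out[..., c] = (frames[..., c] - src_mean) / src_std * ref_std + ref_mean
    return np.clip(out, 0.0, 1.0)
```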