Based on https://civarchive.com/models/1197557/hunyuan-i2v, but with blockswap and teacache added for better performance on lower-VRAM cards, plus a Color Match node to better preserve the source image's colors.
I have only had luck using a Hunyuan motion LoRA in combination with the img2vid LoRA from LeapFusion. I am still looking for better prompting techniques, and for which LoRAs work well and which do not. If a LoRA is hard to trigger, put extra weight on its trigger word (for example, (8itchw4alk:1.4) for the Move Enhancer LoRA https://civarchive.com/models/1186768/moveenhancer).
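For example, a prompt using the Move Enhancer trigger word at extra weight might look like this (everything besides the weighted trigger token is purely illustrative):

```
a woman walking down a quiet street at dusk, (8itchw4alk:1.4), smooth natural motion, cinematic lighting
```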
Comments (38)
This version made my speed go from 129.44s/it to 9.93s/it
really?
@dominic1336756 Yeah, if I tried 512 resolution on the other workflow I'd get insanely slow speeds or even run out of VRAM. Anything under 512 ran at good speeds, though. On this workflow I can do 512 like normal.
@bhopping Yep, same experience I had.
@samsismas Apparently 16 GB of VRAM wasn't enough on that other workflow :(
Turned off blockswap and teacache out of interest, and the generation time went from 4 minutes to 4 hours)) Sageattn mode is much faster than sdpa. Thanks for the best workflow!!
Thanks for the sage attention tip too! Will try that.
@bhopping 3060 (12 GB):
sdpa (512x512, 65 frames, 50 steps): 13:50 total (15-20 s/it)
sageattn (512x512, 65 frames, 50 steps): 5:08 total (5-9 s/it)
@Nikmago What version of CUDA are you running? Does it work with 12.4?
@Nikmago I installed sageattention and did notice a speed increase, 9 s/it vs 6 s/it, so it only saved me a minute, but it seems to apply only in workflows that expose a sageattention option. The workflow I typically use now gets stuck halfway through generating and doesn't seem to work. Is there an alternate to this workflow, https://civitai.com/models/1092466/hunyuan-2step-t2v-and-upscale, that uses sageattention?
@bhopping Sorry, I do not know(
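For anyone wondering what the sageattn mode actually swaps in: it replaces PyTorch's sdpa kernel with SageAttention's quantized attention kernel, which is why it is a drop-in change. A minimal sketch, assuming the sageattention package's documented sageattn signature (the workflow's loader node handles this internally):

```python
import torch
import torch.nn.functional as F
from sageattention import sageattn  # pip install sageattention

# dummy attention inputs in (batch, heads, seq_len, head_dim) layout
q = torch.randn(1, 24, 4096, 128, dtype=torch.float16, device="cuda")
k = torch.randn(1, 24, 4096, 128, dtype=torch.float16, device="cuda")
v = torch.randn(1, 24, 4096, 128, dtype=torch.float16, device="cuda")

out_sdpa = F.scaled_dot_product_attention(q, k, v)  # stock PyTorch path
out_sage = sageattn(q, k, v, tensor_layout="HND")   # quantized SageAttention path
```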
Hi, I get the following error on the Color Match node using mkl (RuntimeError: stack expects a non-empty TensorList). Does anyone know how to fix this?
I don't know how to fix the error. How is the result if you bypass it? Sometimes the colors might be fine anyway (they don't deviate from the source image very much).
You could try updating KJNodes, if you haven't yet.
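For reference, that error is just PyTorch refusing to stack an empty list of tensors, which usually means the node received no valid reference image (for instance, from a zero-sized resize). A minimal repro, assuming the node calls torch.stack internally:

```python
import torch

# an empty tensor list reproduces the node's error exactly
torch.stack([])  # RuntimeError: stack expects a non-empty TensorList
```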
@samsismas Thanks, I'm just dumb; I didn't change the image resize dimensions.
@yl213 I got the same error. What do you mean about the resize dimensions?
Testing on a 3080 Ti (12 GB VRAM) on RunPod, I just cannot get past the LLM Download/Loader; it keeps crashing. I have this problem with other img2vid workflows on my home computer as well, which also has 12 GB VRAM, but I'm at work and haven't been able to test this one on my desktop; I just wanted to see if it would work for me. Any tips?
I wonder if it's the amount of regular RAM available in the RunPod, maybe? How much is available?
My PC stutters like CRAZY for a solid minute to get through that node. Once it does, you don't have to worry about loading it again unless you restart Comfy. This is with 16 GB VRAM and 32 GB of RAM. Generation speed also isn't ideal unless I stay around 512 resolutions.
After installing bitsandbytes I can get through it, but the next step puts me out of memory. That's my desktop, though; I'll have to try bitsandbytes on RunPod next.
@VilhelmSigSorensen669 You can try lowering the tile size on the VAE Decode Tile node, and/or test at a lower resolution and a shorter length, like 45 frames.
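For context on why a smaller tile helps: the tiled decoder only ever holds one strip of the video in VRAM at a time (the "Decoding rows" / "Blending tiles" lines in the logs further down this thread). A rough sketch of the idea, not the actual KJNodes implementation, assuming a vae object with a decode method:

```python
import torch

def tiled_decode(vae, latent, tile_rows=64):
    # latent: (batch, channels, frames, height, width); decode one
    # horizontal strip at a time so only one strip's activations are
    # ever resident on the GPU
    strips = [
        vae.decode(latent[..., y:y + tile_rows, :])
        for y in range(0, latent.shape[-2], tile_rows)
    ]
    # the real node overlaps the strips and blends the seams; plain
    # concatenation is enough to show the memory behaviour
    return torch.cat(strips, dim=-2)
```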
yeah doesn't work for me T p T
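For anyone else stuck at the LLM loader stage, the bitsandbytes route mentioned above amounts to loading the 8B text encoder quantized to 4-bit so it fits next to the video model. A hedged sketch using the Hugging Face transformers quantization config; the (down)load node may wire this up differently, and the path is the local folder from the logs in this thread:

```python
import torch
from transformers import AutoModel, BitsAndBytesConfig

quant = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit weights instead of fp16
    bnb_4bit_compute_dtype=torch.float16,  # compute still runs in fp16
)
text_encoder = AutoModel.from_pretrained(
    "models/LLM/llava-llama-3-8b-text-encoder-tokenizer",
    quantization_config=quant,
    device_map="auto",  # spill layers to CPU RAM if the GPU fills up
)
```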
Help: where can I download the LoRA strip_v2.safetensors?
Excellent work man, really well tooled for 12 GB (4070 here). It consistently uses 80%+ of my VRAM; it's difficult to downtune other workflows to make full use of 12 GB. One suggestion: incorporate the Skip Junk Frames step from the LeapFusion 2.0 workflow (https://civitai.com/models/1180764). It would solve the issue of the first few "flashing" frames at the beginning.
Thank you! That Skip Junk Frames step is exactly what it needs. I've been experimenting with saving the last frame in the workflow for extending videos, and this should help a lot with that too.
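Conceptually, skipping junk frames is just dropping the first few decoded frames before they reach the video combine node. A sketch, assuming images is the (frames, height, width, channels) batch ComfyUI passes between nodes, with the frame count as an arbitrary illustrative default:

```python
def skip_junk_frames(images, skip=4):
    # drop the flashing frames the LeapFusion img2vid LoRA produces at
    # the start of the clip (skip=4 is just an example value)
    return images[skip:]
```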
If you have VRAM problems you could try this:
https://github.com/pollockjj/ComfyUI-MultiGPU
I used it yesterday and lowered VRAM use by 30-40% according to ComfyUI, just by putting some models on the CPU and using Virtual VRAM. That let me almost double the resolution, where before I was at 97-98% VRAM.
@stylobcn Yes! Thanks for adding this. I have found it makes generation times more consistent, and it lets me go to higher resolutions or lengths without OOM errors.
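For anyone curious how this (and the blockswap already in the workflow) saves memory: model blocks live in system RAM and are moved onto the GPU only for their own forward pass. A rough sketch of the idea; the MultiGPU node's actual mechanism may differ:

```python
def forward_with_offload(blocks, x):
    # blocks: transformer blocks kept resident on the CPU
    for block in blocks:
        block.to("cuda")   # bring one block into VRAM
        x = block(x)
        block.to("cpu")    # evict it before loading the next
    return x
```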
Any idea why my output is extremely noisy and basically unusable when using this exact workflow?
Do you have an example to share? I could try to troubleshoot. Sounds like it could be a VAE problem, but I'm not sure. Are you on AMD or Nvidia?
Actually, I just tried it and am getting a noisy mess too. I ran an "update all" yesterday, so maybe something broke it.
I have the same issue. Using latest update of Comfy. It comes out looking like a badly corrupted MPG file.
With 12 GB VRAM it gives an error every time. Even with the same configuration and only 9 frames it always fails, and the same thing happens when trying other models:
got prompt
encoded latents shape torch.Size([1, 16, 1, 64, 48])
Loading text encoder model (clipL) from: D:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\models\clip\clip-vit-large-patch14
Text encoder to dtype: torch.float16
Loading tokenizer (clipL) from: D:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\models\clip\clip-vit-large-patch14
Loading text encoder model (llm) from: D:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\models\LLM\llava-llama-3-8b-text-encoder-tokenizer
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████| 4/4 [22:05<00:00, 331.43s/it]
Text encoder to dtype: torch.float16
!!! Exception during processing!!! Allocation on device
torch.OutOfMemoryError: Allocation on device
Got an OOM, unloading all loaded models.
Prompt executed in 1738.70 seconds
If I use a text2vid workflow there are no problems, but when using (down)Load HunyuanVideo TextEncoder it always gives a VRAM error. I have even tried with --lowvram and other options and it always fails. Thank you very much for sharing this with others and for your time, although unfortunately it doesn't work for me :(
This workflow is broken even for me right now, but not because of an out of memory issue. What size is the input image, after the resize node?
@samsismas I used the same size as in the txt2vid workflow I normally use, 416x720,
but I think I also tried it at 512x512 and the same thing happened.
I have now had to reinstall ComfyUI portable because I tried another workflow that left ComfyUI broken; I will try again and see if it still gives me the error.
@samsismas Good morning. It works for me now: after installing the latest version of ComfyUI with only the Manager and this workflow's nodes, I have been able to use it at 512x512 and also 480x640 (I still have to try more resolutions, etc.).
Swapping 20 double blocks and 0 single blocks
Single input latent frame detected, LeapFusion img2vid enabled
Sampling 29 frames in 8 latents at 480x640 with 20 inference steps
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [04:58<00:00, 14.93s/it]
Allocated memory: memory=6.107 GB
Max allocated memory: max_memory=8.187 GB
Max reserved memory: max_reserved=9.375 GB
Decoding rows: 100%|█████████████████████████████████████████████████████████████████████| 7/7 [00:14<00:00, 2.09s/it]
Blending tiles: 100%|████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 11.10it/s]
Prompt executed in 6283.18 seconds
thank you so much
@samsismas If anyone is running out of VRAM and getting errors, try the MultiGPU node with CPU offload and/or Virtual VRAM. Using it I have been able to run at higher resolutions, and even so my VRAM use dropped by 30-40% according to ComfyUI.