Using the native ComfyUI implementation of Hunyuan, I have tweaked the workflow to run on 12GB VRAM cards. It looks like you can get to at least 73 frames, and probably a bit more. Running 20 steps takes about 8 minutes on a 4070 Ti.
Make sure to update ComfyUI. To get the exact result above, raise the guidance to 10 (I reduced it in the base workflow because it caused some burning with some prompts).
As it's not as complete as the wrapper node, it has a few fewer features for now. But there are also no crazy special installations that you need to do.
Links to model downloads here:
https://comfyanonymous.github.io/ComfyUI_examples/hunyuan_video/
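For picking a frame count and estimating run time, here is a minimal sketch. It assumes (this is not stated in the post) that the Hunyuan video VAE compresses time by 4x and space by 8x into 16 latent channels, so frame counts of the form 4k + 1, such as 73, map cleanly onto latent frames. The function names and the 848x480 resolution are hypothetical and chosen only for illustration.

```python
def latent_shape(frames, height, width, batch=1):
    """Estimate the latent tensor shape for a video of the given size.

    Assumption: 4x temporal and 8x spatial compression, 16 latent channels.
    """
    latent_frames = (frames - 1) // 4 + 1
    return (batch, 16, latent_frames, height // 8, width // 8)


def nearest_valid_frames(frames):
    """Round a frame count down to the nearest value of the form 4k + 1."""
    return ((frames - 1) // 4) * 4 + 1


def seconds_per_step(total_minutes, steps):
    """Average sampling time per step, in seconds."""
    return total_minutes * 60 / steps


shape = latent_shape(73, 480, 848)       # hypothetical 848x480 resolution
per_step = seconds_per_step(8, 20)       # figures quoted in the post
```

With the numbers quoted above (8 minutes for 20 steps), this works out to about 24 seconds per step, which gives a rough baseline to compare your own runs against.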
Comments (44)
BEST !
Perfect timing for Project Odyssey! I was just about to dive into Hunyuan. Also, congrats on reducing it to 12GB from Kijai's 16GB.
I keep getting "EmptyHunyuanLatentVideo" is missing
You need to update ComfyUI and all the custom nodes involved.
@epsilon9 Ended up being related to the desktop version of ComfyUI. Using portable worked.
@epsilon9 Question: how do I actually update ComfyUI (desktop app)? "Update all" for nodes in Manager does nothing for me.
@marcouscousaurelius It will be pushed there eventually. Apparently it's not on the desktop standalone yet.
install this node, it has the node you need in it -- ComfyUI-HunyuanVideoWrapper
Still missing, even though I installed ComfyUI-HunyuanVideoWrapper???
Using the portable standalone with Kijai's nodes installed, I also don't have this node, and Manager can't locate what's missing. Using the webui.
Fixed it. Manually download https://github.com/comfyanonymous/ComfyUI/blob/e4e1bff60532ea1a2e2550a1d9beb9b87bfd8c7c/nodes.py and put it in ComfyUI\custom_nodes\ComfyUI-HunyuanVideoWrapper and then https://github.com/comfyanonymous/ComfyUI/blob/e4e1bff60532ea1a2e2550a1d9beb9b87bfd8c7c/comfy_extras/nodes_hunyuan.py and put it in ComfyUI\comfy_extras.
@mwoody450 This fixed my issue too, but now I'm getting a new problem: it says "ValueError: too many values to unpack (expected 4)" when it gets to the sampler node.
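For context on that message: Python raises it when code unpacks a sequence into fewer names than the sequence has elements. One plausible cause here, offered as an assumption and not a confirmed diagnosis, is that mixing files from different ComfyUI versions leaves sampler code written for 4-D image latents receiving a 5-D video latent. The sketch below reproduces the error with plain tuples; the function name and shapes are hypothetical.

```python
def unpack_image_latent(shape):
    # Code written for image latents expects exactly four dimensions.
    batch, channels, height, width = shape
    return batch, channels, height, width


image_shape = (1, 16, 60, 106)        # batch, channels, height, width
video_shape = (1, 16, 19, 60, 106)    # extra dimension for frames

unpack_image_latent(image_shape)      # four values: works

try:
    unpack_image_latent(video_shape)  # five values: raises ValueError
    message = None
except ValueError as err:
    message = str(err)
```

If this is what is happening, updating ComfyUI as a whole (so that every file comes from the same version) is more likely to help than copying individual files.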
@mwoody450 thank you very much, this solved my problem too
Where is the download link for the new version of the model?
is this it?
tencent/HunyuanVideo at main
Is this for lower VRAM?
How can I add a LoRA to this workflow?
How do I add a LoRA to GGUF?
Anyone able to get this working using AMD?
It's taking me about 3 hours to do a 3 second video on a 4070ti - what the hell am I doing wrong? I must be missing some optimisations or something
It would be awesome to see you make a new workflow with lora support!
Does anyone have loras working on a 3090? As soon as I add in a lora node to this workflow (or others...), my GPU memory usage shoots up to 27GB+ and ComfyUI just stalls out.
It takes an hour on my 4080 with the default settings, and even longer if I set it to fp8_e4m3fn, and I don't know why.
I tried this (and other) workflows too on my 4080 and it would take ages to generate a video. What am I doing wrong? I just installed ComfyUI, downloaded everything this workflow needed, and it just does not generate anything, not even in 10 hours...
My 4090 is using all 24GB of VRAM and it takes one hour to make something, while the PC can barely do anything else. What am I doing wrong? Also, why does every prompt I try to make realistic come out in anime style?
Is there hope for my 2060 6GB?
No matter what I try I can't get comfyui to recognize my text_encoders
Just generates pixelated noise, like TV static. Can't figure it out. Comfy is so complicated. 4th time attempting to learn this UI and there's just so many moving parts and one tiny little piece missing is all it takes. Impossible to tell what's gone wrong in order to fix.
I've tried so many workflows, including this one, and it takes 4 hours and produces just noise. I have an RTX 3060 with 12GB VRAM and 64GB system RAM, so it should be well within spec. Anyone else have this?
I am having trouble with the VAE loader. The start of the error message is "HyVideoVAELoader
Error(s) in loading state_dict for AutoencoderKLCausal3D:". It seems like the VAE listed here is incompatible with HyVideoVAELoader, even though I downloaded and used the one mentioned in this tutorial. Any ideas why it can't load?
Very nice! With your workflow on my 4070 Ti Super, it only takes 4 minutes to render 120 seconds of video, and the quality is very good. Your time is very much appreciated!
I downloaded the workflow.
I followed the instructions where to get the models. (seemingly pointing to 16 bit checkpoints)
OOM on cuda 0
I found instead an 8 bit model to correspond to comfyui --fp8_e4m3fn-unet
OOM on cuda 0
I replaced the nodes that load VAE and text models with the nodes from multiGPU to place the text and vae models on cuda 1,2,3 devices.
Still OOM on cuda 0
I have 4 GPUs (old, but they each have 12.2GiB VRAM)
Supposedly the workflow should work.
Why isn't it?
Any suggestions are welcome, which nodes, which specific checkpoints I should use. Perhaps the urls here are outdated?
Thanks.
Are there text encoding models that can handle longer prompts and work with Hunyuan Video model?
"Token indices sequence length is longer than the specified maximum sequence length for this model (192 > 77). Running this sequence through the model will result in indexing errors."
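The 77 in that warning is the context length of CLIP-style text encoders, and 192 is the number of tokens the prompt produced. As an illustration only (this is not ComfyUI's implementation, and the function name is hypothetical), one common way to fit a long prompt into such an encoder is to split the token ids into windows, reserving two slots per window for start and end markers:

```python
def chunk_tokens(token_ids, max_length=77):
    """Split token ids into windows that fit a fixed context length.

    Two positions per window are reserved for start/end markers,
    leaving max_length - 2 positions for prompt tokens.
    """
    body = max_length - 2
    return [token_ids[i:i + body] for i in range(0, len(token_ids), body)]


# Stand-in ids for a 192-token prompt, as in the warning above.
chunks = chunk_tokens(list(range(192)))
chunk_sizes = [len(c) for c in chunks]
```

Under these assumptions a 192-token prompt splits into three windows. Whether the extra windows are actually used depends on the text encoder node, so a shorter prompt remains the safe option.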
Thanks for this workflow. It ran very well on my RTX 3060, with generation times of about 6 to 8 minutes with 2 LoRAs loaded. Great work.
First time I finally managed to run a workflow O.o I've been trying quite a lot of i2v Hunyuan workflows today and can't seem to make any of them work. If you could manage to put together an image-to-video workflow as efficient as this one, that'd be lovely!
Cheers. Thanks for sharing.
Are you using the base bf16 model to do this? The 24GB model?
How do I use a LoRA with this?
This workflow instantly filled my 24gb of VRAM and started swapping into system RAM, becoming unbearably slow.
I noticed that my base model is called hunyuan_video_720_cfgdistill_bf16 instead of hunyuan_video_t2v_720_bf16. I found the download, and both files seem to be the same, at least size-wise. Any idea why it's still chugging so much memory?
Dude this runs amazing on my 4070 12gb and 32 gb system ram. Only took about 10 minutes and great quality.
Works very well on a 5070 Ti. The only mod I made was adding a LoRA to the workflow. Total time is 8.5 minutes with the workflow's default settings.
Not working. On launch, all 32GB of my RAM gets loaded and then the launch is interrupted.
It takes my RTX 3060 12GB about 30 minutes to generate 20 frames with this. Not sure how I can get it down lower.
There is much more you can do with WAN these days. I would look for one of the new workflows there.
LTXV