I've created this Hunyuan Image2Video workflow based on Kijai's workflow and added an easy-to-use interface along with upscaling and frame interpolation.
Key Features:
1st pass - generates the vanilla video
2nd pass - upscales with a user-selected upscaler and interpolates frames to 48 FPS (see the sketch below)
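To make the interpolation step concrete, here is a minimal, illustrative Python sketch of doubling 24 FPS footage to 48 FPS by naive frame blending. The workflow itself uses a dedicated interpolation model inside ComfyUI, which handles motion far better; the OpenCV approach and file names below are assumptions for illustration only.

```python
# Naive 2x frame interpolation by blending (24 -> 48 FPS).
# This is only a sketch of the idea; a learned interpolator (as used
# in the workflow's 2nd pass) produces far better in-between frames.
# "input.mp4" / "output.mp4" are placeholder paths.
import cv2

cap = cv2.VideoCapture("input.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)  # e.g. 24
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

out = cv2.VideoWriter("output.mp4", cv2.VideoWriter_fourcc(*"mp4v"),
                      fps * 2, (w, h))

ok, prev = cap.read()
while ok:
    ok, nxt = cap.read()
    out.write(prev)
    if ok:
        # Insert a 50/50 blend between each pair of neighboring frames.
        out.write(cv2.addWeighted(prev, 0.5, nxt, 0.5, 0))
        prev = nxt

cap.release()
out.release()
```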
This workflow comes pre-loaded in my Hunyuan SkyReels RunPod template:
https://civarchive.com/articles/12253/runpod-template-hunyuan-image2video-with-workflows-included
Get the new Hunyuan I2V model from Kijai here:
Description
Modified the existing Native ComfyUI Workflow with a TeaCache sampler and added LoRA support
Comments
LFG finally the official Hunyuan img2vid. So excited for this!
I'm getting a lot of either fast, jerky movements or no movement at all. Does a second pass and/or interpolation fix this issue, or do I need to change settings like CFG scale and flow shift? Thanks
No, the Hunyuan I2V model is just not good compared to WAN for now. I've seen multiple folks complain about the same issue. Prompt adherence is terrible.
@Melty1989 This workflow seems to do movement pretty well though, even on the first pass. Prompt adherence is abysmal yeah I agree.
@funscripter627 @Melty1989
I just released a new version with native ComfyUI support.
Seems to work better.
@Hearmeman Generation time is much slower for me with the native nodes because the model won't fit in my VRAM, and that's with the FP8 model. (12GB 4080, 32GB RAM)
Movement seems more erratic too, but I have to test more.
@funscripter627 With 12GB of VRAM I'd wait for the GGUF models.
@Hearmeman Nah man, I'm generating 720x560x85 in 5 minutes which is good enough for me. Thank you for the workflow btw!
Any solution? I'm getting this error:
HyVideoI2VEncode
list index out of range
I think I had a similar error; it might be triggered by issues with the prompt length. Try shortening it, maybe.
@bhopping It was a really short prompt. I tried the Comfy native workflow and it did work, so I'm not sure what it is.
@Hearmeman So, based on some LoRAs I tested, not all of them play nicely with I2V. The Undress LoRA worked fine (as you can see from my post), but others, like Side anal sex - v1.0 | Hunyuan Video LoRA | Civitai, etc., had no movement at all.
I guess we might have to wait for some better LoRA implementation.
I'm just future-proofing my workflows :)
I haven't tried it myself, but this workflow mentions working with that LoRA specifically: https://civitai.com/models/1197557/hunyuani2vnfswworkflow?modelVersionId=1499035
@Kiefstorm I did try it. It's not a very efficient workflow for my 3080 (10 GB). Eventually it did complete, but the animations were jank as hell.
Any way to make this work for 16GB cards? I used your default settings and the fp8 model, only changing the resolution to 720x560x85, but even after an hour of waiting it's still at 0/20. No error, it just doesn't load.
Did you use Kijai's model?
@Hearmeman Yes, the one in the description
@Engelbert_Klaus Something's definitely wrong. With those settings I think it would only take 10-15 minutes, possibly less.
Try the Q8 GGUFs; they work quite fast even on a 4070 12GB. Adalov seems to like Q8 quantization. Check my post here for the nodes.
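For anyone weighing fp8 against the GGUF quants on a small card, here's a rough back-of-envelope sketch. The ~13B parameter count for HunyuanVideo and the bits-per-weight figures are my assumptions (nominal Q8_0/Q4_0 GGUF block layouts), and real VRAM use is higher once activations, the text encoder, and the VAE are loaded:

```python
# Rough weight-memory estimate per quantization (assumptions, not
# measurements): ~13B parameters for the HunyuanVideo DiT, and nominal
# GGUF block sizes for the quantized formats.
PARAMS = 13e9  # assumed parameter count

BITS_PER_WEIGHT = {
    "fp16": 16.0,
    "fp8":   8.0,
    "Q8_0":  8.5,  # 32 x 8-bit weights + one fp16 scale per block
    "Q4_0":  4.5,  # 32 x 4-bit weights + one fp16 scale per block
}

for name, bits in BITS_PER_WEIGHT.items():
    gib = PARAMS * bits / 8 / 2**30
    print(f"{name:>5}: ~{gib:.1f} GiB for weights alone")
```

Note that Q8_0 isn't actually smaller than fp8; the practical win on 12GB cards likely comes from the GGUF loader keeping weights quantized at runtime and offloading layers, rather than from a smaller file.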
@sikasolutionsworldwide709 Thanks! That works; I'm getting 20s/it, but the result is extremely bad, if you can even call it a result. I'll upload a post here showing it.
Is WAN better for I2V? I heard good things about it, but if it's only like 5% better, is it worth switching to? I'll probably test it myself out of curiosity; it could also be too soon to know yet.
For me, Wan gives exactly the same colors and face as the original image.
Hunyuan is bad at this; it doesn't give me the same colors, details, or face.
Wan is much better at interpreting complex motions and prompt adherence in i2v. It takes longer to infer but it's worth it since you'll usually get the footage you want.
@Melty1989 100%
Stick with Hunyuan bro, the official image2video gives quick and spectacular results. Never a problem.
So far in my experience, Hunyuan is superior for T2V: fast, efficient, and great results. But when it comes to I2V, Wan seems to be the better choice; yes, it's slow compared to Hunyuan, but the results are worth it. Looks like I'll be using Wan for I2V and Hunyuan for T2V.
@mrreclusive3545 I have better results with WAN T2V over Hunyuan. As one example, with T2V you just can't tell someone to take off their clothes in Hunyuan, no matter the prompt, without a LoRA. WAN can do it.
Thx for this workflow. I changed the TeaCache sampler, using the one with speed options. Gen time on a 4070 12GB, 32GB RAM with Q8 GGUFs is 670.98 secs. Used a screencap from your vid. Check my post.
Sounds good!
Thanks a lot for the buzz!
Please see the video I posted that I generated using your workflow. I'm hoping you can suggest what I did wrong to create that result.
@sarashinai Please send your settings
@Hearmeman I was hoping they were embedded in the video, let me know if they aren't. Also, that was using @sikasolutionsworldwide709's workflow derived from yours.
@sarashinai You're using llava_llama3_fp8_scaled.safetensors in the GGUF clip loader; you need to use a GGUF in the GGUF loader. https://huggingface.co/city96/llava-llama-3-8b-v1_1-imat-gguf/tree/main
@sikasolutionsworldwide709 You called it, works just like yours now, thank you for the help.
Well, after a few hours... I'm unable to make this work. I receive:
Missing Node Types:
HunyuanImageToVideo
TextEncodeHunyuanVideo_ImageToVideo
Let's try a different workflow
Did you update your ComfyUI to the latest version?
@Hearmeman Oh! Well, that was fast, hehe. Now I'm on v0.3.24 and the message is gone, thanks a lot.
If you add a color match block before the video combine and upscaling, the videos look less bright/airbrushed.
Interesting, could you post a WF? Thx.
What method and strength do you suggest?
@sarashinai I'll need to test the fixed model to see if the issue is gone, but I was using mvgd at 1.0
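For anyone who wants to try this outside the node graph, here's a minimal sketch, assuming the pip package color-matcher (which, as far as I know, is the library the ComfyUI color match node wraps). `frames` is a hypothetical list of HxWx3 arrays, and matching everything to the first frame at full strength corresponds to mvgd at 1.0:

```python
# Per-frame color matching against the first frame, to counter the
# bright/airbrushed drift mentioned above.
# Assumes `pip install color-matcher`.
import numpy as np
from color_matcher import ColorMatcher

cm = ColorMatcher()

def match_to_first(frames: list[np.ndarray]) -> list[np.ndarray]:
    """Return frames with every frame color-matched to frames[0]."""
    ref = frames[0]
    out = [ref]
    for frame in frames[1:]:
        # method="mvgd" at full strength, as suggested above; for a
        # lower strength you could blend the result with the original.
        out.append(cm.transfer(src=frame, ref=ref, method="mvgd"))
    return out
```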
You probably know but there are updated ("fixed") I2V HV weights you should probably link to (and test!)
Can you link it please?
I've been focusing on Wan
I'm using these: https://huggingface.co/Kijai/HunyuanVideo_comfy/blob/main/hunyuan_video_I2V_720_fixed_fp8_e4m3fn.safetensors (note the "fixed" in the name). TBH it doesn't seem much better to me. It is truer to the image in the first frame, but then just blows up. Hopefully someone will figure it out. For now, for I2V, Wan >> HV
@logenninefingers888 With the fixed version I can't even get the first frame to match and the overall results are worse. Would you mind sharing what you've been using for resolution/steps/etc?
@logenninefingers888 Thank you buddy, I'll test and update.
I updated the workflow.
The image is consistent with the reference image, but the results are not very good.
Waiting for people to upload some examples.