A workflow I've made that generates a nice image with txt2img and then uses that image to make a video with Wan2.2 TI2V 5B Q4.
The idea is to generate the image at maximum quality and then use that image to generate the video faster. Image quality does drop in the video, but that's to be expected. You could try the regular (non-quantized) Wan2.2 TI2V model for better quality, but it requires more steps and a little more VRAM.
I created the compilation video with this workflow. The nodes can be found in ComfyUI Manager.
https://github.com/Isi-dev/ComfyUI_Animation_Nodes_and_Workflows/tree/main/animationWorkflows/joinVideos
Here is the link to the quantized versions of Wan2.2 TI2V:
https://huggingface.co/QuantStack/Wan2.2-TI2V-5B-GGUF/tree/main
Fast LoRA:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/FastWan/Wan2_2_5B_FastWanFullAttn_lora_rank_128_bf16.safetensors
All the checkpoints and LoRAs used in this workflow were found on Civitai or Hugging Face, or ComfyUI provided links to them.
Description
Tightened up the workflow and added some additional nodes for better quality.
Comments (3)
I am loving this workflow. Text to image and then a video in minutes on my 8 GB of VRAM. Any chance of an image-to-video version so we can input premade images?
You can do that by deleting the entire TXT2IMG portion, adding a Load Image node, and connecting it where the VAE Decode node was linked to the TI2V portion of the workflow.
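For anyone scripting ComfyUI through its API (JSON) format rather than the graph editor, the same rewiring can be done programmatically. This is a minimal sketch under assumptions: the node IDs, the `WanImageToVideo` class name, and the input names here are illustrative, not taken from the actual workflow file. `VAEDecode` and `LoadImage` are standard ComfyUI node class names.

```python
# Hypothetical fragment of a ComfyUI API-format workflow.
# Node IDs ("7", "10", "99") and the I2V node's input names are assumptions.
workflow = {
    "7": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["4", 2]}},
    "10": {"class_type": "WanImageToVideo",  # assumed name of the TI2V entry node
           "inputs": {"start_image": ["7", 0]}},
}

def swap_in_load_image(wf, decode_id, image_node_id, filename):
    """Replace every link from the VAE Decode node with a LoadImage node."""
    wf[image_node_id] = {"class_type": "LoadImage",
                         "inputs": {"image": filename}}
    for node in wf.values():
        for key, val in node.get("inputs", {}).items():
            # Links in API format are [source_node_id, output_index] pairs.
            if isinstance(val, list) and val and val[0] == decode_id:
                node["inputs"][key] = [image_node_id, 0]
    del wf[decode_id]  # the txt2img branch feeding it can be deleted too
    return wf

swap_in_load_image(workflow, "7", "99", "my_premade_image.png")
```

After the swap, the video branch reads its start frame from `LoadImage` instead of the txt2img decode, which is exactly the manual edit described above.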
I somehow missed that a 5B version existed. All the other "8 GB friendly workflows" I tried were 14B, which wasn't feasible; the compromises made it not worth it.
This workflow works great on my 6600 XT with 8 GB of VRAM. Thank you!!
I did have a question/feedback: VAE decode is by far the heaviest part of this workflow for me.
Your tiled decode had very high default values. Reducing them to 256, 64, 64, 8 massively reduced the decode time with minimal, if any, loss of quality. I have no idea what I'm doing; just thought I'd mention it in case there was a particular reason for those values.
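Why smaller tiles help: peak VRAM during a tiled decode scales roughly with the area of one tile, so halving the tile size (512 to 256) cuts the per-tile footprint to about a quarter, at the cost of decoding more, smaller tiles. A rough back-of-envelope sketch (the tile-counting formula here is an assumption about how tiled decoders cover the frame, not ComfyUI's exact implementation):

```python
import math

def tile_count(length, tile, overlap):
    """Tiles needed to cover `length` pixels, with adjacent tiles sharing `overlap` pixels."""
    step = tile - overlap
    return max(1, math.ceil((length - overlap) / step))

def decode_cost(width, height, tile, overlap):
    """Return (per-tile area, tile count): rough proxies for peak VRAM and total work."""
    n = tile_count(width, tile, overlap) * tile_count(height, tile, overlap)
    return tile * tile, n

# A 1280x704 frame, comparing a 512 tile size against the commenter's 256:
peak_512, n_512 = decode_cost(1280, 704, 512, 64)  # fewer, larger tiles
peak_256, n_256 = decode_cost(1280, 704, 256, 64)  # more, smaller tiles
# Per-tile area drops 4x (less peak VRAM); tile count grows, but each tile is cheap.
```

Quality is mostly preserved because the overlap regions are blended, which is why the commenter saw little or no visible loss despite the much smaller tiles.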