🚀 Wan2.1-VACE-14B (LoRA Accelerated): 10x Speed with CausVid LoRA for 3-Step Video Creation
🎬 Skyrocket Your Video Creation: Achieve ~10x Speed with Wan2.1 & the CausVid LoRA! 🎬
📋 Overview
The Wan2.1-VACE-14B video diffusion model, when supercharged by the CausVid LoRA, is designed for high-quality, highly efficient video generation. It particularly excels at 480p and 720p resolutions through a streamlined 3-step ComfyUI workflow. This guide will walk you through the setup process to unlock this accelerated video generation capability, including options for full precision and quantized models like the fast Q3KL GGUF.
🔑 Key Components
Diffusion Model (14B):
Full Precision: wan2.1_vace_14B_fp16.safetensors (Recommended for compatibility with LoRA examples)
Quantized (Civitai): wan2.1_vace_14B_Q4KM.safetensors
Quantized (GGUF - Civitai): wan2.1_vace_14B_Q3kl.gguf (Used in the 5-min example, requires GGUF loader)
Note: these are not the same as the GGUF files on Hugging Face (some quantization types are missing there). I tested those and they did not work for vid2vid tasks, so I built my own quantized versions, optimized and structured specifically for vid2vid compatibility and better results. If you need another quantization type, test the Hugging Face one first and then leave a comment!
Performance LoRA (Essential for Speed): Wan21_CausVid_14B_T2V_lora_rank32.safetensors (applied at a strength of around 1.0)
VAE: wan_2.1_vae.safetensors
Text Encoder: Choose one:
umt5_xxl_fp16.safetensors (Recommended to match Kijai's wrapper compatibility for the LoRA workflow)
umt5_xxl_fp8_e4m3fn_scaled.safetensors (Smaller, fp8 version)
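Before wiring anything up, it can save a debugging session to confirm that a downloaded `.gguf` file is actually a GGUF container, since interrupted downloads are a common cause of loader errors. The GGUF format begins with the 4-byte ASCII magic `GGUF`, so a minimal check looks like this (the path in the usage comment is illustrative):

```python
def is_gguf(path: str) -> bool:
    """Return True if the file starts with the 4-byte GGUF magic.

    A quick sanity check before pointing a GGUF loader node at a
    download; it does not validate the tensors inside the file.
    """
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# Example usage (illustrative path):
# print(is_gguf("ComfyUI/models/diffusion_models/wan2.1_vace_14B_Q3kl.gguf"))
```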
📁 File Organization
Place the downloaded files in the following structure within your ComfyUI directory:
ComfyUI/
└── models/
    ├── diffusion_models/
    │   └── wan2.1_vace_14B_fp16.safetensors   # Or Q4KM.safetensors, or Q3kl.gguf
    ├── text_encoders/
    │   └── umt5_xxl_fp16.safetensors          # Or the fp8 version
    ├── loras/
    │   └── Wan21_CausVid_14B_T2V_lora_rank32.safetensors
    └── vae/
        └── wan_2.1_vae.safetensors
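A quick way to confirm the layout above is a small script run from the directory that contains `ComfyUI/`. The file names match the fp16 variants listed in this guide; swap in the quantized model or the fp8 text encoder if you downloaded those instead:

```python
from pathlib import Path

# Expected files from the layout above (fp16 variants); adjust the names
# if you downloaded the Q4KM/Q3kl model or the fp8 text encoder instead.
EXPECTED = [
    "models/diffusion_models/wan2.1_vace_14B_fp16.safetensors",
    "models/text_encoders/umt5_xxl_fp16.safetensors",
    "models/loras/Wan21_CausVid_14B_T2V_lora_rank32.safetensors",
    "models/vae/wan_2.1_vae.safetensors",
]

def check_layout(comfyui_root: str) -> list[str]:
    """Return the expected model files missing under comfyui_root."""
    root = Path(comfyui_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]

missing = check_layout("ComfyUI")
if missing:
    print("Missing files:")
    for rel in missing:
        print("  -", rel)
else:
    print("All model files are in place.")
```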
🎨 Model Showcase: Rapid 720p Cinematic Shots
This setup, featuring Wan2.1-VACE-14B and the CausVid LoRA, excels at producing 720p (and 480p) video clips with remarkable speed, even faster with quantized GGUF models. It's ideal for quick iterations, creative experimentation, and efficient content creation, all streamlined by a 3-step workflow.
💡 Usage Tips
Model & LoRA Configuration: For maximum speed and quality, ensure you are using the appropriate 14B model (e.g., `wan2.1_vace_14B_fp16.safetensors` or `wan2.1_vace_14B_Q3kl.gguf`) paired with the `Wan21_CausVid_14B_T2V_lora_rank32.safetensors` LoRA. The LoRA should be applied with a strength typically around 1.0.
Text Encoder: The `umt5_xxl_fp16.safetensors` text encoder is recommended for best compatibility with existing examples and Kijai's original demonstrations. The fp8 version can save VRAM.
Resolution: This setup is optimized for 480p and 720p video generation.
Performance Gains:
Without LoRA (fp16): An 81-frame 720p video might take ~40 minutes on an RTX 4090.
With CausVid LoRA (fp16): The same video can be generated in ~4 minutes on an RTX 4090.
With CausVid LoRA & Q3KL GGUF: Potentially even faster, around 5 minutes or less for similar output on capable hardware with a GGUF loader.
Workflow Simplicity: The primary advantage, beyond speed, is the reduction to a 3-step generation process once models are loaded. This typically involves: 1. Prompting (Text Input), 2. KSampler (or equivalent node with LoRA and chosen model), 3. Video Combine (Output).
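As a rough sketch, those three steps map onto a node graph like the following. The node class names here are hypothetical placeholders (the actual workflow file shipped with this model is authoritative); the point is the shape: prompt in, one sampler pass with the CausVid LoRA applied at strength 1.0, video out.

```python
import json

# Hypothetical node names and illustrative sampler settings -- only the
# 3-step shape (prompt -> sampler -> video output) is taken from the guide.
workflow = {
    "1": {"class_type": "LoadDiffusionModel",   # load the 14B model
          "inputs": {"model_name": "wan2.1_vace_14B_fp16.safetensors"}},
    "2": {"class_type": "LoadLoRA",             # apply CausVid at strength 1.0
          "inputs": {"model": ["1", 0],
                     "lora_name": "Wan21_CausVid_14B_T2V_lora_rank32.safetensors",
                     "strength": 1.0}},
    "3": {"class_type": "PromptEncode",         # step 1: prompting
          "inputs": {"text": "a cinematic 720p shot of ..."}},
    "4": {"class_type": "KSampler",             # step 2: sampling
          "inputs": {"model": ["2", 0], "positive": ["3", 0],
                     "steps": 4, "cfg": 1.0}},
    "5": {"class_type": "VideoCombine",         # step 3: output
          "inputs": {"frames": ["4", 0], "frame_rate": 16}},
}
print(json.dumps(workflow, indent=2))
```

The bracketed pairs such as `["1", 0]` follow ComfyUI's API-format convention of referencing another node by id and output index.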
🙏 Credits & Acknowledgements
Original Wan 2.1 models repackaged for ComfyUI by Comfy-Org: Wan 2.1 ComfyUI Repackaged on Hugging Face.
The performance-boosting CausVid LoRA (`Wan21_CausVid_14B_T2V_lora_rank32.safetensors`) was extracted and shared by Kijai. Original announcement and details: Kijai's Reddit Post.
Quantized GGUF and safetensors versions are available on Civitai, enabling broader accessibility and speed.
Gratitude to the developers of the underlying CausVid technique (presumably available under an MIT License or similar open terms).
👨‍💻 Developer Information
This guide was created by Abdallah Al-Swaiti:
For additional tools and updates, check out my other repositories.
✨ Create Dreamy Videos with WAN 2.1 VACE and Pastel Dream! ✨
Comments (11)
If you face a CUDA graph error, start ComfyUI with: `python main.py --preview-method auto --force-fp16 --dont-upcast-attention --use-sage-attention --disable-cuda-malloc` (I prefer inductor, with this command: `python main.py --preview-method auto --force-fp16 --dont-upcast-attention --use-sage-attention`)
So people know: the new AccVideo Wan model (and presumably the LoRA too) seems to produce higher-quality outputs than CausVid while speeding things up quite a bit. Swap the CausVid LoRA for the AccVideo LoRA, make sure the CFG is 1, and set the steps to between 10 and 20. 10 is recommended, but I personally got better results with more. Each step takes less time, so 20 steps do not take as long as 20 steps with either standard Wan or CausVid.
Kijai has made the Lora and model available here:
https://huggingface.co/Kijai/WanVideo_comfy/tree/main
Did you try the workflow?
Sounds great... But I get this error: shape '[81, 120, 8, 96, 8]' is invalid for input of size 59781888
Try another video, or resize it (set the upscale node to 768 on the larger side).
Oh no, what did you do to the Bee Gees?
If they had this effect, they would use it! What do you think? (https://www.youtube.com/watch?v=4V90AmXnguw)
@AbdallahAlswa80 Oh yeah, of course I was only joking... but good video.
API Error: 429 You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits
Oh no!!! Google doesn't like me
Try another Gemini Flash model; it's free.
For a 50080 video card, what resolution would you recommend using?