Wan 2.2 Animate Model Download and Setup (ComfyUI)
V2 confirmed to work on 12gb Vram up to Q8 25frames per loop
V2 confirmed to work on 4gb vram up to Q4 25frames per loop
To use this workflow in ComfyUI, download the models listed below and place them in the specified folders.
Make sure folder names and file names match exactly as shown to prevent load errors.
Main Diffusion Model (GGUF)
Model: Wan2.2-Animate-14B-GGUF
Download:
https://huggingface.co/QuantStack/Wan2.2-Animate-14B-GGUF
Put it here:
ComfyUI/models/diffusion_models/
Note:
This model is quantized in GGUF format. Choose the version that fits your GPU VRAM:
Q4_K_M → about 10–12 GB VRAM (balanced)
Q5_K_S → about 14–16 GB VRAM (recommended for mid-range GPUs)
Q6_K → about 20 GB or more VRAM (highest quality)
LoRAs
lightx2v I2V (animation motion LoRA)
Download:
https://huggingface.co/Kijai/WanVideo_comfy/resolve/main/Lightx2v/lightx2v_I2V_14B_480p_cfg_step_distill_rank64_bf16.safetensors
Put it here:
ComfyUI/models/loras/
WanAnimate relight LoRA (lighting and realism enhancer)
Download:
https://huggingface.co/Kijai/WanVideo_comfy/resolve/main/LoRAs/Wan22_relight/WanAnimate_relight_lora_fp16.safetensors
Put it here:
ComfyUI/models/loras/
Text Encoder
umt5_xxl_fp8_e4m3fn_scaled.safetensors
Download:
https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors
Put it here:
ComfyUI/models/text_encoders/
CLIP Vision Encoder
clip_vision_h.safetensors
Download:
https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors
Put it here:
ComfyUI/models/clip_visions/
VAE
wan_2.1_vae.safetensors
Download:
https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors
Put it here:
ComfyUI/models/vae/
Required Custom Nodes
Install these custom nodes either through ComfyUI Manager or by cloning them manually into the folder:
ComfyUI/custom_nodes/
comfyui_controlnet_aux
https://github.com/Fannovel16/comfyui_controlnet_aux
ComfyUI-KJNodes
https://github.com/kijai/ComfyUI-KJNodes
ComfyUI-segment-anything-2
https://github.com/kijai/ComfyUI-segment-anything-2
IAMCCS-nodes (only v1)
https://github.com/IAMCCS/IAMCCS-nodes
ComfyUI-VideoHelperSuite
https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite
Execution Inversion Demo (looping mechanism)
https://github.com/BadCafeCode/execution-inversion-demo-comfyui
Quick Start
Load this workflow in ComfyUI.
Upload your reference image and input video.
Adjust the positive and negative prompts.
Make sure the green points and red points are set up properly in the detection subgraph
Make sure the width and height values are multiples of 16.
Run the workflow and your final animation will be saved automatically.
Description
FAQ
Comments (10)
Thx for the workflow, not really done much video as I only have 12 Gb vram but I got it working. Half an hour for a 3 second video tough. Also, how do I remove the input image from the video? It always show the input image first and then fades to the video
Hey this was an issue that just disappeared for me after a few updates. I have a brand new update coming though and it doesent have the issue at all.
You've shoved one of the most important settings into a subgraph. That's not what subgraphs are for, dude. 🤦
Could you explain that to me and how i can make it better? (noob myself)
Hey yeah I do realize that working on a update right now with With a looping mechanism for infinite length generation and I have put the points editor out of the detection subgraph if that's what you meant.
I get a VHS failure to extract audio error. I have updated the nodes to the latest and I'm still getting errors. Does anyone have any ideas?
VHS failed to extract audio from H:\Stable_Diffusion-STABILITY\Packages\ComfyUI\input\wan22_00048.mp4: ffmpeg version 7.1-essentials_build-www.gyan.dev Copyright (c) 2000-2024 the FFmpeg developers built with gcc 14.2.0 (Rev1, Built by MSYS2 project) configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-dxva2 --enable-d3d11va --enable-d3d12va --enable-ffnvcodec --enable-libvpl --enable-nvdec --enable-nvenc --enable-vaapi --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband libavutil 59. 39.100 / 59. 39.100 libavcodec 61. 19.100 / 61. 19.100 libavformat 61. 7.100 / 61. 7.100 libavdevice 61. 3.100 / 61. 3.100 libavfilter 10. 4.100 / 10. 4.100 libswscale 8. 3.100 / 8. 3.100 libswresample 5. 3.100 / 5. 3.100 libpostproc 58. 3.100 / 58. 3.100 Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'H:\Stable_Diffusion-STABILITY\Packages\ComfyUI\input\wan22_00048.mp4': Metadata: major_brand : isom minor_version : 512 compatible_brands: isomiso2avc1mp41 creation_time : 2025-09-16T21:27:27.000000Z encoder : Lavf61.7.100 comment : {"prompt": "{\"6\": {\"inputs\": {\"text\": \"A woman wearing jeans and a shirt is standing, she starts to energetically dance, she moves her hips side to side ans swings her arms around,\", \"clip\": [\"38\", 0]}, \"class_type\": \"CLIPTextEncode\", \"_m Duration: 00:00:06.71, start: 0.000000, bitrate: 2307 kb/s Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 480x832, 2278 kb/s, 24 fps, 24 tbr, 12288 tbn (default) Metadata: creation_time : 2025-09-16T21:27:27.000000Z handler_name : VideoHandler vendor_id : [0][0][0][0] encoder : Lavc61.19.100 libx264 Output #0, f32le, to 'pipe:': [out#0/f32le @ 0000023cae89dc80] Output file does not contain any stream Error opening output file -. Error opening output files: Invalid argument
probably your video doesnt have audio, disconnect audio links and give it a try
@suraneither I did that first off. but I found another workflow that worked well with less problems. I will come back to this workflow just to see if I can get it working and compare. thanks for the reply.
@suraneither this
what loras are compatible with wan animate? like nsfw for example
