WAN 2.2 5b WhiteRabbit Interp-Loop
This ready-to-run ComfyUI workflow turns one image into a short looping video with WAN 2.2 5b. Then, it cleans the loop seam so it feels natural. Optionally, you can also boost the frame rate and upscale with ESRGAN.
In other words, this is an Image to Video workflow that creates loops with WAN 2.2 5b!
Why is this so complicated?!
WAN 2.2 5b does not fully support injecting frames after the first. If you try to inject a last frame, it will create a looping animation but the last 4 frames will be "dirty" with a strange "flash" at the end of the loop.
This workflow leverages custom nodes I designed to overcome this limitation: we trim out the dirty frames and then interpolate over the seam. Both a Simple and a Fancy trim/interpolate path are included, with a toggle so you can try each.
Model Setup (WAN 2.2 5b)
Install these into the usual ComfyUI folders. FP16 = best quality. FP8 = faster and lighter, with some trade-offs.
Diffusion model → models/diffusion_models/
- FP16: wan2.2_ti2v_5B_fp16.safetensors
- FP8: Wan2_2-TI2V-5B_fp8_e5m2_scaled_KJ.safetensors
Text encoder → models/text_encoders/
- FP16: umt5_xxl_fp16.safetensors
- FP8: umt5_xxl_fp8_e4m3fn_scaled.safetensors
VAE → models/vae/
- wan2.2_vae.safetensors
Optional LoRA → models/lora/
- Recommended: Live Wallpaper Style
Tip: keep subfolders like models/vae/wan2.2/ so your growing collection stays tidy.
How It Works
- Seam prep: we take the very last and first frames and generate new in-betweens that bridge them smoothly. Only those new frames get appended — no duplicate of frame 1.
- Full-clip interpolation (optional): multiply in-betweens across the whole video, then resample to any FPS you want.
- Upscale (optional): add an upscaler pass before full-clip interpolation using an ESRGAN model of your choice.
- Output: saved to your ComfyUI/output/ folder, filename prefix LoopVid.
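The seam prep above can be sketched in a few lines. This is a minimal illustration, not the actual node code: frames are represented by plain numbers, and a linear blend stands in for the real interpolation.

```python
def repair_loop_seam(frames, dirty=4, inbetweens=3):
    """Trim the dirty tail, then bridge last -> first with new in-betweens.
    A linear blend stands in for real frame interpolation; only the new
    frames are appended, so frame 0 is never duplicated."""
    clean = frames[:len(frames) - dirty]   # drop WAN's dirty tail frames
    last, first = clean[-1], clean[0]
    bridge = [last + (first - last) * i / (inbetweens + 1)
              for i in range(1, inbetweens + 1)]
    return clean + bridge

# 10 "frames" (each represented by a single value); trim 4, bridge with 3
looped = repair_loop_seam([float(i) for i in range(10)])
print(looped)  # [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 3.75, 2.5, 1.25]
```

Because the bridge ends just short of frame 0, playing the result on repeat lands cleanly back on the first frame.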
Controls You’ll Care About
Defaults are set for “safe on most GPUs.” Tweak if you have more VRAM.
Full-Clip Interpolation
- Roll & Multiply: add more in-betweens everywhere (e.g., ×3).
- Resample Framerate: convert to an exact FPS (e.g., 60). Great after Multiply, but you can also use it on its own.
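As a rough sketch of what resampling does, here is a nearest-frame mapping. The actual node may blend frames rather than pick them; this only shows the timing arithmetic.

```python
def resample_indices(n_frames, src_fps, dst_fps):
    """Map each output frame at dst_fps back to the nearest source
    frame at src_fps. A sketch of the timing math only; real
    resamplers may interpolate instead of picking."""
    n_out = round(n_frames * dst_fps / src_fps)
    return [min(n_frames - 1, round(i * src_fps / dst_fps))
            for i in range(n_out)]

# a 72-frame clip at 72 fps (24 fps after a x3 multiply), resampled to 60 fps
idx = resample_indices(72, 72, 60)
print(len(idx))  # 60
```

This is why Multiply first, then Resample works well: the multiplied clip has enough in-betweens that the nearest-frame picks land close to the ideal timestamps.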
Other handy knobs
- Duration: WAN cost climbs past ~3s (2.2 is tuned up to ~5s).
- Working Size: long edge in pixels (shape comes from your input image).
- Steps: ~30 is WAN 2.2’s sweet spot.
- CFG: WAN default is 5, I have it bumped a little higher. Higher = more “prompt strength,” sometimes more motion.
- Schedule Shift: motion vs. stability. Higher = more motion.
- Upscale: choose model/target size; reduce tile/batch if you hit OOM.
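For reference, here is roughly how a Working Size target becomes final dimensions, assuming WAN 2.2 5b's requirement that both sides be multiples of 32. The exact rounding the workflow uses may differ; this is a sketch.

```python
def working_dims(width, height, long_edge, step=32):
    """Scale so the long edge hits long_edge, then snap both sides
    down to a multiple of step (WAN 2.2 5b wants /32 dimensions)."""
    scale = long_edge / max(width, height)
    return (round(width * scale) // step * step,
            round(height * scale) // step * step)

print(working_dims(1920, 1080, 1280))  # (1280, 704): 720 snaps down to 704
```

This is also why a 1280x720 request comes out as 1280x704: 720 is not divisible by 32.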
You can find more detailed information on all these settings in the workflow itself.
Using Vision Models for Prompts (optional but handy)
If writing movement prompts feels daunting, you can use a vision model to get a great starting point. You have a few options.
Free Cloud Options
Google's Gemini or OpenAI's ChatGPT are free and will get the job done for most people.
- Upload your image and paste the prompt below.
- Copy the model’s description and paste it into this workflow’s Prompt field.
...however, these services are not exactly private and might censor lewd/NSFW requests. That's why you might prefer to explore the other two options.
Paid Cloud Options
Many services offer cloud model access, which is a more reliable way to get uncensored access to models.
You could pay for credits on OpenRouter, for example. Personally, I prefer Featherless because they charge a flat monthly fee, which keeps my costs predictable, and they have a strict no-log policy. If you decide you want to give them a try, you could always use my referral link, which helps me out!
If you decide to go the API/Paid Cloud route, you might find my app, CloudInterrogator, useful. It's designed to make prompting cloud vision models as easy as possible and it's fully free and open source!
Local Inference Option
I know a lot of people on CivitAI are local-or-nothing types. For you, there is Ollama.
Here's the best guide I could find on setting it up. You will want to look at Google's Gemma-3 family of models and pick the size appropriate for your card.
If you use Ollama, there's nothing stopping you from using CloudInterrogator as your access point since Ollama creates OpenAI compatible endpoints, or you could customize this workflow with Ollama nodes for ComfyUI. I don't recommend doing the latter unless you can set it to lock the prompt.
Many workflows for WAN build Gemma3/Ollama nodes into the workflow. I decided not to do that, because I think 99% of people are going to be well served by Gemini or ChatGPT.
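If you go the Ollama route, its OpenAI-compatible endpoint accepts the standard chat-completions payload. Here is a sketch of building one; the model name is an example, and you would send the payload with any OpenAI-style client pointed at http://localhost:11434/v1.

```python
import base64

def vision_payload(image_bytes, prompt, model="gemma3:4b"):
    """Build an OpenAI-style chat payload with an inline image for an
    Ollama /v1/chat/completions endpoint. The model name is an example;
    use whatever you pulled with `ollama pull`."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": "data:image/png;base64," + b64}},
            ],
        }],
    }
```

The returned description then gets pasted into the workflow's Prompt field, same as with the cloud options.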
Suggested prompt:
Analyze the content of this video frame and write a concise, single-paragraph description of your predictions around what movement takes place throughout the video sequence that follows.
Your description should include the details of the character and scene as a whole, but only as they relate to the movement that occurs in the scene. In addition, make note of the movements of the particles, blinking of the eyes if any, movement of the hair... this is a moment captured in time, and you are describing these few seconds encompassed by the image. Everything that can move, does move, even minute details of the scene.
Do not describe 'pauses'. Don't minimize the motion with words like 'slight' or 'subtle'. Do not use metaphorical language. Your description must be direct and decisive. Use simple, common language. Be specific, and describe how each detail in the scene moves, but do not be verbose; each word in your description must have purpose. Use the present tense, as if your predictions are coming true as you type them.
You will deliver one paragraph without any additional information and without any special characters that format this response. Avoid 'The image sequence depicts the character'; describe what happens, without saying 'the video...'.
You might also have good luck with the suggested prompt from AmazingSeek's workflow, depending on the model you use or what you're looking for!
Tips & Troubleshooting
WAN Framerate: WAN 2.2 is 24 fps. On WAN 2.1, if you decide to try it, set fps to 12 instead. There is a slider for this near the model loader node. The workflow auto-calculates what to do with your framerate (for multiplication and resampling) based on this number.
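The arithmetic behind that auto-calculation is roughly the following. This is a sketch under my reading of the workflow; the internal node names and exact logic differ.

```python
def plan_framerates(base_fps, multiply=1, resample_to=None):
    """An xN multiply raises the clip to base_fps * N; an optional
    resample then converts that to the requested output FPS."""
    after_multiply = base_fps * multiply
    out_fps = resample_to if resample_to else after_multiply
    return after_multiply, out_fps

# WAN 2.2 at 24 fps, x3 multiply, resampled to 60 fps
print(plan_framerates(24, multiply=3, resample_to=60))  # (72, 60)
```

Setting the base-fps slider wrong (e.g., 24 for a WAN 2.1 clip) throws this whole chain off, which is why the slider matters.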
Seam looks off? Try switching between Simple/Fancy seam interpolation; increase the auto-crop search in Fancy; or re-render with a slightly different prompt/CFG.
Out-of-memory (OOM)?
- Lower tile size (x and y) in the WanVideo Decode node.
- Lower Upscale tile size and/or batch size.
- Reduce Working Size or Duration.
- Enable “Use Tiled Encoder”.
AttributeError: type object 'CompiledKernel' has no attribute 'launch_enter_hook'
I'm not sure what causes this, though my assumption is it has something to do with the WAN Video Nodes. This should fix it for you:
1. Open "🧩 Manager"
2. Click "Install PIP Packages"
3. Install these two packages, leaving the quotes out: "SageAttention", "Triton-Windows".
3.1 Obviously, Triton-Windows is only for Windows users. If you get this error on Linux, I would guess the Triton package is just "Triton".
If this doesn't fix it for you, it may be that your ComfyUI Python environment is messed up for some reason or the version of Comfy you're using doesn't work with the Manager "Install PIP Packages" module. In that case, you might find this advice from the comments section helpful:
From alex223:
"i spent almost a day, but made it work. this thing helped, but also, for some reason my embedded python missed include and libs folder, I copied them from standalone version - that was essential for triton to work. Maybe my comment will help someone."
If you're still having problems, you can leave a comment. I don't mind trying to help people troubleshoot but I don't think the issue is with my workflow or with WhiteRabbit (my custom nodes).
Acknowledgements
- It occurred to me that interpolating over a loop seam might be a good solution to the "dirty frames" problem when I was first experimenting, but it was this workflow by AmazingSeek that really made me decide to go for it.
- It appears that Ekafalain should get some credit here, too, for their seamless loop workflow on which AmazingSeek's is based.
- While I didn't end up using any of their ideas directly, I want to shout out Caravel for their excellent multi-step process, which you can have a look at over here; it seems to primarily target WAN 2.2 14b. The level of documentation in that workflow alone is laudable.
- My recommended vision prompt is built off of NRDX's. You can find the original workflow it's from over on his Patreon. He's also the one training the LiveWallpaper LoRA for various WAN models!
P.S. 💖
If this workflow helps you, I’d love to see what you make! I put a lot of hard work into making it, including designing custom nodes to bring it all together and trying to document as much as possible so it is maximally useful to you.
Links
- Have a look at the WhiteRabbit repository for node documentation and atomic workflows if you want a better idea of how to build with the custom nodes here or tweak this workflow.
- My Website & Socials: See my art, poetry, and other dev updates at artificialsweetener.ai
- Buy Me a Coffee: You can help fuel more projects like this at my Ko-fi page
This workflow is dedicated to my beloved Cubby 🥰
- Find her artwork all over the internet
- She has many excellent LoRA on CivitAI for you to explore :3
Description
Initial Version. Set/Get nodes removed in production version to maximize compatibility with ComfyUI front-end versions.
Comments (22)
I keep running into this error at the sampler node
Triton and sage attention installed
WanVideoSampler
CalledProcessError: Command '['D:\\DVI\\ComfyUI_windows_portable\\python_embeded\\Lib\\site-packages\\triton\\runtime\\tcc\\tcc.exe', 'C:\\Users\\Nazz\\AppData\\Local\\Temp\\tmpaj_09keg\\cuda_utils.c', '-O3', '-shared', '-Wno-psabi', '-o', 'C:\\Users\\Nazz\\AppData\\Local\\Temp\\tmpaj_09keg\\cuda_utils.cp313-win_amd64.pyd', '-fPIC', '-lcuda', '-lpython3', '-LD:\\DVI\\ComfyUI_windows_portable\\python_embeded\\Lib\\site-packages\\triton\\backends\\nvidia\\lib', '-LC:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v13.0\\lib\\x64', '-ID:\\DVI\\ComfyUI_windows_portable\\python_embeded\\Lib\\site-packages\\triton\\backends\\nvidia\\include', '-IC:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v13.0\\include', '-IC:\\Users\\Nazz\\AppData\\Local\\Temp\\tmpaj_09keg', '-ID:\\DVI\\ComfyUI_windows_portable\\python_embeded\\Include']' returned non-zero exit status 1. Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
What flavor of ComfyUI do you have? Embedded, manual python, or did you install it as an "app"?
@ArtificialSweetener_ same here, portable comfyui, embedded python
@alex223 Open Manager and click "Install PIP Packages" in the "Experimental" area. Assuming you're on Windows, use it to install...
SageAttention
Triton-Windows
If this does not resolve your issue, your problem is still most likely caused by a misconfigured virtual environment (in your case, the embedded python environment). I can give more specific steps to absolutely fix it if doing this doesn't do it but the instructions will be a lot more complicated.
@ArtificialSweetener_ i spent almost a day, but made it work https://www.reddit.com/r/StableDiffusion/comments/1iyt7d7/automatic_installation_of_triton_and/ this thing helped, but also, for some reason my embedded python missed include and libs folder, I copied them from standalone version - that was essential for triton to work. Maybe my comment will help someone.
@alex223 I'm glad you were able to get it to work and also that I was right about it being a problem with missing Triton/SageAttention.
I don't recommend people use the embedded version of Comfy for this reason. I think it's better to just install it manually into a venv! It's just a lot harder to fix the embedded environment if something gets messed up.
This is really good and is giving excellent results. The only thing I'm struggling with is changing to resolutions that are not divisible by 32; specifically, 1280x720 gets rounded down to 1280x704.
I did it by having it pad to 736 and then cropping off the padding at the end.
@bigman11 That's a fine solution. WAN 2.2 5b requires 32 pixel steps so the workflow is configured to force it.
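For anyone else wanting to try it, a sketch of that pad-then-crop trick (this is my reading of bigman11's approach, not code from the workflow):

```python
def pad_to_step(h, step=32):
    """Pad a dimension up to the next multiple of step; return the
    padded size and the amount to crop off after generation."""
    padded = -(-h // step) * step   # ceiling to a multiple of step
    return padded, padded - h

print(pad_to_step(720))  # (736, 16): generate at 736, crop 16 px after
```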
@bigman11 Where is this pad located at? I have the same issue where I wish to keep 720p
Noob question: when switching over to "simple" interpolate I am getting this error:
RIFE_VFI_Opt VFI model RIFE Opt requires at least 2 frames to work with, only found 1. Please check the frame input using PreviewImage.
Not a noob question at all. This is my fault, I'm sorry.
This happens because for some reason the resized frame we use for color correction is the "image batch" being passed to RIFE VFI Multiply (RIFE_VFI_Opt is the name internally because it's my optimized version of the one from ComfyUI-Frame-Interpolation!).
What we want is for the image batch to be passed. What's probably happening is that when we set the Fancy mode as a passthrough, that single frame gets passed all the way through. That's an oversight on my part and something I need to fix. It wasn't a problem for me in testing because I am using Set/Get nodes in development and the "get" node that becomes the problematic hardline connection is disabled.
If you don't want to wait for a fix from me, the easiest thing to do...
1. Disconnect "Cut Dirty Frames" in the bypassed "Fancy Interpolate Over Seam" section. It's connected to "WanVideo Decode".
2. Disconnect the "image_ref" input of "Color Match to Input Image". It's connected to "Scale Input to Size".
Let me know if simple style works better for you. My suspicion is that you will get cleaner loops but that they sometimes have a "ping-pong" effect at the end. That's the problem Fancy tries to solve.
@ArtificialSweetener_ ah yes that fixed it, thanks for the quick reply! Will report back with results!
@Pixel_Music_Ai I am working on releasing a version that is fixed for everyone as well as hopefully improve looping performance. Your feedback has been invaluable. Thank you.
@ArtificialSweetener_ <3 another thing I've been playing around with is the # of trim end frames; sometimes reducing it to 2-3 produced pretty good results, since not all 4 frames are always "dirty"
@Pixel_Music_Ai Yes. It is one of the things I'm looking at! The reason seems to be because the end frame is not always injected at the last frame. I have a few ideas on solving that so it's more predictable.
You are starting to get the hang of the knobs and dials in the workflow. You might prefer to make loops with 5b in two steps: generate a batch of videos with WAN in one workflow, like you already do, then paste the second half of this workflow (all the interpolation stuff) into another and load your favorite clip with the VHS Load node. You'll be able to rapidly iterate over different settings to find a perfect loop for each clip you gen.
I will probably release a workflow for doing this because it's much more reliable, but the allure of fully automatic calls to me still.
Thank you again for your reports and sorry the fix took so long to release!
@ArtificialSweetener_ thank -you- for creating the workflow! and that was fast i'll play around with the new version when I have a chance
I just downloaded all models, encoders and vae suggested. I did not play around with any settings yet and im getting this error:
WanVideoSampler
AttributeError: type object 'CompiledKernel' has no attribute 'launch_enter_hook' Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
It's a problem with the WAN Video Nodepack afaik, but here's how you can probably fix it:
1. Open Manager
2. Click "Install PIP Packages" (under the "Experimental" area, lower left)
3. Enter these, without the quotes: "SageAttention", "Triton-Windows".
@ArtificialSweetener_ I found the cause: I switched my GPU to a 50-series card. After reinstalling Triton and SageAttention, everything is fine.
@RBoomer That makes sense. WanVideoSampler should really include Triton and Sage in its requirements, but those features are optional. Because of these problems, I was thinking of making a node that just sits in workflows and forces a platform-based installation of them.
Glad you got it fixed