This workflow is based on https://github.com/kijai/ComfyUI-WanVideoWrapper
You can find all the required models on that GitHub page.
Additional LoRA for the Lightning version:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32.safetensors
A fun image-to-video model that's a bit different.
There are notes in the workflow to help guide you through how it works.
NOTE: There is a bug in the newest version of Plush that affects the prompt enhancer. Some people fixed it by uninstalling and reinstalling the custom node. Others had to downgrade the extension to an older version: https://github.com/glibsonoran/Plush-for-ComfyUI/tree/cb3c4777b54fc212770b2d91901e3a85d04e12d6
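If you installed Plush with git, one way to downgrade is to check out the commit from the link above. The following is only a rough sketch: it assumes the node lives in a typical ComfyUI/custom_nodes/Plush-for-ComfyUI folder (adjust the path for your install), and the helper names are made up for illustration. By default it only prints the git commands so you can review them first.

```python
import subprocess
from pathlib import Path

# Commit hash taken from the Plush link in the note above.
PLUSH_COMMIT = "cb3c4777b54fc212770b2d91901e3a85d04e12d6"


def pin_commands(node_dir, commit=PLUSH_COMMIT):
    """Build the git commands that pin a custom node checkout to one commit."""
    node_dir = str(Path(node_dir))
    return [
        ["git", "-C", node_dir, "fetch", "origin"],
        ["git", "-C", node_dir, "checkout", commit],
    ]


def pin_custom_node(node_dir, commit=PLUSH_COMMIT, dry_run=True):
    """Print the commands (dry_run=True) or actually run them (dry_run=False)."""
    commands = pin_commands(node_dir, commit)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical path: change it to wherever your ComfyUI install lives.
    pin_custom_node("ComfyUI/custom_nodes/Plush-for-ComfyUI")
```

Call pin_custom_node(..., dry_run=False) once the printed commands look right, then restart ComfyUI. If you installed the node through a manager instead of git, its own version switcher is probably the easier route.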
Updated workflow video.
Otherwise, please check out my tutorial video to help use the workflow.
Depending on the resolution and frame count, the workflow will run on GPUs with 16GB of VRAM or less. You can also increase the block swap to offload more of the model into system RAM instead of VRAM.
Description
For best results, use kijai's checkpoints for this workflow.
Other quantized versions will work but may produce unexpected results.
Comments (10)
Has anyone had good results with this on drawn/2D images? Most of the video stuff I see is that sort of 2.5D/realism style, and I feel like it doesn't handle animating 2D as well (but maybe I'm missing some settings/prompt stuff).
Also curious if there's a way to reduce/eliminate mouth movement on a character, since it seems to really want to animate mouth movement even when 'talking' and similar things are in the negative prompt.
Yeah, I see this problem too. So far I haven't seen any anime-style or illustration-style animations that looked decent. There is a new Hunyuan image-to-video model that's promising, but it has a lot of problems right now.
The problem with the "talking" seems to be related to the training data for these video models. I tried prompts like "moving mouth" etc., but it doesn't seem to help.
Is there any way to control the timeline in I2V Wan? I mean prompt scheduling. If yes, how? It's frustrating that the AI must do all the storytelling.
As far as I know, it doesn't work very well, since Wan produces the whole video at once instead of one frame after another like AnimateDiff does.
Hmm.. But since there is movement there must necessarily be some sort of timeline ... ?
@EmeraldApple My PC with a 3090 Ti just crashed again on those Kijai nodes with just 45 frames (with GGUF it works fine up to 120 frames).
Which torch, sage, triton version would you recommend for safe runs?
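When comparing setups for a question like this, it helps to post exactly what is installed. Here is a minimal sketch that prints the versions, assuming the packages are installed under the distribution names torch, triton and sageattention. Those names are an assumption and may differ on your system (some Windows builds use other names). Run it with the same Python environment that ComfyUI uses.

```python
import platform
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names; adjust if your install uses different ones.
PACKAGES = ("torch", "triton", "sageattention")


def installed_versions(packages=PACKAGES):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    print(f"python {platform.python_version()}")
    for name, ver in installed_versions().items():
        print(f"{name} {ver or 'not installed'}")
```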
I am interested in how you made the character turn around. What is a really good prompt for this? thanks!
DownloadAndLoadFlorence2Model
Unrecognized configuration class <class 'transformers_modules.Florence-2-large.configuration_florence2.Florence2LanguageConfig'> for this kind of AutoModel: AutoModelForCausalLM. Model type should be one of AriaTextConfig, ...