SVI Extend
https://github.com/vita-epfl/Stable-Video-Infinity/tree/svi_wan22
Create videos and extend them seamlessly using SVI.
The following SVI LoRAs are mandatory:
switch between the default behaviour, anchor_samples and end_frames within the same subgraphs
connect an image to a part and enable the respective toggles to use end_frames or anchor_samples
NEW! v3
Extend existing videos using https://github.com/wallen0322/ComfyUI-Wan22FMLF
enable "video extension" toggle inside the settings
uses source video resolution by default
rescale video using the megapixel slider by enabling "video rescale" toggle
use included version of the nodes from inside .zip or download the latest version straight from the git if issues arise
More info inside the workflow.
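For reference, the megapixel slider boils down to simple arithmetic: scale both sides by the square root of the ratio between the target pixel budget and the source pixel count, then snap to dimensions the model accepts. A minimal sketch of that idea (the function name and the rounding to multiples of 16 are my assumptions, not the node's exact implementation):

```python
import math

def rescale_to_megapixels(width, height, megapixels, multiple=16):
    # scale factor that brings width*height to the requested pixel budget
    scale = math.sqrt(megapixels * 1_000_000 / (width * height))
    # snap to multiples of 16, which WAN-style video models generally expect (assumption)
    new_w = max(multiple, round(width * scale / multiple) * multiple)
    new_h = max(multiple, round(height * scale / multiple) * multiple)
    return new_w, new_h

# a 1920x1080 source at a 0.5 MP budget comes out to roughly 944x528
print(rescale_to_megapixels(1920, 1080, 0.5))
```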
AIO i2v+t2v
All-in-one workflow for basic WAN 2.2 video generation.
The following features are included:
Switch seamlessly between 2-sampler and 3-sampler setups
Toggle between i2v and t2v
Postprod
Facedetailer
uses a T2V model + LoRA for inpainting; the required resources are included in the workflow
Toggle between GIMM VFI and RIFE VFI Interpolation
Upscale
Tensorrt Upscale with Model
Basic Video Upscale with Model
RTX Video Super Resolution Upscale (insanely fast for decent quality)
Frame Clipper
Seamless Loops using custom RIFE nodes https://github.com/Artificial-Sweetener/comfyui-WhiteRabbit
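Conceptually, the loop nodes bridge the gap between the last and the first frame with in-between frames so the video wraps around without a jump. The WhiteRabbit nodes use RIFE for proper motion interpolation; the sketch below only illustrates the idea with a naive linear crossfade (function name and frame layout are assumptions):

```python
import numpy as np

def close_loop(frames: np.ndarray, bridge_frames: int = 8) -> np.ndarray:
    """frames: float array of shape (N, H, W, 3) in [0, 1]."""
    last, first = frames[-1], frames[0]
    # blend weights from the last frame back into the first frame,
    # excluding the endpoints so no frame is duplicated
    weights = np.linspace(0.0, 1.0, bridge_frames + 2)[1:-1]
    bridge = np.stack([(1.0 - t) * last + t * first for t in weights])
    return np.concatenate([frames, bridge], axis=0)
```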
Upscale + Interpolate
I recommend using this workflow instead of upscaling inside the generation workflows: you never really know what results you will get, and you can end up upscaling a bad video and wasting time. I included toggles so you can't accidentally enable multiple interpolation or upscale nodes at once.
This includes:
WAN Facedetailer
use any WAN 2.2 T2V low-noise model + the following T2V LoRA:
lower the resolution from 768 to 512 if you run into VRAM issues
Put the following file into "ComfyUI\models\ultralytics\bbox" (see the detection sketch after this list):
WAN Refiner (massive VRAM cost)
increase the denoise value if you want stronger inpainting
Sharpen, Gamma, Brightness and Contrast controls (see the postprocessing sketch after this list)
Frame clipper (remove unwanted frames at the start and/or end)
GIMM VFI + RIFE VFI interpolation (I recommend GIMM VFI, much higher quality but also much slower)
Tensorrt Upscale + Basic Video Upscale
both use basic image upscaling models
Tensorrt (faster than Basic Video Upscale) with AnimeSharp4x is recommended for anime
RTX Video Super Resolution Upscale
insanely fast
decent quality
FlashSVR + SeedVR2
experimental
video upscaling models that are more sophisticated than basic image upscaling models
haven't had great results for anime yet
takes a LOT longer
Saving last frame for manual extensions
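For context on the bbox file above: the Facedetailer first runs an ultralytics detection model over each frame to find face boxes, then crops and inpaints only those regions. A rough sketch of the detection step, assuming the ultralytics Python package and a hypothetical face_yolov8m.pt weights file (the actual node wiring in ComfyUI differs):

```python
from ultralytics import YOLO
import numpy as np

# hypothetical file name inside the bbox folder mentioned above
detector = YOLO("ComfyUI/models/ultralytics/bbox/face_yolov8m.pt")

def face_crops(frame: np.ndarray, pad: int = 32):
    """Return padded face crops from one RGB frame of shape (H, W, 3)."""
    results = detector.predict(frame, verbose=False)
    h, w = frame.shape[:2]
    crops = []
    for x1, y1, x2, y2 in results[0].boxes.xyxy.tolist():
        x1, y1 = max(0, int(x1) - pad), max(0, int(y1) - pad)
        x2, y2 = min(w, int(x2) + pad), min(h, int(y2) + pad)
        crops.append(frame[y1:y2, x1:x2])
    return crops
```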
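The frame clipper and the tone controls are simple per-frame operations; here is a minimal numpy sketch of both (parameter names and the order of operations are assumptions, not the exact node behaviour):

```python
import numpy as np

def clip_and_adjust(frames, skip_start=0, skip_end=0,
                    gamma=1.0, brightness=0.0, contrast=1.0):
    """frames: float array of shape (N, H, W, 3) in [0, 1]."""
    # frame clipper: drop unwanted frames at the start and/or end
    end = len(frames) - skip_end
    frames = frames[skip_start:end]
    # contrast pivots around mid-grey, brightness is a flat offset
    out = (frames - 0.5) * contrast + 0.5 + brightness
    # gamma as a simple power curve on the clipped result
    return np.clip(out, 0.0, 1.0) ** gamma
```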
mmaudio
Workflow based on the NSFW version.
added an Audio Combine node
combine audio from an existing video with the generated audio on top (see the mixing sketch at the end of this section)
generate NSFW audio with the NSFW model, then combine that video with another audio track generated by the base model for background noise
removed interpolation for easier and faster audio generation; you have the following options:
upload the raw, un-upscaled video to the MMAudio Video node and the upscaled video to the Combine video node
upload the upscaled video to both nodes, but lower custom_width and custom_height of the MMAudio Video node to about half for faster generation and to prevent VRAM issues
upload the raw video to both nodes and upscale afterwards
Inspired by https://civarchive.com/models/2137833
The following resources are necessary (ComfyUI\models\mmaudio):
https://huggingface.co/Kijai/MMAudio_safetensors/resolve/main/mmaudio_synchformer_fp16.safetensors
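If you want to mix the original and the generated track outside of ComfyUI, the idea is the same as combining them in the workflow: resample to a common rate, trim to the shorter track, and sum with gains. A sketch using torchaudio (file names and gain values are placeholders; the Audio Combine node's internals may differ):

```python
import torchaudio

video_audio, sr = torchaudio.load("source_video_audio.wav")    # placeholder path
generated, sr_gen = torchaudio.load("mmaudio_background.wav")  # placeholder path

# resample the generated track to match the source sample rate
if sr_gen != sr:
    generated = torchaudio.functional.resample(generated, sr_gen, sr)

# trim both tracks to the shorter length and mix with simple gains
n = min(video_audio.shape[-1], generated.shape[-1])
mix = (0.8 * video_audio[..., :n] + 0.5 * generated[..., :n]).clamp(-1.0, 1.0)

torchaudio.save("combined_audio.wav", mix, sr)
```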