Update ComfyUI to the latest GitHub version: open a terminal in the ComfyUI directory, run git pull, then restart ComfyUI.
The upscale workflow file contains Wan FlashVSR, which is the fastest option but resource-heavy at 10GB VRAM. The Hunyuan SR workflow is slower but VRAM-friendly.
Lightx2v 4steps LoRA: https://civarchive.com/models/2162543
Prompt guide: https://civarchive.com/articles/22889/hunyuan-15-sudio-prompt-generator-and-guide
CFG 1, steps 30-50 (without the 4-step LoRA)
Workflows (i2v & t2v) are in the zip files, along with download links (text encoders, VAE, CLIP Vision, upscalers)
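The sampler settings noted above can be written down as a small helper. This is only a sketch: the function name `sampler_settings` is hypothetical, the 4-step count is inferred from the LoRA's name, and using CFG 1 with the LoRA is an assumption rather than something this page states.

```python
def sampler_settings(use_4step_lora: bool, steps_without_lora: int = 30) -> dict:
    """Return a steps/CFG pair for the sampler (illustrative helper only)."""
    if use_4step_lora:
        # Assumption: the "4steps" LoRA is run at 4 steps with CFG 1.
        return {"steps": 4, "cfg": 1.0}
    # From the page notes: CFG 1 with 30-50 steps when the LoRA is not used.
    if not 30 <= steps_without_lora <= 50:
        raise ValueError("without the 4-step LoRA, use 30-50 steps")
    return {"steps": steps_without_lora, "cfg": 1.0}


if __name__ == "__main__":
    print(sampler_settings(use_4step_lora=False, steps_without_lora=40))
    print(sampler_settings(use_4step_lora=True))
```

The returned values would then be entered by hand into the sampler node of the workflow; nothing here talks to ComfyUI itself.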
Type: Lightweight, open-source video generation model (Diffusion Transformer, 8.3B parameters)
Capabilities: High-quality text-to-video (T2V) and image-to-video (I2V) synthesis
Efficiency Features: Selective and Sliding Tile Attention for faster inference on consumer GPUs
Additional Support: Bilingual prompts, integrated super-resolution to 1080p
Performance: State-of-the-art visual quality and motion coherence among open-source models
These models are redistributed here for the sake of convenience.
Description
Super Resolution - Latent Upsampler 720p Distilled Model
Comments (9)
How do I use it? I tried several workflows, and I get a "tensor size mismatch" error when I try to use it as a drop-in replacement for the fp16 models.
Check the VAE; something is not compatible with the rest.
https://pastebin.com/CdnasgSc
You can use my workflow. I'm using it with lightx2v. It has links for everything except the lightx2v LoRA (you can find that on civit).
@vAnN47 https://imgur.com/a/z53qObB
Well, I tried it, and it worked for me out of the box. However, I see you're using both the Resize Image and Clip Vision nodes. You don't need both; just use Clip Vision. The Resize Image node can be problematic, mainly because of how it keeps proportions. The most bug-free mode is crop, as the others sometimes produce sizes that the model doesn't support. I removed the resize step and just used Load Image, which worked perfectly. Check whether that's the reason you're getting that error.
@vAnN47 Oh sorry, I thought I was responding to batart's reply. My bad.
@vAnN47 It uses the fp16 model. That would not fit in my 16GB of VRAM.
@batart Who told you that? With 16GB VRAM you can run full wan2.2 in fp16, and if you have 64GB RAM you can add the full fp16 text encoder too. Is your GPU a Ti?
@sweetmax797 RTX 4070 Ti Super, 16GB VRAM, and 32GB RAM.
@sweetmax797 @vAnN47 Thanks, it worked, even with the FP8 diffusion model. The problem was with the (first) text encoder model. Once I set everything (except the diffusion model) exactly as it was in @vAnN47's workflow, it started working.
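The resize issue raised in the comments above (proportion-keeping resizes that land on sizes the model does not support) can be illustrated with a minimal sketch. It assumes the model wants both sides to be a multiple of 16; that value is a guess for illustration and is not stated anywhere on this page. The helper names `snap_size` and `center_crop_box` are hypothetical.

```python
def snap_size(width: int, height: int, multiple: int = 16) -> tuple[int, int]:
    """Round each side down to the nearest multiple of `multiple`."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    if width < multiple or height < multiple:
        raise ValueError("image is smaller than one multiple on some side")
    return (width // multiple) * multiple, (height // multiple) * multiple


def center_crop_box(width: int, height: int, multiple: int = 16) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a centered crop to the snapped size."""
    new_w, new_h = snap_size(width, height, multiple)
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return left, top, left + new_w, top + new_h


if __name__ == "__main__":
    # A 1283x721 image is not divisible by 16 on either side.
    print(snap_size(1283, 721))
    print(center_crop_box(1283, 721))
```

Cropping instead of stretching keeps the picture undistorted while still landing on a size that is a clean multiple, which matches the observation that crop mode is the least error-prone.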