These models are the result of a partial NVFP4 quantization of Wan2.2-I2V-A14B-Moe-Distill-Lightx2v by lightx2v, produced using convert_to_quant by silveroxides. Some layers have been kept in their original BF16 format, while most of the others were quantized to MXFP8 or NVFP4.
Wan2.2-I2V-A14B-Moe-Distill-Lightx2v is an image-to-video generation model built on Wan2.2-I2V-A14B. It applies step distillation and a MoE architecture to reduce inference to 4 steps without CFG, cutting generation time substantially while preserving output quality.
IMPORTANT
Since NVFP4 is only supported on NVIDIA Blackwell architecture GPUs, running this model requires a Blackwell GPU and a torch build with Blackwell support enabled, along with a recent version of ComfyUI and comfy-kitchen built against CUDA 13.
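A quick way to check the GPU requirement before downloading is to look at the CUDA compute capability that torch reports. The snippet below is a minimal sketch, not part of ComfyUI or comfy-kitchen: the helper names are made up for illustration, and it assumes that Blackwell GPUs report a major compute capability of 10 or higher (12.0 on consumer RTX 50xx cards). It does not check the CUDA 13 or comfy-kitchen requirements.

```python
def supports_nvfp4(compute_capability):
    """Return True if a (major, minor) compute capability looks like Blackwell or newer.

    Assumption: Blackwell parts report a major version of 10 or higher
    (for example 12.0 on consumer RTX 50xx cards).
    """
    major, _minor = compute_capability
    return major >= 10


def detect_compute_capability():
    """Return the (major, minor) capability of GPU 0, or None if it cannot be read."""
    try:
        import torch  # optional: only needed for live detection
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    cap = detect_compute_capability()
    if cap is None:
        print("No CUDA device visible to torch")
    else:
        print(f"Compute capability {cap}: NVFP4 capable = {supports_nvfp4(cap)}")
```

If the check fails on a 50xx card, the usual suspect is a torch build without Blackwell support rather than the GPU itself.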
Comments (4)
ELI5: what is this good for?
It's good for people with limited hardware, such as an entry-level 50xx series GPU. Since this model uses the MXFP8 and NVFP4 formats, which Blackwell GPUs support natively, inference is faster than with other quants, such as GGUF.
My setup: 1 x 8 GB RTX 5060, 32 GB RAM. A 5-second video can be generated in about 90 seconds with these models. I'd dare say that it's possible to run these models with 16 GB RAM and an 8 GB RTX 50xx GPU, provided you have a fast NVMe disk with enough swap.
Also, the quants were not made by blindly converting all possible layers to NVFP4 (4-bit quantization), as is the case with other NVFP4 models hosted on Civitai. An analysis was run on the original models first to decide which layers were to be preserved as BF16 and which ones were to be quantized using MXFP8 (8 bits) or NVFP4.
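To give an idea of what such a per-layer analysis could look like, here is a rough sketch in plain Python. It is not the procedure convert_to_quant actually uses, which isn't described here. It rounds each layer's weights to the 4-bit E2M1 grid with one scale per block of 16 values, measures the relative error, and picks a format from that error. The thresholds, the layer names, and the function names are all hypothetical, and the block scale is kept in full precision for simplicity, whereas real NVFP4 stores it as an FP8 value.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float, the element type used by NVFP4.
FP4_E2M1_VALUES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def fake_quantize_fp4(weights, block_size=16):
    """Round weights to the FP4 grid using one scale per block, then dequantize.

    Simplification: the per-block scale stays in full precision here.
    """
    out = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        amax = max(abs(w) for w in block)
        # Map the largest magnitude in the block onto the largest FP4 value (6.0).
        scale = amax / FP4_E2M1_VALUES[-1] if amax > 0 else 1.0
        for w in block:
            nearest = min(FP4_E2M1_VALUES, key=lambda v: abs(abs(w) / scale - v))
            out.append(math.copysign(nearest * scale, w))
    return out


def relative_error(original, quantized):
    """Relative L2 error between a list of weights and its quantized copy."""
    num = sum((a - b) ** 2 for a, b in zip(original, quantized))
    den = sum(a * a for a in original)
    return math.sqrt(num / den) if den > 0 else 0.0


def pick_format(weights, fp4_limit=0.10, fp8_limit=0.15):
    """Choose a storage format from the simulated FP4 error.

    Both limits are hypothetical values chosen for illustration only.
    """
    err = relative_error(weights, fake_quantize_fp4(weights))
    if err <= fp4_limit:
        return "NVFP4"
    if err <= fp8_limit:
        return "MXFP8"
    return "BF16"


if __name__ == "__main__":
    # Hypothetical layer names and toy weights, one block of 16 values each.
    layers = {
        "layer_a": [6.0, 3.0, -1.5, 0.5] * 4,
        "layer_b": [6.0] + [3.5] * 15,
        "layer_c": [6.0] + [2.5] * 15,
    }
    for name, weights in layers.items():
        print(name, pick_format(weights))
```

A real analysis would more likely measure the error on layer outputs with sample inputs rather than on the weights alone, but the idea is the same: layers that tolerate 4-bit rounding get NVFP4, and more sensitive ones stay at higher precision.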
Do you have any comparison videos? I'd test it myself, but I'm away from my PC until next week. Curious how it would compare.
Unfortunately I lack the resources to perform comparisons with the original model. Are you referring to comparisons with other NVFP4 quants being published here on Civitai?