This page contains scaled fp8 quantized DiT models of Neta Lumina for ComfyUI:
Neta Lumina (NT)
NetaYume Lumina (NTYM)
Also included: a scaled fp8 quantized Gemma 2 2B (the text encoder).
All credit belongs to the original model authors; the license is the same as that of the original models.
Note: images from the bf16 and fp8 models are identical. If the image from the fp8 model changes drastically, your ComfyUI has somehow enabled fp16 mode. Lumina 2 does not support fp16, and you will get deformed images.
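For the curious, here is a minimal sketch of the likely failure mode (an assumption on my part, not something stated by the model authors): fp16's numeric range tops out near 65504, while bf16 keeps fp32's full exponent range, so large activations overflow to inf in fp16 and corrupt everything downstream:

```python
import torch

# Hypothetical activation magnitude, chosen to sit above fp16's max (~65504)
# but well within bf16's range (~3.4e38).
x = torch.tensor([70000.0])      # fp32 reference value
print(x.to(torch.bfloat16))      # finite: bf16 shares fp32's exponent range
print(x.to(torch.float16))       # inf: overflows, deforming the image downstream
```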
Update (11/27/2025): mixed precision and fp8 tensor core support (mptc).
This is a new ComfyUI feature that runs calculations on fp8 tensor cores and combines scaled fp8 with mixed precision.
In short:
Mixed precision: keep important layers in BF16 (a sketch of this idea follows the list).
FP8 tensor core support: on supported GPUs, much faster (30~80%) than BF16 and classic scaled FP8 models, because ComfyUI does the calculations directly in FP8 instead of dequantizing to BF16 first. torch.compile is recommended. A sketch of the FP8 matmul follows the link below.
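A minimal sketch of the mixed-precision idea, not ComfyUI's actual implementation; the layer-name patterns and the ".scale_weight" key are assumptions for illustration only:

```python
import torch

# Hypothetical name patterns for layers kept in BF16; real mptc checkpoints
# may use a different selection and key layout.
KEEP_BF16 = ("norm", "final_layer")

def quantize_state_dict(sd: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
    out = {}
    for name, w in sd.items():
        # Keep important / non-matmul weights in BF16.
        if w.ndim != 2 or any(p in name for p in KEEP_BF16):
            out[name] = w.to(torch.bfloat16)
        else:
            # Scaled FP8: store the FP8 weight plus a per-tensor scale.
            scale = (w.abs().amax().float()
                     / torch.finfo(torch.float8_e4m3fn).max).clamp_(min=1e-12)
            out[name] = (w / scale).to(torch.float8_e4m3fn)
            out[name + ".scale_weight"] = scale  # hypothetical key name
    return out
```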
More info: https://civarchive.com/models/2172944/z-image-turbo-tensorcorefp8
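To make "calculate directly in FP8" concrete, here is a hedged sketch using PyTorch's private torch._scaled_mm (signature as in recent PyTorch releases; this API has changed over time, and ComfyUI's internals may differ). It requires a GPU with FP8 tensor cores (Ada / Hopper or newer):

```python
import torch

# Toy activation and weight; dims must be multiples of 16 for _scaled_mm.
x = torch.randn(16, 64, device="cuda", dtype=torch.bfloat16)
w = torch.randn(128, 64, device="cuda", dtype=torch.bfloat16)

# Per-tensor scales, as in "scaled fp8" checkpoints (simplified).
fmax = torch.finfo(torch.float8_e4m3fn).max
x_scale = x.abs().amax().float() / fmax
w_scale = w.abs().amax().float() / fmax
x_fp8 = (x / x_scale).to(torch.float8_e4m3fn)
w_fp8 = (w / w_scale).to(torch.float8_e4m3fn)

# Classic scaled-fp8 path: dequantize, then a BF16 matmul.
y_classic = (x_fp8.to(torch.bfloat16) * x_scale) @ (w_fp8.to(torch.bfloat16) * w_scale).t()

# FP8 tensor core path: the matmul itself runs in FP8.
y_fp8 = torch._scaled_mm(
    x_fp8, w_fp8.t(),          # second operand must be column-major
    scale_a=x_scale, scale_b=w_scale,
    out_dtype=torch.bfloat16,
)
print((y_classic - y_fp8).abs().max())  # small; both approximate the BF16 result
```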
Comments (2)
good, mptc works, estimated ~60% faster than bf16 on rtx 4xxx cards. so, z-image next.
fyi this is literally the second mptc model; the first is flux.2 in the ComfyUI official hf repo
Thank you for your work on Lumina 2!
This might be of interest to you: Lumina-DiMOO, a multi-modal generation model from the same team.
The official stance is "we won't quantize it because it hurts quality". But, come on...
