Ernie in different quants:
Mixed FP8 - Mostly fp8_e4m3, with some layers left unquantized. Fast.
Mixed NVFP4 - NVFP4 except for the final layers, for a higher-quality finish. Faster than FP8.
NVFP4 - Mostly NVFP4. Fastest.
Note: You will only see speedups from NVFP4 on Blackwell series NVIDIA cards.
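As a rough illustration of the note above, here is a minimal sketch for picking a variant from the GPU's CUDA compute capability. The function name `suggest_variant` is hypothetical, and the rule that Blackwell cards report a major version of 10 or higher is an assumption you should verify for your own card. Falling back to Mixed FP8 on older cards is also only a suggestion, based on NVFP4 giving no speedup there.

```python
def suggest_variant(cc_major: int) -> str:
    """Suggest a quant variant from the CUDA compute capability major version.

    Hypothetical helper, not part of ComfyUI or the model release.
    """
    # Assumption: Blackwell-series GPUs report a major version of 10 or higher.
    if cc_major >= 10:
        return "NVFP4"
    # Older cards see no NVFP4 speedup, so suggest the FP8 variant instead.
    return "Mixed FP8"
```

With PyTorch installed, the major version can be read with `torch.cuda.get_device_capability()`, which returns a `(major, minor)` tuple.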
https://ernie.baidu.com/blog/posts/ernie-image/
Model Storage Location
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └── ernie-image-turbo-nvfp4.safetensors
│ ├── 📂 text_encoders/
│ │ ├── ministral-3-3b.safetensors
│ │ └── ernie-image-prompt-enhancer.safetensors
│ └── 📂 vae/
│ └── flux2-vae.safetensors

Description
Keeps the more important layers in FP8 and quantizes the rest to NVFP4, giving a good mix of the two.
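The layout under Model Storage Location can be checked with a short script. This is a minimal sketch, assuming the four filenames shown in that tree and a ComfyUI root folder you pass in yourself. The helper name `missing_models` is hypothetical and not part of ComfyUI.

```python
from pathlib import Path

# Relative paths taken from the storage tree; adjust if you use a different variant.
EXPECTED_FILES = [
    "models/diffusion_models/ernie-image-turbo-nvfp4.safetensors",
    "models/text_encoders/ministral-3-3b.safetensors",
    "models/text_encoders/ernie-image-prompt-enhancer.safetensors",
    "models/vae/flux2-vae.safetensors",
]


def missing_models(comfy_root):
    """Return the expected model files that are not present under comfy_root."""
    root = Path(comfy_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

For example, `missing_models("ComfyUI")` returns an empty list when every file is in place, and otherwise lists the relative paths that still need to be downloaded.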




