These are just quantized (Q8_0, Q5_0) versions of the Fluxmania III model, which excels at artistic photography. All credit goes to the original author; I'll be happy to delete this post once (if) official GGUF versions are published.
There is (obviously) some quality loss, so if your hardware allows it, use fp8/fp16 versions.
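To illustrate where that quality loss comes from: Q8_0 stores weights in blocks of 32 int8 values that share a single scale, so each block's values are rounded to 255 levels. This is a toy round-trip sketch of that scheme, not the actual GGUF encoder (the real format also packs a fp16 scale per block):

```python
import numpy as np

def q8_0_roundtrip(weights: np.ndarray) -> np.ndarray:
    """Quantize then dequantize a 1-D float array with Q8_0-style blocks:
    each group of 32 values shares one scale and is stored as int8."""
    out = np.empty_like(weights, dtype=np.float32)
    for start in range(0, len(weights), 32):
        block = weights[start:start + 32].astype(np.float32)
        peak = np.abs(block).max()
        scale = peak / 127.0 if peak > 0 else 1.0
        q = np.clip(np.round(block / scale), -127, 127).astype(np.int8)
        out[start:start + 32] = q.astype(np.float32) * scale
    return out

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=1024).astype(np.float32)
w_hat = q8_0_roundtrip(w)
print(f"max round-trip error: {np.abs(w - w_hat).max():.6f}")
```

The per-block error is bounded by half the block scale, which is why Q8_0 tracks the full-precision weights so closely in practice.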
Q5_0: VAE and clip are included.
Q8_0: Base model only, use Flux VAE and clip.
Tested in SwarmUI (just set Architecture in the model metadata to Flux.1 DEV) and in ComfyUI with 12GB VRAM. Let me know if I should add a Q4_0 for 8GB cards.
Samples were created with the DPM++ 2M sampler and the beta or normal scheduler.
Sorry in advance for any shortcomings, this is my first share here.
Description
VAE and Clip included.
Comments
Thanks!
Can you also make a Q8_0 version with just the model weights?
I can run Q8_0 models with 8GB VRAM in ComfyUI.
It would only be slightly larger, at about 12.7 GB, and the quality would be closer to the full-size version.
I second this. I only use Q8 with GGUF models; it holds up better than most FP8 versions and is much closer to FP16. No issues running Q8s on a mobile 3080.
I'm on it. Would you mind sharing the workflow?
It's done. I'm very happy with the results. It runs fine (for Flux ;) ) with 12GB VRAM.
@higegojira You're awesome. Thank you!
Hi, I don't intend to make GGUF versions of my models; I find there is an appreciable loss of quality, so I limited myself to the full fp8 version. So I appreciate your work, which complements the range and meets the demand of those who ask for GGUF versions. Nice work.
I will put a link to your model on the Fluxmania page
Have you also tested Q8_0 versions of your model?
I have tested all GGUF versions of the Flux dev model.
https://huggingface.co/city96/FLUX.1-dev-gguf/tree/main
The Q8_0 version was much closer in quality to the full-size model than FP8, while being only slightly larger.
Not yet.
@Adel_AI I'm actually ashamed that I didn't ask you before publishing. I was so happy that it worked and that the outputs were usable that I acted on impulse. I'm sorry about that.
Loss of quality is real indeed, but renders still seem to outshine almost all the merges and finetunes I've tested.
If anything you should do Q8_0 instead of FP8. The Q8 quant is better quality, and less than 1 GB bigger.
Very good! Love it!