I compiled a little collection of Flux.1 models. There are fp8 models with an fp8 T5 and fp8 models with an fp16 T5, for both Dev and Schnell, packaged as single files for use with the regular checkpoint loader. There are also fp16 models available now. All models have CLIP, T5, and VAE baked in. THESE ARE ALL STOCK FLUX.1.
For Flux Kontext, see here: https://civarchive.com/articles/16348/flux1-kontext-dev-quantized-models-available
These all use bf16 upcasting; use the appropriate launch flags if you are running them on GTX cards for some reason.
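If you are not sure whether your card handles bf16 natively, a quick PyTorch check (my addition, not part of the original post) looks like this:

```python
import torch

# bf16 is supported natively on Ampere (RTX 30xx) and newer; GTX-era cards
# will report False here and need fp16/fp32 fallback flags in your loader.
if torch.cuda.is_available():
    name = torch.cuda.get_device_name(0)
    if torch.cuda.is_bf16_supported():
        print(f"{name}: bf16 supported, no extra flags needed")
    else:
        print(f"{name}: no native bf16, force fp16/fp32 in your loader")
else:
    print("no CUDA device found")
```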
Unified single-file versions of Flux.1 for ComfyUI. All files have a baked-in VAE and CLIP-L included:
flux.1_dev_8x8_e4m3fn-marduk191.safetensors is Flux.1 Dev quantized to 8-bit with an 8-bit T5 XXL encoder included.
flux.1_dev_fp8_fp16t5-marduk191.safetensors is Flux.1 Dev quantized to 8-bit with a 16-bit T5 XXL encoder included.
flux.1_schnell_8x8_e4m3fn-marduk191.safetensors is Flux.1 Schnell quantized to 8-bit with an 8-bit T5 XXL encoder included.
flux.1_schnell_fp8_fp16t5-marduk191.safetensors is Flux.1 Schnell quantized to 8-bit with a 16-bit T5 XXL encoder included.
flux.1_dev_16x16-marduk191.safetensors is Flux.1 Dev at 16-bit with a 16-bit T5 XXL encoder included.
flux.1_schnell_16x16-marduk191.safetensors is Flux.1 Schnell at 16-bit with a 16-bit T5 XXL encoder included.
flux.1_dev_8x8_scaled-marduk191.safetensors is Flux.1 Dev quantized to 8-bit with scaled stochastic weights and normalized outlying alphas. It includes an 8-bit scaled stochastic T5 XXL encoder (tag limited to avoid loss).
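If you want to check what is baked into one of these files before loading it, you can list the tensor key prefixes with the safetensors library. A minimal sketch (my example; the path is just one of the files above, and exact prefix names vary by packer, so treat the output as informational):

```python
from collections import Counter
from safetensors import safe_open

# Any of the unified single-file checkpoints listed above.
path = "flux.1_dev_fp8_fp16t5-marduk191.safetensors"

with safe_open(path, framework="pt") as f:
    # Group keys by top-level prefix to see which components are present
    # (diffusion model, text encoders, VAE).
    prefixes = Counter(key.split(".")[0] for key in f.keys())

for prefix, count in sorted(prefixes.items()):
    print(f"{prefix}: {count} tensors")
```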
Workflow examples are available here: SOON
Repository is here: https://huggingface.co/marduk191/Flux.1_collection/tree/main
Discord: https://discord.gg/s3kj9VqpKc
Tips welcome: https://ko-fi.com/marduk191
Description
Flux.1 Dev quantized to 8-bit with a 16-bit T5 XXL encoder included.
Comments (7)
thank you. t5xxl fp16 ftw.
btw do u do merges as well?
clip merges are completely lame and i prolly will not. it's a waste of time.
@marduk191 interesting. so merges will not get u "better quality" than schnell at 4 steps?
found this comment by this guy who does "salto" and "bitte" merges https://huggingface.co/silveroxides/flux1-nf4-weights/discussions/1#66b926b61caaa1d77c05b7d8 but no idea what "double blocks" or "guidance keys" means. so....
Thanks
can someone help me understand the difference between the variants here, maybe ranked by how much GPU strain each causes, if possible?
model type: Schnell or Dev
model precision: fp8 or fp16
T5 precision: fp8 or fp16
for flux the pipeline is something like this: [text] -> CLIP-L + T5 encoders -> [text embeddings] -> flux model -> [diffused latents] -> VAE decoder -> image
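As a concrete illustration of those stages, here is a rough sketch using the diffusers FluxPipeline (my own example, not from this thread; it pulls the stock Schnell weights from Hugging Face rather than the single-file checkpoints above):

```python
import torch
from diffusers import FluxPipeline

# Stock Flux.1 Schnell; the pipeline bundles the CLIP-L and T5-XXL text
# encoders, the flux transformer, and the VAE from the stages above.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # offload idle components to save VRAM

# text -> encoders -> flux transformer (denoising) -> VAE decode -> image
image = pipe(
    "a photo of a red fox in the snow",
    num_inference_steps=4,  # Schnell is distilled for ~4 steps
    guidance_scale=0.0,     # Schnell ignores classifier-free guidance
).images[0]
image.save("fox.png")
```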
in general:
- the higher the precision of the Dev/Schnell model, the better the image, but more memory is needed (and likely more time)
- the higher the precision of the T5 encoder, the better it 'understands', or 'follows', your prompt
I can't confirm the naming scheme OP has used on the models posted here, I'm not sure myself tbh.
I'm not 100% on this stuff in general, so I might be wrong with the above.
lowest GPU strain: Schnell 8x8
highest GPU strain: Dev 16x16
Descriptions are below the model to explain the straightforward naming scheme. I added a scaled version that I would recommend for anyone with 10 GB or more nowadays. GPU strain really has nothing to do with it: lower precision means lower quality but a smaller file size, and file size determines how much VRAM you need before generation slows down because of offloading.
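A rough way to apply that rule of thumb (my sketch, not the author's; it just compares checkpoint size against free VRAM and ignores activation overhead):

```python
import os
import torch

# Whichever unified checkpoint you downloaded.
path = "flux.1_dev_fp8_fp16t5-marduk191.safetensors"

file_gb = os.path.getsize(path) / 1024**3
free_bytes, total_bytes = torch.cuda.mem_get_info()  # (free, total) in bytes
free_gb = free_bytes / 1024**3

print(f"checkpoint: {file_gb:.1f} GB, free VRAM: {free_gb:.1f} GB")
if file_gb > free_gb:
    # The weights alone exceed free VRAM, so layers will be offloaded
    # to system RAM and generation will slow down.
    print("expect offloading and slower generation")
```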
