[Note: Unzip the download to get the GGUF file. Civitai doesn't support the GGUF format natively, hence this workaround.]
A merge of Flux.1 Dev with ByteDance's 8-step Hyper-SD LoRA, converted to GGUF. The result is an ultra memory-efficient and fast Dev (CFG-sensitive) model that generates fully denoised images in just 8 steps while consuming ~6.2 GB of VRAM (for the Q4_0 quant).
It can be used in ComfyUI with a GGUF loader custom node or with Forge UI. See https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1050 to learn more about Forge UI's GGUF support and where to download the VAE, clip_l, and t5xxl models.
Advantages Over FastFlux and Other Dev-Schnell Merges
Much better quality: at 8 steps you get noticeably better quality and expressiveness than with Schnell-based models like FastFlux.
CFG/Guidance sensitivity: since this is a Dev model, unlike the hybrid merges you retain full (distilled) CFG sensitivity - i.e., you can trade prompt adherence against creativity, and softness against saturation.
LoRA compatibility: fully compatible with Dev LoRAs, better than what Schnell models offer.
The only disadvantage: it needs 8 steps for best quality. But then, you'd probably run at least 8 steps with Schnell for best results anyway. (An 8-step run is sketched below.)
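If you prefer scripting over a UI, here's a minimal sketch of an 8-step run using diffusers' GGUF loading support. This is an assumption-laden example, not this model's official usage: it assumes a recent diffusers build, and the GGUF filename, prompt, and guidance value are placeholders.

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# Load the quantized transformer from the GGUF file (filename is hypothetical).
transformer = FluxTransformer2DModel.from_single_file(
    "flux1-dev-hyper8-Q4_0.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)

# The VAE and the clip_l/t5xxl text encoders come from the base Flux.1 Dev repo.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # offload idle submodules to save VRAM

image = pipe(
    "a cinematic photo of a lighthouse at dusk",  # placeholder prompt
    num_inference_steps=8,   # the merged Hyper-SD LoRA targets 8 steps
    guidance_scale=3.5,      # distilled guidance: tune adherence vs. creativity
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]
image.save("out.png")
```

The guidance_scale knob here is the distilled CFG sensitivity mentioned in the advantages list above.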
Which model should I download?
[Current situation: using the updated Forge UI and ComfyUI (GGUF node), I can run Q8_0 on my 11 GB 1080 Ti.]
Download the one that fits in your VRAM. The additional inference cost is quite small if the model fits in the GPU. Size order is Q4_0 < Q4_1 < Q5_0 < Q5_1 < Q8_0.
Q4_0 and Q4_1 should fit in 8 GB VRAM
Q5_0 and Q5_1 should fit in 11 GB VRAM
Q8_0 if you have more!
Note: With CPU offloading, you will be able to run a model even if it doesn't fit in your VRAM.
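As a rough pre-flight check, you can compare the GGUF file size against your free VRAM. A minimal sketch - the filename and the ~20% headroom for activations and the other model components are assumptions:

```python
import os
import torch

path = "flux1-dev-hyper8-Q5_0.gguf"  # hypothetical filename
model_bytes = os.path.getsize(path)

# Free/total memory on the current CUDA device, in bytes.
free_bytes, total_bytes = torch.cuda.mem_get_info()
headroom = 1.2  # assumed ~20% margin for activations, VAE, text encoders

gib = 1024 ** 3
print(f"model: {model_bytes / gib:.1f} GiB, free VRAM: {free_bytes / gib:.1f} GiB")
if model_bytes * headroom < free_bytes:
    print("should fit entirely on the GPU")
else:
    print("will likely need CPU offloading (slower, but it still runs)")
```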
All the license terms associated with Flux.1 Dev apply.
PS: Credit goes to ByteDance for the Hyper-SD Flux 8-step LoRA, which can be found at https://huggingface.co/ByteDance/Hyper-SD/tree/main
Comments
Does the choice of Sampling Method matter?
You mean sampler? No, any sampler that works with the original model should work with the GGUF.
How do I run it in SwarmUI? It gives me this error:
[BackendHandler] backend #0 failed to load model with error: Model loader for flux-hyp16-Q8_0.gguf didn't work - are you sure it has an architecture ID set properly?
Backend request #1 failed: All available backends failed to load the model.
Sorry, I'm not familiar with Swarm, but as far as I understand it's Comfy under the hood... so you probably just need the GGUF node and a different default workflow. Maybe someone from the community can help you.
The .gguf file needs to go in a specific "unet" folder. Correct path: SwarmUI\Models\unet
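If you want to inspect what metadata a GGUF actually carries (including the architecture field that SwarmUI complains about above), here's a small sketch with the gguf Python package - the package is real, and the filename is taken from the error message above:

```python
from gguf import GGUFReader  # pip install gguf

reader = GGUFReader("flux-hyp16-Q8_0.gguf")  # filename from the SwarmUI error above

# Print all key/value metadata fields; look for "general.architecture".
for key in reader.fields:
    print(key)

# Peek at the first few tensors and their quantization types.
for t in reader.tensors[:5]:
    print(t.name, t.tensor_type, t.shape)
```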
The K(_M) quants appear not to work on Apple Silicon (M1, M2, etc.); the HyperFlux 8-steps model works fine.
Q5 doesn't work on 3060 12GB.
Q4 is amazingly fast.
aye my 3060 bro wasup xd
I have downloaded this for ComfyUI. I have seen that the LoRA's weight needs to be set to 0.125, but since I have no separate LoRA, how can I set the weight for the GGUF version? I'm on Apple Silicon, loading with Unet Loader (GGUF).
You don't need to set any weight - just use it like the standalone Flux model; it will produce good results in just 8 steps.
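For context, the 0.125 LoRA weight is already baked into this merge. Conceptually, folding a LoRA into a base weight looks like the sketch below - generic LoRA math as an illustration, not this model's actual merge script:

```python
import torch

def merge_lora(base_weight: torch.Tensor,
               lora_down: torch.Tensor,
               lora_up: torch.Tensor,
               scale: float = 0.125) -> torch.Tensor:
    """Fold a LoRA into a base weight: W' = W + scale * (up @ down)."""
    return base_weight + scale * (lora_up @ lora_down)

# Toy shapes: a 64x64 layer with rank-8 LoRA factors.
w = torch.randn(64, 64)
down = torch.randn(8, 64)   # lora_down: (rank, in_features)
up = torch.randn(64, 8)     # lora_up:   (out_features, rank)
w_merged = merge_lora(w, down, up)  # once merged, there is no runtime LoRA weight to set
```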
Is there any difference in image quality between Q4_K_S and Q4_K_M, or between Q4_1 and Q4_0? Or is it just about file size and VRAM use? Thank you.
Bigger models usually give better quality.
In terms of quality/speed and LoRA support, in my opinion this is the best Flux model by far!! The 8-step Hyper Q4_1 model is amazing! Great work, buddy!
Is there any ControlNet Union model compatible with this checkpoint?
The original Union ControlNets should work with this just fine.
It's saying to stay within the 77-token boundary for the prompt. Does this need to be followed?
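For what it's worth, that 77-token limit comes from the CLIP-L text encoder (t5xxl accepts longer prompts, so only the CLIP branch truncates). A quick sketch to count CLIP tokens, assuming the standard clip-vit-large-patch14 tokenizer that Flux's clip_l corresponds to:

```python
from transformers import CLIPTokenizer

tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
prompt = "a cinematic photo of a lighthouse at dusk"  # placeholder prompt
n_tokens = len(tok(prompt).input_ids)  # count includes BOS/EOS special tokens
print(f"{n_tokens} CLIP tokens (limit 77; anything past it gets truncated)")
```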