This is a full FP8 (8-bit precision) checkpoint based on Flux Fusion (which gives the best results for me; there are also variations with some other models). I made the checkpoint just for my own convenience. The UNet, CLIPs and VAE merged into the checkpoint are described in this article.
4 GB of VRAM and 32 GB of RAM are enough to use the checkpoint. A simple ComfyUI test workflow (.json file) is attached here as "Training Data" in a .zip archive. Generating a 1024x1440 image takes ~80 seconds on my notebook with a 4 GB RTX 3050 and an i5-11400.
8 steps are enough to generate hi-res sharp images.
Just download and run it.
Enjoy.
Description
All-in-one checkpoint with DevMode v.0.3 inside.
Generally SFW
Comments (11)
SVDQuant 4-bit Flux.Dev generates in 45 sec on a 3060, with the same image quality as your examples.
I can't understand why you don't use a 4-bit converted version on your low-end GPU. Maybe you don't know it exists?
And I don't believe that 8 steps will do any good, unless it is the plastic-looking Schnell, which is better avoided completely.
I don't understand why I should use a 4-bit model when an 8-bit model, with twice the bits per weight, works fine for me.
Of course I know about low-precision quantized GGUF models, and I have tried them.
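To put the 4-bit versus 8-bit exchange in numbers, here is a rough sketch of how the size of the weights scales with bits per weight. The 12-billion parameter count is an assumption taken from the public description of Flux.1 Dev, not from this page, and real quantized files are somewhat larger because they also store scales and other metadata.

```python
def weight_gib(params: float, bits_per_weight: int) -> float:
    """Approximate size of the raw weights in GiB (ignores quantization metadata)."""
    return params * bits_per_weight / 8 / 2**30


# Assumption: ~12 billion parameters for the Flux.1 Dev transformer alone;
# the text encoders and VAE bundled into a full checkpoint come on top of this.
FLUX_DEV_PARAMS = 12e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: ~{weight_gib(FLUX_DEV_PARAMS, bits):.1f} GiB")
```

Halving the bit width halves the weight size, which is the trade-off being argued here: a 4-bit model fits in less memory, while an 8-bit model keeps more precision per weight.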
I am using SD Forge. I have a GeForce 3060 with 12 GB VRAM and 32 GB RAM. I ran 20 steps: the image starts to load and gets to the end, then it crashes. I get error messages and the image is lost. What am I doing wrong?
Sorry, but I've never used Forge, so I can't say exactly what your problem is. However, my experience with A1111 and SD.Next showed that this kind of problem is caused by a general lack of optimization in that software. The last stage of generation, when the VAE runs (latent-to-image conversion), is especially critical: this is the stage where crashes and complete loss of results occur because of inefficient memory use. My config: mobile RTX 3050 with 4 GB VRAM and 32 GB RAM. The only software with which I can generate both images with Flux and video with Wan on this config is ComfyUI, with custom nodes that correctly offload model weights to RAM. For example, without the Tiled VAE node, I would hardly be able to generate even a single image.
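The idea behind tiled VAE decoding can be sketched in a few lines: instead of decoding the whole latent at once, decode small tiles one by one and paste each result into the output, so peak memory depends on the tile size rather than the image size. This is only an illustration, not the code of the Tiled VAE node: `toy_decode` is a hypothetical stand-in for a real decoder (plain nearest-neighbour upscaling), and real implementations usually overlap and blend tiles to hide seams.

```python
from typing import Callable, List

Grid = List[List[float]]


def toy_decode(latent: Grid, scale: int = 2) -> Grid:
    """Hypothetical stand-in for a VAE decoder: nearest-neighbour upscaling."""
    out: Grid = []
    for row in latent:
        wide = [v for v in row for _ in range(scale)]
        out.extend(list(wide) for _ in range(scale))
    return out


def tiled_decode(latent: Grid, tile: int,
                 decode: Callable[[Grid, int], Grid] = toy_decode,
                 scale: int = 2) -> Grid:
    """Decode the latent tile by tile (no overlap) and stitch the results."""
    h, w = len(latent), len(latent[0])
    image: Grid = [[0.0] * (w * scale) for _ in range(h * scale)]
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            # Only this small patch is passed to the decoder at any one time.
            patch = [row[left:left + tile] for row in latent[top:top + tile]]
            decoded = decode(patch, scale)
            for dy, drow in enumerate(decoded):
                x0 = left * scale
                image[top * scale + dy][x0:x0 + len(drow)] = drow
    return image


latent = [[float(y * 7 + x) for x in range(7)] for y in range(5)]
full = toy_decode(latent)
tiled = tiled_decode(latent, tile=2)
print(tiled == full)
```

With this toy decoder every output pixel depends on one latent value only, so the tiled and full results match exactly. A real VAE decoder mixes neighbouring values, which is why practical tiling needs overlap.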
Thank you. I'm using this model with a 4 GB VRAM RTX 2050 and 16 GB RAM. It takes about 120-150 sec, with amazing image quality compared to the Q4_K_S version.
Painfully slow at the recommended settings. At the end of the day, models are compressed, so 15 GB is way too big: even on 12 GB of VRAM it is still going to load into system RAM, so my times are the same as yours because of that. A 6 GB model would be nice.
4 steps take 15 seconds; add a LoRA and it crawls due to RAM usage. Nice model, but not low-VRAM.
Just use any GGUF UNet instead of this checkpoint.
Look for a GGUF that will fit your VRAM here: https://civitai.com/models/630820/flux-fusion-v2-4-steps-gguf-nf4-fp8fp16
I have a 3080 10GB graphics card and 32GB of RAM.
Generation at 1440x2400 resolution takes 60 seconds on average, while at 1024x1536 it takes 30 seconds on average.
(I haven't changed any settings other than the resolution; using the provided workflow.)
Is it okay for your config?
@mistporyvaev Yes, my config works fine with this model.
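The timings reported in the description and in this thread are easier to compare when normalized to megapixels per second. This is a back-of-the-envelope sketch using only the figures quoted on this page. The labels are shorthand, and the numbers say nothing about step counts or other settings, which may differ between reports.

```python
def megapixels_per_second(width: int, height: int, seconds: float) -> float:
    """Throughput of a single generation in megapixels per second."""
    return width * height / 1e6 / seconds


# (label, width, height, seconds) as reported on this page.
reports = [
    ("RTX 3050 4 GB notebook", 1024, 1440, 80),
    ("3080 10 GB", 1440, 2400, 60),
    ("3080 10 GB", 1024, 1536, 30),
]

for label, w, h, secs in reports:
    print(f"{label}: {w}x{h} in {secs}s -> "
          f"{megapixels_per_second(w, h, secs):.4f} MP/s")
```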