Fast & Low VRAM Flux.1D [4Gb tested, checkpoint] - Based on DevMode

NSFW

This is a full checkpoint with 8-bit precision (FP8) based on Flux Fusion, which gives the best results for me (there are also variations with some other models). I made the checkpoint just for my own convenience. The UNet, CLIPs and VAE merged into the checkpoint are described in this article.

4 GB of VRAM and 32 GB of RAM are enough to use the checkpoint. A simple ComfyUI test workflow (.json file) is attached here as "Training Data" in a .zip archive. Generating a 1024x1440 image takes ~80 seconds on my notebook with a 4 GB 3050 and an i5-11400.
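The VRAM and RAM figures above can be put in context with a rough estimate of how much memory the weights alone need at different precisions. This is only a back-of-the-envelope sketch: the figure of about 12 billion parameters for the Flux.1 Dev transformer is an assumption not stated on this page, the helper names are hypothetical, and text encoders, the VAE, activations, and quantization overhead are ignored.

```python
# Rough weight-memory estimate for a diffusion transformer.
# ASSUMPTION: ~12 billion parameters (not stated on this page).
# Text encoders, VAE, activations and quantization metadata are NOT counted.
FLUX_PARAMS_BILLION = 12


def weight_gib(params_billion, bits_per_weight):
    """Return the size of the weights in GiB at the given precision."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def needs_offload(params_billion, bits_per_weight, vram_gib):
    """True if the weights alone do not fit in the given amount of VRAM."""
    return weight_gib(params_billion, bits_per_weight) > vram_gib


if __name__ == "__main__":
    for name, bits in [("FP16", 16), ("FP8", 8), ("4-bit", 4)]:
        size = weight_gib(FLUX_PARAMS_BILLION, bits)
        offload = needs_offload(FLUX_PARAMS_BILLION, bits, vram_gib=4)
        print(f"{name}: ~{size:.1f} GiB, offload needed on 4 GiB VRAM: {offload}")
```

Under these assumptions the FP8 weights alone are far larger than 4 GB, which is why a setup like this depends on system RAM and on software that offloads weights to it.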

8 steps are enough to generate sharp hi-res images.

Just download and run it.

Enjoy.

Description

All-in-one checkpoint with DevMode v.0.3 inside.
Generally SFW

Comments (11)

littlefluffyball · Jun 5, 2025 · 1 reaction

SVDQuant 4-bit Flux.Dev generates in 45 sec on a 3060, with image quality the same as your examples.
I can't understand why you don't use a 4-bit converted version on your low-end GPU. Maybe you don't know it exists?

And I don't believe that 8 steps will do any good, unless it's plastic-looking Schnell, which is better avoided completely.

mistporyvaev (Author) · Jun 5, 2025 · 1 reaction

I don't understand why I should use a 4-bit model when the twice-as-precise 8-bit model works fine for me.

Of course I know about low-precision quantized GGUF models, and I've tried them.

crafted101 · Jun 11, 2025 · 1 reaction

I am using SD Forge. I have a GeForce 3060 with 12 GB VRAM and 32 GB RAM. I did 20 steps: it starts to load, the image gets to the end, and then it crashes. I get error messages and the image is lost. What am I doing wrong?

mistporyvaev (Author) · Jun 11, 2025

Sorry, but I've never used Forge, so I can't say exactly what your problem is. However, in my experience with A1111 and SD.Next, this kind of problem is caused by a general lack of optimization in the software. The last stage of generation, when the VAE runs (latent-to-image conversion), is especially critical: this is where crashes and complete loss of results occur because of non-optimal memory use. My config: mobile RTX 3050 with 4 GB VRAM and 32 GB RAM. The only software on which I can generate both images with Flux and video with Wan on this config is ComfyUI with custom nodes that correctly offload model weights to RAM. For example, without the Tiled VAE node, I would hardly be able to generate even one image.
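To illustrate why tiling helps at the VAE stage, here is a minimal sketch of the general idea: the latent is decoded in small tiles and the pixel patches are stitched together, so only one tile's worth of decoder work is in flight at a time. This is not ComfyUI's actual Tiled VAE node. `fake_decode` is a hypothetical stand-in for a real decoder, the 8x upscale factor is an assumption, and real implementations typically overlap and blend tiles to hide seams, which this sketch skips.

```python
SCALE = 8  # ASSUMPTION: each latent cell maps to an 8x8 pixel block


def fake_decode(latent):
    """Hypothetical stand-in for a VAE decoder: nearest-neighbour 8x upsample."""
    out = []
    for row in latent:
        wide = [value for value in row for _ in range(SCALE)]
        out.extend(list(wide) for _ in range(SCALE))
    return out


def tiled_decode(latent, tile=32, decode=fake_decode):
    """Decode a 2D latent tile by tile and stitch the pixel patches together."""
    height, width = len(latent), len(latent[0])
    image = [[0] * (width * SCALE) for _ in range(height * SCALE)]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Cut out one tile of the latent (edge tiles may be smaller).
            patch = [row[left:left + tile] for row in latent[top:top + tile]]
            pixels = decode(patch)  # only this tile is decoded at once
            for dy, pixel_row in enumerate(pixels):
                y = top * SCALE + dy
                x = left * SCALE
                image[y][x:x + len(pixel_row)] = pixel_row
    return image


if __name__ == "__main__":
    latent = [[(y * 7 + x) % 5 for x in range(10)] for y in range(6)]
    full = fake_decode(latent)
    tiled = tiled_decode(latent, tile=4)
    print("tiled result matches full decode:", tiled == full)
```

With the stand-in decoder the tiled and full results are identical; with a real VAE the tiles influence each other near the borders, which is why overlap and blending are normally needed.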

imsostupidatu102 · Jun 16, 2025 · 1 reaction

Thank you. I'm using this model with a 4 GB VRAM RTX 2050 and 16 GB RAM. It takes about 120-150 sec, with amazing image quality compared to the Q4_K_S version.

StreamTabulous · Aug 25, 2025 · 1 reaction

Painfully slow at recommended settings. At the end of the day, models are compressed, so 15 GB is way too big: even with 12 GB of VRAM it is still going to load into system RAM, so I get the same times as yours because of that. A 6 GB model would be nice.

4 steps take 15 seconds; add a LoRA and it crawls due to RAM usage. Nice model, but not low VRAM.

mistporyvaev (Author) · Aug 25, 2025

Just use any GGUF UNet, not this checkpoint.

mistporyvaev (Author) · Aug 25, 2025 · 1 reaction

Look for a GGUF that will fit your VRAM here: https://civitai.com/models/630820/flux-fusion-v2-4-steps-gguf-nf4-fp8fp16

goldennyks76 · Sep 22, 2025 · 2 reactions

I have a 3080 10GB graphics card and 32GB of RAM.