This model was converted to bnb NF4 by combining Flux dev fp8 with the VAE, the T5-XXL fp8 text encoder, and CLIP-L. All included!
For those who don't mind the size, there is also a build blended from the full dev fp16 with the VAE, the T5-XXL fp16 text encoder, and CLIP-L, likewise converted to NF4. All included!
Many observations suggest that the FP8 base is a bit more accurate at understanding the prompt, and having assembled and tested this build, I came to the same conclusion.
This model is a bit more accurate than the familiar NF4 v2. I've done a lot of generation with complex prompts to confirm this, and I am completely satisfied with how well this model understands me. With simple prompts I didn't notice any difference.
The model works well with LoRAs in Forge, which is what I create in. Don't forget to set Diffusion in Low Bits to Automatic (fp16 LoRA).
In Forge I use the Euler or Flux Realistic sampler, Schedule type: Simple, CFG: 1, and 20-30 steps.
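For anyone scripting outside Forge, here is a minimal sketch of the same idea using diffusers with on-the-fly bnb NF4 quantization. This is an assumption-laden example, not this exact checkpoint: it quantizes the base Flux dev repo at load time, and the model id, prompt, and filename are placeholders.

```python
# Minimal sketch, not the author's Forge setup: load Flux dev in diffusers with
# on-the-fly bitsandbytes NF4 quantization. Assumes diffusers >= 0.31 with
# bitsandbytes installed; model id, prompt, and filename are placeholders.
import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

quant = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # same NF4 format as this checkpoint
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant,
    torch_dtype=torch.bfloat16,
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()              # helps on 16 GB cards

# guidance_scale is Flux's distilled guidance (Forge's "Distilled CFG"),
# not the CFG: 1 setting above, which disables true classifier-free guidance.
image = pipe(
    "a detailed portrait, natural light",    # placeholder prompt
    num_inference_steps=25,                  # the 20-30 range recommended above
    guidance_scale=3.5,
).images[0]
image.save("flux_nf4_test.png")
```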
Generated outputs can be used for personal, scientific, and commercial purposes as described in the flux-1-dev-non-commercial-license.
Description
This model was converted to bnb NF4 by combining Flux dev fp16 with the VAE, the T5-XXL fp16 text encoder, and CLIP-L. All included!
Comments (36)
Hi, fantastic models! One question: based on your personal use, which model do you use, fp8 or fp16? Which is the best?
Hi. Thanks for the review! I mostly use FP8 as it is much lighter and faster, and it understands prompts perfectly. FP16 I use occasionally, mostly to compare results. FP8 is completely satisfactory for me)
@Abzaloff You say it understands prompts better; I'm curious, how much VRAM have you got? If you've got 16 GB like I do, then chances are FP16 barely scrapes by on VRAM and doesn't leave as much room for the computations that decipher the prompt. Even at 24 GB this may be the case, depending on the settings you use. When I was still on my 8 GB laptop, I ultimately found that I preferred SD1.5 over SDXL for generation, because it left so much more VRAM free that I could bust out 8k x 8k images without even upscaling.
@Abzaloff thanks a lot, mate :)
@Lazman I have 16 GB of video memory and 64 GB of RAM. The model works great in both Forge and Comfy.
@Abzaloff Ah, but that's why you're getting better results with FP8 over FP16: your entire VRAM isn't being taken up just loading the model. Someone with 24 GB of VRAM would probably get better results with FP16. Sadly, Nvidia/AMD conveniently removed SLI/Crossfire support just as AI went mainstream, so it's more difficult for us brokies to build high-performance AI rigs.
PS: 64 GB of system RAM is total overkill. I'm tempted to get 32 more myself when I get the chance, but frankly, it's more to complete the ARGB effect than because I need the RAM itself. I don't even think I've maxed out my 32 GB yet...
Very good model. I run it on an RTX 3060 GPU with 6 GB VRAM, with good results.
Does it work with ComfyUI and LoRAs?
Hi! I don't know; I only work in Forge. I would be grateful if you would test it and let me know.
ComfyUI yes, LoRA no.
@Mescalamba I've gotta know, how did you get it working in ComfyUI? I've been beating my head against a wall with this thing. I even loaded one of the images from the OP for the workflow, but every time I try to run it, I get...
Error(s) in loading state_dict for Flux:
size mismatch for img_in.weight: copying a param with shape torch.Size([98304, 1]) from checkpoint, the shape in current model is torch.Size([3072, 64]).
size mismatch for time_in.in_layer.weight: copying a param with shape torch.Size([393216, 1]) from checkpoint, the shape in current model is torch.Size([3072, 256]).
size mismatch for time_in.out_layer.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]).
size mismatch for vector_in.in_layer.weight: copying a param with shape torch.Size([1179648, 1]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
size mismatch for vector_in.out_layer.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]).
size mismatch for guidance_in.in_layer.weight: copying a param with shape torch.Size([393216, 1]) from checkpoint, the shape in current model is torch.Size([3072, 256]).
size mismatch for guidance_in.out_layer.weight: copying a param with shape torch.Size([4718592, 1]) from checkpoint, the shape in current model is torch.Size([3072, 3072]).
size mismatch for txt_in.weight: copying a param with shape torch.Size([6291456, 1]) from checkpoint, the shape in current model is torch.Size([3072, 4096]).
size mismatch for double_blocks.0.img_mod.lin.weight: copying a param with shape torch.Size([28311552, 1]) from checkpoint, the shape in current model is torch.Size([18432, 3072]).
size mismatch for double_blocks.0.img_attn.qkv.weight: copying a param with shape torch.Size([14155776, 1]) from checkpoint, the shape in current model
@Lazman, I launched it in Comfy. Here are the workflows:
https://drive.google.com/drive/folders/1WDrdE3rJfUzjH1mJ52VhWAUfV98oKCtj?usp=drive_link
@Abzaloff Lol... I actually got it working just before I came on here. I had an epiphany when I woke up: try the NF4 loader node (which I only found out about last night, after writing the post you're responding to) in the normal SDXL workflow, and voilà, it worked. You just have to set the CFG to 1; at CFG 8 the image is blurry. IDK why the info on how to use these things is so obscure. I only stumbled onto the NF4 loader node by chance while running searches in the ComfyUI manager for other things; I don't think I've seen a single person mention it otherwise.
Using the NF4 loader node in the Flux workflow, however, does result in an error. Not as long as the one above, but not a success either.
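For what it's worth, the numbers in the error above are consistent with bnb NF4 packing, which is why a plain Flux loader rejects the checkpoint while the NF4 loader node accepts it. A small sketch, assuming bitsandbytes is installed and a CUDA device is available:

```python
# Why a vanilla Flux loader rejects this checkpoint: bnb NF4 packs two 4-bit
# values per uint8 byte, flattening an fp16 [3072, 64] weight (196,608 values)
# into a packed [98304, 1] tensor, exactly the img_in.weight mismatch above.
import torch
import bitsandbytes.functional as bnbf

w = torch.randn(3072, 64, dtype=torch.float16, device="cuda")
packed, quant_state = bnbf.quantize_4bit(w, quant_type="nf4")
print(packed.shape)        # torch.Size([98304, 1])

w_restored = bnbf.dequantize_4bit(packed, quant_state, quant_type="nf4")
print(w_restored.shape)    # torch.Size([3072, 64])
```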
I'm confused... how do you use CFG 1 and negative prompting?
If you use distilled CFG, the negative prompt is disabled, and if you use normal CFG at 1, negative prompting is also disabled, so what's the catch?
This is how the Flux model works. You can increase the CFG, but then you will also have to increase the number of steps significantly. In most cases a negative prompt is not needed with this model.
@Abzaloff I know how it works; I'm simply asking how you use negative prompting with those settings on your images. Is it simply a leftover from trying old prompts with the new model, or?
Thanks!
@DD_Ai_art, I see your point) Negative prompts are enabled for me by default, or get filled in when I use styles. But they are not taken into account during generation; they are simply carried over into the output info.
I usually use CFG 1.1.
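The arithmetic behind this thread, for anyone wondering: with classic classifier-free guidance, the prediction is blended as negative + cfg * (positive - negative), so at CFG 1 the negative branch cancels exactly, and at 1.1 it only just starts to matter. A toy sketch with plain tensors, no real model involved:

```python
# Toy illustration of classic classifier-free guidance with plain tensors:
# at cfg_scale == 1.0 the negative (unconditional) prediction cancels out,
# so the negative prompt has no effect; at 1.1 it barely starts to matter.
import torch

def cfg_combine(pred_negative: torch.Tensor, pred_positive: torch.Tensor,
                cfg_scale: float) -> torch.Tensor:
    return pred_negative + cfg_scale * (pred_positive - pred_negative)

neg, pos = torch.randn(8), torch.randn(8)
assert torch.allclose(cfg_combine(neg, pos, 1.0), pos)  # negative term vanishes
print(cfg_combine(neg, pos, 1.1) - pos)                 # small nonzero nudge
```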
Hello, I use ComfyUI. The NF4 does not work with LoRAs; could you upload a normal fp8 version?
A very nice Flux model for making gorgeous artwork! Great job! Thanks for the model!
Thanks, bro!))
A good model that can create both realistic and fantasy pictures, thank you very much!
Thank you for your kind feedback!!!
Small but noticeable improvement on standard NF4
Hi, what model works with it as a refiner?
Precise Dev 8 and 16 are so good! Thanks!
Flux "Realistic" Sampler? Where would I find that..?
This sampler is built into Forge.
The model is great! The detailing is very accurate, it is a pleasure to work with such a model!
Any ideas on how to use this model in SwarmUI? It keeps failing there; no problems in Forge.
Wow, the performance, particularly of the "fp16", is amazing. This is why every fine-tune should have at least one NF4 port; it is simply much better than any GGUF or other low-bit quant.
And this is not about raw GPU power. I have a 4070 Ti Super; it has excellent rendering speed but is limited to 16 GB of VRAM. When I use full fp16 or even an fp8 GGUF, the speed gets absolutely destroyed by the memory management; it takes ages just to reload the model after small adjustments. But this one just works, without the absurd offloading issues, and the quality is totally fine, if not excellent once the memory savings are considered.
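Back-of-the-envelope numbers support this. Assuming the roughly 12-billion-parameter Flux transformer from the public release and ignoring activations, VAE, and text encoders, a quick sketch of the weight footprint per precision:

```python
# Back-of-the-envelope weight footprint for the ~12B-parameter Flux transformer
# (assumed parameter count; activations, VAE and text encoders are ignored).
# NF4 stores 0.5 bytes per value plus ~2 bytes of absmax per 64-value block.
params = 12e9
for name, bytes_per_param in [("fp16", 2.0), ("fp8", 1.0), ("nf4", 0.5 + 2 / 64)]:
    print(f"{name}: ~{params * bytes_per_param / 2**30:.1f} GiB")
# fp16: ~22.4 GiB  (cannot fit in 16 GB, so constant offloading)
# fp8:  ~11.2 GiB  (tight once everything else loads)
# nf4:   ~5.9 GiB  (comfortable headroom)
```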
I am not sure what fp8 or fp16 means in this particular case, as I can't see much difference in memory usage. I would guess this refers to the source, not the output?
Yes, those are the source models that were used for the conversion to NF4.