CivArchive
    Flux.1-Dev Hyper NF4 + Flux.1-Dev BNB NF4 + Flux.1-Schnell BNB NF4 - Flux.1-Dev Hyper NF4


    Comments (93)

    sevenof9247Sep 8, 2024· 4 reactions
    CivitAI

For me, 12 steps works better with the sampler [Forge] Flux Realistic.

    RalFinger
    Author
    Sep 8, 2024· 1 reaction

    Thanks for sharing!

    MirabilisSep 9, 2024· 1 reaction

Yeah, I think 8 steps is akin to 20 normal steps, but if you want to add more detail, 12 steps with 6 for hi-res fix is the way to go for sure.

    1692662Sep 8, 2024· 6 reactions
    CivitAI

What's the difference between Hyper and BNB?

    luchetesSep 8, 2024· 3 reactions

Fewer steps needed: faster, but lower quality.

    1692662Sep 9, 2024

    @luchetes oh. Got it now. Thank you.

    htfgfghSep 15, 2024· 1 reaction

    which one is which?

    zerocool22Sep 8, 2024· 1 reaction
    CivitAI

    I'm currently using: flux-hyp8-Q8_0.gguf.

Could these give better quality?

    _Jarvis_Sep 12, 2024· 2 reactions

No, these are for weaker graphics cards.

    SpawnBTCSep 12, 2024· 8 reactions
    CivitAI

    Guys, this model (`Flux.1-Dev Hyper NF4`) already has a VAE. So, you don't need to use ae.safetensors, for example. Good creations to everyone! s2

    blo01Sep 12, 2024

How do I use this in ComfyUI? I can't load the model.

    SpawnBTCSep 13, 2024· 3 reactions

    @blo01 

The Flux models need to be placed in the UNet folder (models/unet), unlike other checkpoints, which go in the checkpoints and/or Stable Diffusion folders.

    VectorVandalSep 14, 2024· 2 reactions

@SpawnBTC I moved it, but how do you use it in ComfyUI?

    DashaLuniowaSep 12, 2024· 1 reaction
    CivitAI

```
"prompt_id": "f69ce5c2-6885-4824-900d-2006ce9a4e7b",
"node_id": "CheckpointLoader_Base",
"node_type": "CheckpointLoaderSimple",
"executed": [],
"exception_message": "Error(s) in loading state_dict for Flux:
    size mismatch for img_in.weight: copying a param with shape torch.Size([98304, 1]) from checkpoint, the shape in current model is torch.Size([3072, 64]).
    size mismatch for time_in.in_layer.weight: copying a param with shape torch.Size([393216, 1])
```

    dhillon_karan03405Oct 29, 2024

Did you find a solution to this? I'm facing the same issue.

    DashaLuniowaNov 1, 2024

@dhillon_karan03405 Unfortunately not. The problem persists.

    RalFinger
    Author
    Nov 1, 2024

@dhillon_karan03405 @DashaLuniowa Hey guys, can you explain in more detail what UI you are using and what you are trying to create? Is it just the base model giving you this error (I've never seen it before)? More context would help us figure out the error :)

    DashaLuniowaNov 13, 2024· 2 reactions

@RalFinger Solved. The problem was that my Stability Matrix shell was using the standard loader; the NF4 loader was needed, which solved it.

For others who don't have a choice of loader: make sure the model is in the correct folder (I hope that helps some of you).

UPD: Unlike Flux Dev, the NF4 model only partially loads and then generation never starts. It seems 6 GB of video memory is not enough for it, which is strange.
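A plausible reading of the size mismatch in the traceback above (illustrative arithmetic only, not loader code): NF4 packs two 4-bit codes into each byte and stores them as a flattened [N, 1] tensor, so a standard checkpoint loader sees half as many elements as the full-precision layer expects, which is exactly why the dedicated NF4 loader is required.

```python
# img_in.weight in the full-precision Flux model has shape [3072, 64]:
n_weights = 3072 * 64            # 196608 individual weights

# NF4 stores two 4-bit values per byte, flattened into an [N, 1] tensor:
packed_bytes = n_weights // 2
print(packed_bytes)              # 98304 -> matches the checkpoint shape [98304, 1]

# The same relationship holds for time_in.in_layer.weight (a 3072x256 linear layer):
assert 3072 * 256 // 2 == 393216
```

The numbers line up with both mismatches reported in the traceback, which supports the "wrong loader for a pre-quantized checkpoint" diagnosis.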

    RalFinger
    Author
    Nov 19, 2024

    @DashaLuniowa thank you for the clarification and the update!

    AcadiaAugust178Sep 21, 2024· 1 reaction
    CivitAI

    I really wanted to like it but I'm getting an extreme level of sameface even when prompting heavily for differences.

    dc802Sep 21, 2024

    Welcome to Flux

    5c0f4n0Sep 23, 2024

    Use a lora.

    AcadiaAugust178Sep 23, 2024

    @5c0f4n0 What lora have you found that helps with the issue?

    79468Oct 6, 2024· 1 reaction
    CivitAI

Is there already a LoRA stack for Hyper-NF4 in ComfyUI?

    ApexArtist1Oct 21, 2024· 4 reactions
    CivitAI

    this goes in checkpoint folder or Unet folder ?

    khajiirahNov 12, 2024· 1 reaction

    Flux Models (comfyui) go into models/unet.

    5550139Oct 25, 2024· 4 reactions
    CivitAI

it would be pretty cool if you could add a line or two to the description of the models (nf4, l8a, gguf) saying where they should be saved and whether everything has already been baked in. I've just added NF4 support back to my workflow and it was a bit confusing tbh :P
anyway, thanks for your work <3

    RalFinger
    Author
    Oct 25, 2024· 1 reaction

    Good idea, I can rework that!

    5550139Oct 25, 2024· 1 reaction

    @RalFinger that would be awesome, thanks a lot! 👌

    knfelOct 28, 2024· 6 reactions
    CivitAI

Can someone help me? I want to use Flux. I have a 12 GB RTX 3060 and an AMD Ryzen 5 Pro 4650G processor. Is it possible to run Flux with that, and if so, how do I get started? Thanks for the help; I'm just starting out in this world.

    so_ha_Oct 29, 2024

I use a Ryzen 5 3600 and an RTX 3060; it works well. The Hyper version works with 4 GB of VRAM.

    knfelOct 29, 2024

@so_ha_ What else do I need to download, friend? Which models, and what configuration? Can you send me that info? :D

    tenejNov 18, 2024· 5 reactions

@knfel You need to download clip_l.safetensors and t5xxl_fp16.safetensors and put them in the models/clip folder, and ae.safetensors, which goes in the models/vae folder. You can't run Flux in Automatic1111, so use ComfyUI or WebUI Forge.

In ComfyUI, your workflow should use the DualCLIPLoader node for clip_l.safetensors and t5xxl_fp16.safetensors, and a VAE loader node for ae.safetensors.

In WebUI Forge, at the top you should see the VAE / Text Encoder option; select all three files there. On the right side there is a "Diffusion in low bits" option, where you should choose "Automatic (fp16 LoRA)", otherwise LoRAs won't work.

Use Euler as the sampler and Simple as the scheduler (you can use different samplers and schedulers too, but be aware that most of them will produce only noise). Sampling steps: 20 (or 8 with this model). Distilled CFG scale should be 3.5, CFG scale 1. Also, reduce the GPU Weights option if you run into memory problems.
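Assuming a stock ComfyUI install (these are the default folder names; adjust if your install differs), the layout described above looks roughly like this:

```
ComfyUI/
└── models/
    ├── unet/                       <- Flux UNet/checkpoint files (see comments above)
    ├── clip/
    │   ├── clip_l.safetensors
    │   └── t5xxl_fp16.safetensors
    └── vae/
        └── ae.safetensors
```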

    knfelNov 19, 2024

@tenej Thank you very much, friend, for the detailed explanation!

    kaaswafelJan 11, 2025

    @tenej amazing explanation, thank you so much!

    Unsterblich82Oct 30, 2024· 1 reaction
    CivitAI

Hi, I found this: https://civitai.com/models/768836/pilgrimflux . It has the same hash as your model X-flux1-dev-bnb-nf4-v2.

    cluster1500Nov 4, 2024· 2 reactions
    CivitAI

I assume this does not support Flux-dev LoRAs?
I get a ".to() does not accept copy argument" error when trying to load a Flux LoRA with the KSampler.
SD LoRAs go into the sampler without an error but, not very surprisingly, seem to have no effect on the result.
Am I doing something wrong? None of the example images seem to feature a detailer or character LoRA for me to try.

    I use ComfyUI with silveroxides UNET loader.

    Thanks in advance! Great model btw!

    mrgebienJan 19, 2025

    Trying to figure out the same thing right now lols :l

    burnera679889Apr 11, 2025

Has anyone figured this error out? I'd love to use it in SwarmUI without having to wrangle Comfy node workflows, but not being able to use LoRAs is a dealbreaker.

    Ishan_ArtsNov 6, 2024· 3 reactions
    CivitAI

What's the difference between Hyper and the previous BNB NF4 version? (I'm new to all this.)

    HuayinyueJun 12, 2025

    +1

    PBtheCreatorNov 8, 2024· 6 reactions
    CivitAI

    I love this checkpoint. It allows me to run flux on my 1080ti without much problem. At 10 steps it generates amazing images and the speed is manageable.

    apprewiredDec 26, 2024

Hi, I love the images you create here and wondered if you wouldn't mind answering some questions to help a noob. I've been reading lots of articles and watching videos to figure this out. I use ComfyUI and have things almost working, but some things are different. For example, this page: https://github.com/comfyanonymous/ComfyUI_bitsandbytes_NF4 says the loader should be named 'CheckpointLoaderNF4'; however, I only see 'Load NF4 Flux Checkpoint' in ComfyUI. I know, probably a noob question, but there are other small differences as well. For example, I'm trying to base my workflow on the Flux examples here: https://github.com/comfyanonymous/ComfyUI_examples/tree/master/flux, but I don't see an option for 'Guidance', which is something you specify in the images you've uploaded. It also isn't totally clear whether the whole thing is now completely deprecated, given the GitHub link above says this should move to GGUF. If you have any suggestions, thank you in advance!

    PBtheCreatorDec 26, 2024· 1 reaction

@apprewired No idea. I use Forge (based on Automatic1111).

    LokiPoki1Nov 11, 2024· 21 reactions
    CivitAI

So what is the difference between these models and the other Flux checkpoints? I see so many and don't know which to use. Do I have to use this one since I have 12 GB of VRAM, or can I get away with using the normal basic fp8 model? Or will the outputs be the same anyway? I'm overwhelmed by so many Flux checkpoints.

    kiryanton930Nov 14, 2024· 3 reactions

    Me too, I don't understand anything

    HyokkudaNov 17, 2024· 44 reactions

    These Flux.1 model variants utilize different quantization techniques, such as FP16 and NF4, to optimize performance, catering to various deployment scenarios and hardware limitations.

    FP16 refers to 16-bit floating-point precision, which reduces the memory footprint and computational requirements compared to the standard 32-bit floating-point (FP32) precision. This reduction is beneficial for deploying models on hardware with limited resources. In most scenarios, FP16 precision leads to faster rendering times compared to FP32 (32-bit precision) because it reduces memory usage and computational load. This is particularly true on GPUs optimized for FP16 arithmetic, such as modern NVIDIA GPUs with Tensor Cores. The trade-off is that reducing precision can lead to minor quality loss in the model's calculations. However, this precision loss is negligible for many use cases in image generation, and the speed gain is worth it.

    NF4 stands for 4-bit NormalFloat quantization, a technique that further compresses model parameters to 4 bits. This compression significantly decreases model size and enhances inference speed, making it advantageous for deployment on devices with constrained memory and processing capabilities. NF4 quantization is more aggressive than FP16, reducing weights to 4-bit representations. This further reduces memory requirements and increases inference speed by minimizing the data that the model needs to handle. The downside is that compressing to 4 bits can potentially lead to loss of detail or lower accuracy in some model outputs. You might notice that results aren't as sharp or precise for highly detailed or complex image generation compared to higher-precision weights. It is generally used when speed and efficiency are more critical than having the highest possible quality. This can be particularly useful for rapid prototyping or running models on hardware with minimal resources (e.g., older GPUs).
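To make the FP16 vs. NF4 trade-off concrete, here is a back-of-the-envelope estimate of raw weight storage for a roughly 12-billion-parameter model like Flux.1 (the parameter count is approximate, and this ignores NF4's per-block scale constants, activations, and the text encoders):

```python
PARAMS = 12_000_000_000  # approximate parameter count of Flux.1-dev

def weights_gib(bytes_per_param: float) -> float:
    """Raw weight storage in GiB at a given precision."""
    return PARAMS * bytes_per_param / 2**30

fp32 = weights_gib(4.0)   # 32-bit floats
fp16 = weights_gib(2.0)   # 16-bit floats
nf4  = weights_gib(0.5)   # 4 bits per weight, two packed per byte

print(f"FP32: {fp32:.1f} GiB, FP16: {fp16:.1f} GiB, NF4: {nf4:.1f} GiB")
# FP32: 44.7 GiB, FP16: 22.4 GiB, NF4: 5.6 GiB
```

This is why the FP16 transformer alone won't fit on a 12 GB card, while the NF4 variant leaves room to spare.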

    Checkpoint-trained models are pre-trained models saved at specific points during the training process. They serve as starting points for further fine-tuning or can be used directly for inference tasks. These models are not base models; instead, they are derivatives fine-tuned for particular tasks or optimized for specific hardware configurations.

    I hope this helps!

    kiryanton930Nov 17, 2024

    @Hyokkuda What is the difference between Dev and Schnell?

    What is Hyper?

    What is BNB and GGUF?

Your model is the only one that fits into my RTX 3060 with 12 GB of VRAM (with plenty of space left for LoRAs). Generation at 8 steps takes 40-60 seconds.

The original models don't fit into memory, which is why generation takes up to 7 minutes.

With all the other models, the step count can't go below 20 or the quality drops sharply (even though you wrote 8). Only with the FLUX.1-Turbo-Alpha LoRA was I able to reduce it to 8 steps.

Your models are not cut down, and everything works. I don't understand why other people cut them: they shrink to an attractive 10 GB or so, but then the models don't work without additional CLIP, T5, and VAE files (by the way, what are those for?). After adding back all those cut-out parts, the model weighs a lot again.

I generated my latest posted images with your Flux.1-Dev Hyper NF4 checkpoint, but the site loses it when adding an image and there is no way to change it.

    HyokkudaNov 17, 2024· 50 reactions

    @kiryanton930, I want to apologize in advance for that wall of text.

    Also, I am unfortunately not the creator of this base model.

    As for the Dev versions of models, they are typically aimed at ongoing development, which means they may include new experimental features, hyperparameter settings, optimizations, or weights that haven't been fully tested or tuned. They might also allow the generation to be more flexible but could be less stable.

Schnell is German for "fast" (no idea why they chose German...). The Schnell versions of models are optimized for speed and efficiency, with reduced computational requirements or cut-down inference that aims for quicker generation. These models achieve faster generation times, but this sometimes comes at the cost of slightly lower quality or a reduction in certain details.

Hyper refers to a distilled, accelerated configuration: an acceleration LoRA in the style of ByteDance's Hyper-SD / Hyper-FLUX is merged into the checkpoint, which lets it converge in far fewer steps (around 8 instead of 20+) while keeping the core model architecture intact.

    BNB is an abbreviation of BitsAndBytes. It’s a library used for optimizing GPU memory usage when loading models. It can help to quantize the models to smaller bit representations, such as 8-bit or even 4-bit, reducing VRAM requirements while retaining as much quality as possible.

GGUF is a quantized file format that originated in the llama.cpp ecosystem and has since been adopted for image models (for example via the ComfyUI-GGUF extension). Like NF4, it stores weights at reduced bit widths to shrink the memory footprint, and it comes in several quantization levels (Q8_0, Q4_K, and so on).

    When a model is "cut," it’s essentially stripped down to reduce the size, often by removing auxiliary components or dependencies like CLIP, T5, or VAE.

    CLIP is used to better understand or guide what should be generated based on input text. If missing, the model might struggle to correctly interpret prompts. I could better explain what CLIP is for you and how it works if you want, but that's gonna be a long topic.

    T5 is a text-to-text transfer transformer used in some models for handling more advanced text understanding or prompt generation. Without it, models can lose context understanding.

    VAE is used to encode and decode the latent space representation into images, which helps in final output quality. If missing, the quality can suffer significantly.

    When models are trimmed to reduce size, the aforementioned components are often removed, resulting in a smaller file, but users then need to provide these components separately to use the model properly, which can be frustrating because it negates the space-saving benefits.

    Your GPU benefits from models that have been carefully tuned to fit within the VRAM without compromising core functionality. FLUX.1-Turbo-Alpha model's efficiency allows it to generate with only 8 steps, resulting in relatively fast performance. This kind of setup benefits from optimization techniques and a more straightforward architecture, enabling lower step counts. Other models, which require 20+ steps to produce good quality results, are likely not optimized the same way. However, reducing step counts can drastically affect quality if the model isn't designed for fast convergence.

    Also, I am not sure I understand your last sentence about the website losing it when adding an image? If you are having trouble generating images via Civitai website, you can always try Stable Diffusion Forge. Stable Diffusion Forge is specifically designed to enhance resource management and speed up inference, making it well-suited for GPUs like the RTX 3060. Therefore, integrating that into your workflow can provide a more efficient and streamlined experience, especially when working with hardware that has limited VRAM. If you're using an image as a reference and encounter issues, try opening it in an image editing program and save it in a different format. This can help resolve potential color profile or metadata issues that might interfere with the rendering process.

    Again, I hope this answers most of your question. And I am sorry again for this large wall of text. u_u;

    kiryanton930Nov 17, 2024

    @Hyokkuda "Also, I am not sure I understand your last sentence about the website losing it when adding an image?"

I generate an image on my PC and add it to the site. Usually, all the LoRAs and checkpoints used are displayed on the right in the description, but with Flux the description of some LoRAs and the checkpoint disappears. I used to think it was because I downloaded the checkpoint from another site, but now I've downloaded the checkpoint from this site and nothing has changed: the checkpoint is still missing from the description. It's some kind of bug.

    Forge is what I use

    HyokkudaNov 17, 2024

    Ah, I understand now, @kiryanton930. I believe this issue may be related to Stable Diffusion Forge. When revisiting some older prompts, I often find that certain key settings are missing, which I need to determine manually—something that can be quite frustrating at times. :(

    I recommend trying a different software, using the exact same prompts and settings, and then comparing the outputs using PNG Info or the FileOptimizer software to read (or edit) the metadata.

    kiryanton930Nov 17, 2024

    @Hyokkuda 

Well, all the information is present in the metadata.

Maybe the problem is in the formatting, but I don't create the metadata; it is generated automatically.

Maybe we can somehow summon the site's developers so they can determine what the problem is.

    An example of metadata in which information about the model disappeared after uploading to the site:

```
Steps: 8, Sampler: [Forge] Flux Realistic, Schedule type: Beta, CFG scale: 1,
Distilled CFG Scale: 3.5, Seed: 502053333, Size: 896x1152,
Model hash: 6e3e5990e9, Model: flux1DevHyperNF4Flux1DevBNB_flux1DevHyperNF4,
Lora hashes: "flux.1_lora_flyway_Epic-detail_v2: 2DBB61AC85E1, FLUX.1-Turbo-Alpha: e5e0c5d5201b, Comic book V2: 9a710c809fdb",
Discard penultimate sigma: True, Beta schedule alpha: 0.6, Beta schedule beta: 0.6,
NGMS: 4, Version: f2.0.1v1.10.1-previous-545-gf5190349,
Diffusion in Low Bits: bnb-nf4 (fp16 LoRA)
```

    HyokkudaNov 20, 2024

    Sorry for the delay, @kiryanton930. Is it happening for every generated image or just one image? Or is it for every image from this model in particular? You could record a short video to upload and share your experience. Or share your picture with someone else who can try to upload it and see if the issue occurs for them, too.

    kiryanton930Nov 20, 2024

@Hyokkuda For each. I created a ticket with the developers; their answer was "yes, we know it doesn't work, we can't do anything."

    HyokkudaNov 20, 2024

    Oh... well, that is pretty underwhelming, @kiryanton930. :/ Sorry you have to go through that.

    funtimequest363Nov 20, 2024· 2 reactions

@Hyokkuda Can LoRAs that were trained on Flux.1 D use Flux.1 D Hyper as the base model?

    HyokkudaNov 20, 2024· 4 reactions

    To my knowledge, @funtimequest363, yes, LoRAs trained on Flux.1D can be used with Flux.1D Hyper as the base model. Since the two models are related, using Flux.1D Hyper might result in enhanced detail or better fine-tuning control compared to the original Flux.1D, depending on the modifications in the Hyper version.

    schlomoDec 26, 2024· 4 reactions
    CivitAI

Seems like it actually lets me run Flux. Nice, thanks!

    slideslide293Jan 18, 2025· 2 reactions
    CivitAI

    Very natural imitation of photos from SLR cameras. Simply amazing!

    MetaGenJan 26, 2025· 2 reactions
    CivitAI

I love this model as a base: it generates from the Flux Dev base with less censorship, fast performance, and fp16 LoRA support.

Now I want to create a character LoRA from this checkpoint. Could you share your best FluxGym settings?

    shineseaFeb 7, 2025· 3 reactions
    CivitAI

    How can I run Flux.1-Dev Hyper NF4 in Comfyui?

    KrooksKrooksFeb 8, 2025· 5 reactions

You need the custom loader instead of the core one; see Cluster's comment.
If you have Manager installed, you can just search for "NF4" and see if any of the results are the new loaders.

    dschonichMar 3, 2025· 1 reaction

    Try CheckpointLoaderNF4 (bitsandbytes_NF4)

    stavros247Apr 10, 2025· 2 reactions

    thanks guys, NF4 loader helped!

    RalFinger
    Author
    Apr 10, 2025· 1 reaction

@stavros247 It is also in the description of the model; you just need to read it.

    rafaelldestiloFeb 18, 2025· 14 reactions
    CivitAI

Works perfectly for me on an RTX 3060 Ti with 8 GB of VRAM. Maybe I'll abandon all the SD1.5 and SDXL models I have and just stick with Flux; I'm happy with the results. I have more than 190 GB of models on my PC. I'm using Forge.

    gorem23834446May 12, 2025

Which version is it, please?

    ZodiakFeb 20, 2025· 2 reactions
    CivitAI

Any tutorial on how to install and use it, please?

    seriousaboutdesign446Mar 4, 2025· 4 reactions
    CivitAI

With Flux.1-Dev Hyper NF4 I'm looking at 13.3 s/it at 936x1856, and that's with an upscaler. Does this make sense? It is quite slow... RTX 3060 12 GB video card, thanks.

    4305599Apr 26, 2025

    Yes, it makes sense. Use one of the default resolutions (832x1248, 896x1152, 1024x1024). It doesn't really make sense to use a higher resolution than that. Upscale the image if you want to have it bigger, but don't generate it at such a high resolution.
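A quick sanity check on why that resolution is slow (illustrative arithmetic only): per-step cost grows at least linearly with pixel count, so 936x1856 is noticeably heavier than the suggested presets even before the upscaler runs.

```python
def megapixels(w: int, h: int) -> float:
    """Pixel count in megapixels."""
    return w * h / 1e6

hi_res = megapixels(936, 1856)   # ~1.74 MP, the resolution from the question
preset = megapixels(896, 1152)   # ~1.03 MP, one of the suggested presets

print(f"{hi_res / preset:.2f}x more pixels per step")  # 1.68x more pixels per step
```

In practice the slowdown is usually worse than the raw pixel ratio, since attention cost in the transformer grows faster than linearly with the number of latent tokens.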

    ranger89May 4, 2025

Maybe the generation settings are not correct? I get 3.8 s/it on my 2060 Super; a picture averages 1 minute 40 seconds.

    todiy29Apr 18, 2025· 3 reactions
    CivitAI

My favorite model! Very good results on a 2060 Super; average image generation time is 1 min 35 sec.

    jazara930Apr 21, 2025· 1 reaction
    CivitAI

    great work! nice checkpoint. I love it....

    anonymzApr 30, 2025
    CivitAI

    zluda not supported

    nikoszMay 2, 2025· 2 reactions
    CivitAI

Can someone please share their ComfyUI workflow? I cannot make this work no matter what I've tried. Thank you in advance!

    RalFinger
    Author
    May 4, 2025

Here is the link to the Comfy nodes. It's all in the description of the model.

    maloofmxMay 8, 2025· 9 reactions
    CivitAI

Hello, has anyone using the Flux NF4 all-in-one models in ComfyUI been getting errors in their workflows since the last update? Error: mat1 and mat2 shapes cannot be multiplied (1x1 and 768x3072). Is there a fix, or should we just wait for an update?

    6400850May 12, 2025

Yes, broken.

    RalFinger
    Author
    May 13, 2025

    Hey guys, this sounds like a Comfy or Node problem, the model is fine

    maloofmxMay 21, 2025· 2 reactions

@RalFinger Hello, yes, I meant ComfyUI, not the model. I had to move some things around in Forge to be able to use it with an RTX 5080.

    theno1Jun 14, 2025· 2 reactions

Comfy sucks. All they do is advertise support for the latest and greatest models, but at its core they don't care about foundational improvements. I was having the same issue with the bitsandbytes loader, which was created by the devs. Not a single person on their Discord responded to my issue. If you want to use Flux in Comfy, stick to GGUF.

    d4nt3gm136Jun 24, 2025· 2 reactions
    CivitAI

I'm not able to run ComfyUI through Colab Pro with an L4 or A100 GPU. Can someone please tell me which workflow I should use?

    goldiegrace99980Sep 26, 2025

Try installing it locally (the desktop or portable version) and use lower-end models if your graphics card is weak: SD1.5 models with a baked-in VAE, or SDXL models if your card can handle them. SDXL is somewhat comparable to Flux Schnell; try the Juggernaut or DreamShaper XL models.

    Only_Fuuka_OFJul 7, 2025· 1 reaction
    CivitAI

LoRAs don't work for some reason.

    RalFinger
    Author
    Jul 8, 2025

    working fine for me

    AzathothSep 3, 2025· 1 reaction
    CivitAI

Hello, I'm getting the error "mat1 and mat2 shapes cannot be multiplied (1x1 and 768x3072)" with every workflow I try.

Do you know if there's a fix?

Or is Comfy NF4 broken?

I've tried the AIO models and the UNet version; nothing works.

    RalFinger
    Author
    Sep 5, 2025

Check that you used the correct model selection, "FLUX", in the CLIP loader.
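For context (an illustrative sketch, not ComfyUI code): the error means two matrices with incompatible inner dimensions are being multiplied, which is what happens when the CLIP loader is set to the wrong model type and emits conditioning of the wrong width (768 is CLIP-L's embedding size). The shape rule itself:

```python
def can_matmul(a: tuple, b: tuple) -> bool:
    """Matrix multiplication a @ b requires a's column count to equal b's row count."""
    return a[1] == b[0]

# The failing case from the error message: a 1x1 tensor hitting a 768x3072 weight.
print(can_matmul((1, 1), (768, 3072)))    # False
# What a correctly configured CLIP-L text encoder would feed in:
print(can_matmul((1, 768), (768, 3072)))  # True
```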

    NIMble_FroggSep 11, 2025· 2 reactions
    CivitAI

    Can you tell me which is better to use in Forge?

    Sampling method -- ?

    Schedule type --?

    thank you =)

    Axiom432Sep 11, 2025· 22 reactions
    CivitAI

My favorite grey square generator. Only 10+ GB to draw every variation of your favorite grey square.

    RalFinger
    Author
    Sep 12, 2025· 2 reactions

    Skill issue

    DetroitArtDudeOct 16, 2025

    This sort of thing happens when you have a misconfiguration somewhere. A good place to start is to load an image from here that looks good and modify it from there.