CivArchive
    SDXL 4GB/2GB (Improved FP8 & GGUF) - FP8 Full Checkpoint
    NSFW

    SDXL 4 Step (FP32 with Improved UNET)

    • Note: I used the 32-bit CLIP-G

    • The refiner used in the workflow is also a 32-bit GGUF

    • Updated CLIP-L

    • For 4-step use, set CFG to 1.0 - load an image for the workflow

    • For NSFW images the refiner should not be used

    • BRSGAN 2x can be found on Google Drive
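    A quick sketch of why CFG 1.0 matters for the 4-step model: in classifier-free guidance the sampler blends an unconditional and a conditional noise prediction, and at a scale of exactly 1.0 that blend collapses to the conditional prediction alone - which is what LCM-style few-step distillation expects. The numbers below are illustrative toy values, not real model outputs.

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: push the unconditional prediction
    toward the conditional one by `scale`."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

# Toy noise predictions for three latent values (illustrative only).
cond = [0.5, -1.0, 2.0]
uncond = [0.25, 0.0, 0.5]

# At CFG 1.0 the blend collapses to the conditional prediction,
# so the negative prompt has no effect on the result.
print(cfg_combine(uncond, cond, 1.0))   # [0.5, -1.0, 2.0]

# At a typical CFG 7.0 the output is pushed well past the conditional
# prediction - which is why few-step models burn out at high CFG.
print(cfg_combine(uncond, cond, 7.0))   # [2.0, -7.0, 11.0]
```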

    2GB GGUF from FP32

    Note: the 2GB model requires separate CLIP files and GGUF support; the 4GB FP8 model is ready to use in any SD GUI

    • Refined with baked-in FP32 LoRAs

    • Quantized from the FP32 SDXL model for less loss

    • For GGUF, download the separate SDXL CLIP-G, CLIP-L, and VAE

    4GB SDXL (Full Checkpoint)

    • The custom CLIP is not quantized

    • The custom UNET is quantized to FP8, allowing a balance of size and quality

    • Works in Forge, ComfyUI, and Automatic1111

    • Works with LoRAs

    • Beta/DEIS is a good choice for img2img upscaling
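    As a rough illustration of the FP8 size/quality trade-off: each weight drops from four bytes to one, while round-to-nearest keeps the per-weight relative error small. Below is a minimal pure-Python sketch of E4M3 rounding (the actual checkpoint uses proper float8 tensor types; this simplified version ignores NaN encoding).

```python
import math

def quantize_fp8_e4m3(x: float) -> float:
    """Round x to the nearest FP8 E4M3 value (1 sign, 4 exponent, 3 mantissa bits).
    Simplified: round-to-nearest, max finite value 448, subnormals share exponent -6."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    e = max(math.floor(math.log2(mag)), -6)
    ulp = 2.0 ** (e - 3)          # spacing of representable values near mag
    return sign * min(round(mag / ulp) * ulp, 448.0)

w = 0.2731                        # a typical weight magnitude
q = quantize_fp8_e4m3(w)          # stored in one byte instead of four
# For normal values the relative error stays within half a ulp, i.e. 1/16.
assert abs(q - w) / w <= 2 ** -4
```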

    Both models have improved (uncensored) female anatomy; however, the GGUF version does not do well with males.

    Description

    FAQ

    Comments (13)

    3567304 · Oct 19, 2024 · 1 reaction
    CivitAI

    Interesting work - first Pony, now SDXL at 4GB. I can't wait to try your 4-step versions of these models to draw various results. Good work man, appreciate the hard work and effort

    Felldude
    Author
    Oct 19, 2024

    The step models rely on timestep/schedulers; given that the LoRA is less than 400MB and is intended to work with LCM, I am not sure it would merge in well.

    moocoop · Oct 20, 2024

    Can someone help me understand the goal/value/benefit here? I truly don't understand the significance of the 4GB or the added written detail. Thanks!

    Felldude
    Author
    Oct 20, 2024

    @moocoop The benefit would apply to 6GB 2060 and 3050 users, and to users who keep multiple models loaded into VRAM, such as PONY + XL or PONY + FLUX, and have to watch VRAM usage.

    punkbuzter340 · Oct 20, 2024 · 2 reactions
    CivitAI

    What is this sorcery ?

    Felldude
    Author
    Oct 20, 2024· 2 reactions

    The type that uses 2 bits of mantissa precision, compared to 23 bits in the original FP32 model. The sorcery is that they can predict what the number would have been with any measure of accuracy, thanks to graphing

    punkbuzter340 · Oct 21, 2024 · 1 reaction

    @Felldude What else did you do... Did you use the original SDXL model and tweak it, or did you use a finetuned or merged model, or did you train your own images on top of something?... Because the results are impressive.

    Felldude
    Author
    Oct 21, 2024· 3 reactions

    @punkbuzter340 The CLIP and UNET have both been modified; it has multiple FP8 trainings baked in

    Viennar · Oct 23, 2024 · 1 reaction
    CivitAI

    I don't understand why one would use the FP8 model: with the --medvram argument it is not loaded into memory, and without that argument it runs about 6 times slower for me than the classic FP16 with --medvram. Was the original idea to work with insufficient video memory?

    Felldude
    Author
    Oct 23, 2024· 1 reaction

    It was to enable those with 6GB RTX cards to fit the model into VRAM, and possibly IPEX on integrated Intel GPUs. Or for those feeding into FLUX: you might be able to fit the 4GB model alongside an NF4 version of FLUX on a 16GB card
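    A back-of-envelope check of that claim, assuming roughly 2.6B parameters for the SDXL UNET and 12B for FLUX (both approximations; text encoders, VAE, and activations add more on top):

```python
def weight_gib(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB (weights only; ignores activations)."""
    return params * bits_per_weight / 8 / 2**30

SDXL_UNET_PARAMS = 2.6e9   # approximate
FLUX_PARAMS = 12e9         # approximate

sdxl_fp8 = weight_gib(SDXL_UNET_PARAMS, 8)   # ~2.4 GiB -> fits a 6GB card with headroom
flux_nf4 = weight_gib(FLUX_PARAMS, 4)        # ~5.6 GiB at 4 bits per weight
print(sdxl_fp8 + flux_nf4)                   # ~8 GiB combined -> plausible on a 16GB card
```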

    Felldude
    Author
    Oct 23, 2024· 1 reaction

    Only the 4090 has accelerated FP8 attention, and PyTorch doesn't support it yet, so we still upcast to FP16 or BF16

    Viennar · Oct 23, 2024 · 1 reaction

    @Felldude I understand, thanks for the explanation

    amazingbeauty · Nov 12, 2024

    As far as I know, FP8 isn't really that good on some hardware - it might be an issue with older hardware

    Checkpoint
    SDXL 1.0

    Details

    Downloads
    1,317
    Platform
    CivitAI
    Platform Status
    Available
    Created
    10/19/2024
    Updated
    5/13/2026
    Deleted
    -

    Available On (1 platform)

    Same model published on other platforms. May have additional downloads or version variants.