CivArchive
    SDXL_fixedvae_fp16(Remove Watermark) - refiner_fixed_vae_V2_fp16
    NSFW

    This is a merge of the following models:

    1. 100% stable-diffusion-xl-base-1.0 and 100% stable-diffusion-xl-refiner-1.0

    https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0

    https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0

    2. sdxl-vae-fp16-fix

    https://huggingface.co/madebyollin/sdxl-vae-fp16-fix

    You can use this model directly or fine-tune it.

    The model is under the same license as stable-diffusion-xl-base-1.0.

    The VAE is under the same license as sdxl-vae-fp16-fix.
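
    For readers who want a comparable setup from the upstream repositories, here is a minimal sketch using the diffusers library. It does not load the merged checkpoint from this page. It pairs the refiner linked above with the fixed VAE, and the helper name load_refiner_with_fixed_vae is hypothetical.

```python
# Repository ids taken from the links listed on this page.
VAE_REPO = "madebyollin/sdxl-vae-fp16-fix"
REFINER_REPO = "stabilityai/stable-diffusion-xl-refiner-1.0"


def load_refiner_with_fixed_vae(device="cuda"):
    """Hypothetical helper: build an fp16 SDXL refiner pipeline that uses the fixed VAE."""
    # Imported lazily so that defining this helper does not require
    # torch or diffusers to be installed.
    import torch
    from diffusers import AutoencoderKL, StableDiffusionXLImg2ImgPipeline

    # Load the fp16-safe VAE on its own, in half precision.
    vae = AutoencoderKL.from_pretrained(VAE_REPO, torch_dtype=torch.float16)

    # Load the refiner and swap in the fixed VAE instead of the bundled one.
    pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
        REFINER_REPO,
        vae=vae,
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    )
    return pipe.to(device)
```

    Calling load_refiner_with_fixed_vae() downloads both repositories and assumes a CUDA device is available. Pass device="cpu" to try it without a GPU, although fp16 inference on CPU may be slow or unsupported.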

    SDXL-VAE-FP16-Fix

    SDXL-VAE-FP16-Fix is the SDXL VAE, but modified to run in fp16 precision without generating NaNs.

    VAE                  Decoding in float32 / bfloat16 precision   Decoding in float16 precision
    SDXL-VAE             ✅                                          ⚠️
    SDXL-VAE-FP16-Fix    ✅                                          ✅

    Details

    SDXL-VAE generates NaNs in fp16 because its internal activation values are too large for the format to represent.
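
    The overflow mechanism can be illustrated with NumPy. This is a toy sketch with made-up numbers, not actual SDXL-VAE activations: values beyond the float16 range become infinities, and arithmetic on infinities produces NaN.

```python
import numpy as np

# Largest finite value float16 can hold.
FP16_MAX = float(np.finfo(np.float16).max)


def cast_to_fp16(values):
    """Cast to float16, silencing the overflow warning NumPy may emit."""
    with np.errstate(over="ignore"):
        return np.asarray(values, dtype=np.float32).astype(np.float16)


# Made-up "activations": the last two exceed the float16 range.
activations = cast_to_fp16([1000.0, 70000.0, -70000.0])

# Out-of-range values became +inf / -inf; combining them yields NaN.
with np.errstate(invalid="ignore"):
    combined = activations[1] + activations[2]

print(FP16_MAX, activations, combined)
```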

    SDXL-VAE-FP16-Fix was created by finetuning the SDXL-VAE to:

    1. keep the final output the same, but

    2. make the internal activation values smaller, by

    3. scaling down weights and biases within the network
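
    The three goals above are compatible, as a toy example can show. The sketch below is not how SDXL-VAE-FP16-Fix was produced (that was done by finetuning) and the two-layer ReLU network is purely an assumption for illustration. Dividing the first layer's weights and biases by a factor and multiplying the next layer's weights by the same factor shrinks the hidden activations while leaving the output unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer ReLU network with deliberately large first-layer weights.
W1 = rng.normal(size=(8, 4)) * 100.0
b1 = rng.normal(size=8) * 100.0
W2 = rng.normal(size=(3, 8))
b2 = rng.normal(size=3)
x = rng.normal(size=4)


def forward(W1, b1, W2, b2, x):
    """Return (hidden activations, output) of the toy network."""
    hidden = np.maximum(W1 @ x + b1, 0.0)  # ReLU
    return hidden, W2 @ hidden + b2


scale = 64.0  # power of two, so the rescaling is exact in floating point

hidden_big, out_big = forward(W1, b1, W2, b2, x)
# Shrink layer 1 (weights and biases), compensate in layer 2.
hidden_small, out_small = forward(W1 / scale, b1 / scale, W2 * scale, b2, x)

print(np.abs(hidden_big).max(), np.abs(hidden_small).max())
print(np.allclose(out_big, out_small))
```

    This works because ReLU commutes with positive scaling. A real VAE also contains normalization and attention layers, so a closed-form rescale like this is not generally available, which is presumably why finetuning was used.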

    There are slight discrepancies between the output of SDXL-VAE-FP16-Fix and SDXL-VAE, but the decoded images should be close enough for most purposes.

    Benchmark by Kubuxu, from this discussion:

    https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/discussions/7

    Evaluation on COCO val-2017, 256x256, RandomCrop with padding.
    Metrics:
    1. LPIPS (lower is better): https://github.com/richzhang/PerceptualSimilarity/
    2. Structural similarity index measure (SSIM) via skimage.metrics (higher is better)
    Metrics are given as: mean [79% credibility interval]

    Description

    sd_xl_refine_fixed_vae_fp16