CivArchive
    SD1.5 Direct Preference Optimization - DPO - v1.0
    NSFW

    Not my model; it comes from the Hugging Face repo linked below. This is an excellent model to merge with, particularly in the middle blocks. Try it yourself: take your favorite model, block merge this in at about 10% for the input blocks and 20% for the middle block, and adjust from there.
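
    A minimal sketch of what such a block-weighted merge does, assuming both checkpoints use the original SD1.5 key layout (`model.diffusion_model.input_blocks.`, `middle_block.`, `output_blocks.`). The function names and the 0% output fraction are illustrative choices, not taken from any particular merge tool. Plain floats stand in for tensors here; the same arithmetic works on tensors loaded from the two checkpoints.

```python
# Key prefixes used by SD1.5 checkpoints in the original (LDM) layout.
# Assumption: both checkpoints follow this layout.
UNET = "model.diffusion_model."
BLOCKS = ("input_blocks.", "middle_block.", "output_blocks.")


def dpo_fraction(key, fractions, default=0.0):
    """Return how much of the DPO weight to blend in for this key."""
    for block in BLOCKS:
        if key.startswith(UNET + block):
            return fractions.get(block, default)
    return default  # anything outside the three block groups stays untouched


def block_merge(base, dpo, fractions):
    """Blend `dpo` into `base` per U-Net block: (1 - a) * base + a * dpo."""
    merged = {}
    for key, value in base.items():
        a = dpo_fraction(key, fractions)
        if a == 0.0 or key not in dpo:
            merged[key] = value  # keep the base weight as-is
        else:
            merged[key] = (1.0 - a) * value + a * dpo[key]
    return merged


# Toy "checkpoints": one value per block group.
base = {
    UNET + "input_blocks.1.0.weight": 1.0,
    UNET + "middle_block.0.weight": 1.0,
    UNET + "output_blocks.0.0.weight": 1.0,
}
dpo = {key: 2.0 for key in base}

# About 10% DPO in the input blocks, 20% in the middle, none in the output.
fractions = {"input_blocks.": 0.10, "middle_block.": 0.20, "output_blocks.": 0.0}
merged = block_merge(base, dpo, fractions)
```

    With these toy values the input entry moves 10% of the way toward the DPO value, the middle entry 20%, and the output entry is unchanged.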

    Original U-Net: https://huggingface.co/mhdang/dpo-sd1.5-text2image-v1

    bdsqlsz's release: https://huggingface.co/bdsqlsz/dpo-sd-text2image-v1-fp16

    bdsqlsz released the SDXL model here: https://civarchive.com/models/237681/dpo-sdxl-fp16, but us poor 1.5 users were left in the dark ages.

    I had to do some hacking to get the fp32 version, so you will have to bring your own VAE.

    Diffusion Model Alignment Using Direct Preference Optimization

    Direct Preference Optimization (DPO) for text-to-image diffusion models is a method for aligning diffusion models with human preferences by optimizing directly on human comparison data. For details, see the paper Diffusion Model Alignment Using Direct Preference Optimization.

    The SD1.5 model is fine-tuned from stable-diffusion-v1-5 on the offline human preference dataset pickapic_v2.

    The SDXL model is fine-tuned from stable-diffusion-xl-base-1.0 on the offline human preference dataset pickapic_v2.

    Comments (11)

    513820
    Dec 23, 2023
    CivitAI

    How old is this? Is it any good as an output model? I don't really make merges, but also, isn't the fp32 version better for that?

    pyn
    Author
    Dec 23, 2023

    It's less than a month old. Hunting around, I did find a larger model on HF. I've updated the card and I'll take a look at it, but its U-Net is a bit of a mess, so I might have to hack at it some.

    amazingbeauty
    Dec 26, 2023

    What does this model (DPO) actually do, in simple words?

    pyn
    Author
    Dec 26, 2023

    From reading the paper (https://arxiv.org/pdf/2305.18290.pdf): they generate some results, get people to say which ones they prefer, and use those preferences to fine-tune the model.

    pyn
    Author
    Jan 9, 2024

    Just to be clearer: the fine-tuning adjusts the weights directly so that the model becomes more likely to produce the preferred result and less likely to produce the unpreferred one, so it's not just SFT (supervised fine-tuning).
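
    To make the "more like the preferred one, less like the unpreferred one" idea concrete, here is a rough sketch of the pairwise DPO loss from the paper linked above, written on plain log-probabilities. The function names and `beta=0.1` are illustrative assumptions. The diffusion version of DPO replaces these log-probabilities with an estimate based on denoising error, which this sketch does not implement.

```python
import math


def neg_log_sigmoid(x):
    """Numerically stable -log(sigmoid(x))."""
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


def dpo_loss(logp_win, logp_lose, ref_logp_win, ref_logp_lose, beta=0.1):
    """Pairwise DPO loss for one (preferred, unpreferred) sample pair.

    logp_*     : log-probabilities under the model being fine-tuned
    ref_logp_* : log-probabilities under the frozen reference model
    beta       : strength of the pull toward the reference (illustrative value)
    """
    win_margin = logp_win - ref_logp_win      # gain on the preferred sample
    lose_margin = logp_lose - ref_logp_lose   # gain on the unpreferred sample
    return neg_log_sigmoid(beta * (win_margin - lose_margin))


# Model identical to the reference: no preference learned yet.
baseline = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# Model favors the preferred sample more than the reference does: lower loss.
improved = dpo_loss(-0.5, -2.0, -1.0, -1.0)
# Model favors the unpreferred sample instead: higher loss.
worse = dpo_loss(-2.0, -0.5, -1.0, -1.0)
```

    The loss only falls when the model raises the preferred sample relative to the reference while lowering the unpreferred one, which is why this differs from supervised fine-tuning on the preferred samples alone.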

    Olbanets
    Dec 30, 2023

    I'd like to try it as a part of my merges. Are there any tips/tricks/hints I should know before?

    pyn
    Author
    Dec 31, 2023· 1 reaction

    Make sure you get the fixed version; I noticed there was a problem with the first fp32 I uploaded. Then use a weighted block merge. Less is more. I start with 0.9 for the input layer, 0.8 for the middle, and 1 for the output.

    @pyn is this the fixed version here?

    Olbanets
    Jan 3, 2024

    @omegablast20023899 it still has two extra keys, but it's usable
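
    One way to spot extra keys like these is to compare the key names of the checkpoint against a known-good SD1.5 checkpoint. This is a minimal sketch; `diff_keys` and `safetensors_keys` are hypothetical helper names, and the file-reading helper assumes the third-party `safetensors` package (and its PyTorch backend) is installed.

```python
def diff_keys(candidate_keys, reference_keys):
    """Return (extra, missing) key names of a candidate vs. a reference."""
    candidate, reference = set(candidate_keys), set(reference_keys)
    extra = sorted(candidate - reference)    # only in the candidate
    missing = sorted(reference - candidate)  # only in the reference
    return extra, missing


def safetensors_keys(path):
    """List tensor names in a .safetensors file without loading the tensors.

    Assumption: the `safetensors` package is available; it is imported
    lazily so that `diff_keys` can be used without it.
    """
    from safetensors import safe_open

    with safe_open(path, framework="pt") as f:
        return list(f.keys())


# Toy example with made-up key names.
extra, missing = diff_keys(
    ["unet.a", "unet.b", "stray.x", "stray.y"],
    ["unet.a", "unet.b"],
)
```

    On real files, the two key lists would come from calling `safetensors_keys` on each checkpoint path.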

    Hikarias
    Jan 13, 2024

    Can this be used to train LoRAs?

    pyn
    Author
    Jan 13, 2024

    You should train LoRAs on whatever model you plan on using the most. This model was trained from the original 1.5, so it could be used to train LoRAs that are meant to stay close to the base model.

    Checkpoint
    SD 1.5
    by pyn

    Details

    Downloads
    781
    Platform
    CivitAI
    Platform Status
    Available
    Created
    12/22/2023
    Updated
    6/1/2026
    Deleted
    -

    Files

    sd15DirectPreference_v10.safetensors
