Description
2023.12.20 Morning HelloWorld 3.1 DPO Version Update
This version is a minor upgrade of v3. I extracted a DPO LoRA from the SDXL_DPO fine-tuned model and blended it into HelloWorld 3.0 to optimize it. Compared with the original 3.0 release, the new version slightly improves character skin tone and overall attractiveness; the other changes fall into the realm of placebo-level "mystical" updates. If quantified, the overall improvement is around 3% (we don't have Intel's name, but we've caught Intel's incremental-update disease). It is recommended to keep either the 3.1 DPO version or the 3.0 version, not both.
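Blending an extracted LoRA into a checkpoint, as described above, amounts to adding a scaled low-rank delta to each base weight matrix. Below is a minimal sketch of that merge step using plain Python lists in place of real tensors; the function name, toy matrices, and scale value are illustrative, not taken from the actual release.

```python
def merge_lora(base_weight, lora_down, lora_up, scale=0.3):
    """Merge a LoRA delta into a base weight matrix:
    W' = W + scale * (up @ down)."""
    rows, cols = len(base_weight), len(base_weight[0])
    rank = len(lora_down)
    merged = [row[:] for row in base_weight]  # copy so the base stays intact
    for i in range(rows):
        for j in range(cols):
            delta = sum(lora_up[i][r] * lora_down[r][j] for r in range(rank))
            merged[i][j] += scale * delta
    return merged

# Toy 2x2 weight with a rank-1 LoRA
W = [[1.0, 0.0], [0.0, 1.0]]
down = [[1.0, 1.0]]      # rank x cols
up = [[1.0], [2.0]]      # rows x rank
print(merge_lora(W, down, up, scale=0.5))  # → [[1.5, 0.5], [1.0, 2.0]]
```

The `scale` parameter plays the same role as a LoRA weight in a UI merge: lower values mix in less of the DPO behavior, which is how a "slight" improvement like the one above is dialed in.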
Direct Preference Optimization (DPO) is a method for aligning text-to-image models with human preferences. The SDXL_DPO model fine-tunes the official SDXL base model with the Diffusion-DPO technique on the Pick-a-Pic dataset of 851K crowdsourced preference pairs. In human preference tests, the model wins roughly 70% of head-to-head comparisons against the official SDXL base model, demonstrating better visual appeal.
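The idea behind the Diffusion-DPO objective can be sketched numerically: for each winning/losing image pair, the loss is low when the fine-tuned model reduces its denoising error on the preferred image more than the frozen reference model does, relative to the rejected image. The scalar version below is a simplified illustration; the variable names and the beta value are assumptions for the sketch, and the real objective works on per-timestep noise-prediction errors.

```python
import math

def diffusion_dpo_loss(err_w, err_l, ref_err_w, ref_err_l, beta=1.0):
    """Simplified Diffusion-DPO objective for one preference pair.
    err_*     : squared noise-prediction errors of the model being trained
                on the winning (w) and losing (l) images
    ref_err_* : the same errors under the frozen reference model
    beta      : preference-strength coefficient (illustrative value here;
                the paper uses much larger, timestep-weighted values)"""
    inside = -beta * ((err_w - ref_err_w) - (err_l - ref_err_l))
    return -math.log(1.0 / (1.0 + math.exp(-inside)))  # -log sigmoid(inside)

# Improving on the winner relative to the reference lowers the loss
better = diffusion_dpo_loss(0.1, 0.5, 0.2, 0.4)   # model beats reference on winner
neutral = diffusion_dpo_loss(0.2, 0.4, 0.2, 0.4)  # no change vs. reference: log(2)
print(better < neutral)  # → True
```

Because the loss compares the trained model against the reference on both images, it steers generations toward preferred samples without needing an explicit reward model.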



