First time trying my hand at preference-optimization RL. This one was trained on 25 samples (I tried to get more, but picking them by hand is painful). It should have slightly improved text and aesthetics. I will implement proper DPO and make a new version later; for now this is the best I could do.
Works with flash as well.
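For readers wondering what "proper DPO" would involve, here is a minimal sketch of the standard DPO loss for a single hand-picked (chosen, rejected) pair. It is not the author's training code. The function name `dpo_loss`, its parameter names, and the default `beta=0.1` are assumptions made for illustration, and how the log-probabilities would be obtained for an image model is left out entirely.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) preference pair.

    All four inputs are log-probabilities: two from the model being
    trained (policy) and two from a frozen reference copy. The names
    and beta=0.1 are illustrative assumptions.
    """
    # How much more the policy favors each sample than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) computed as softplus(-logits) for stability.
    if logits > 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

The loss is log(2) when the policy and the reference agree, and it drops as the policy shifts probability toward the chosen sample relative to the rejected one. With only 25 pairs, averaging this over the whole set per step would be a natural starting point, though that is a guess rather than something stated above.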
Details
Downloads: 1
Platform: SeaArt
Platform Status: Available
Created: 6/22/2026
Updated: 6/22/2026
Deleted: -
