NOTE: This is an old model, my latest work is Ultra Maiden.
Comments (4)
Actually, the SD 1.5 UNet (fp32 for sure) has never been fully saturated. The same holds true for the SDXL (fp16) UNet. Yep, you would have to tag >14 billion images to do that. SDXL KOs at 4K. SD 1.5 blows out at 1440p to 2K-ish.
True, SD can do that high a resolution, but only if you train it on a lot of high-res images. If I started this project today I would do that, but as it is, much of the training for this model was done on measly 768 px images. So that's where it's most comfortable.
Anyway, it is what it is, and it's too late to change now. If I started over I would target XL or Flux instead; SD is getting left behind! I think I can still squeeze some more quality out of it, but there are limits, partly due to the architecture itself and partly due to the training that was done before. All training builds on what was learned thus far. Upgrading a model like this to handle 2K well would basically mean retraining it from scratch. Not gonna do that!
But I do wonder if it would be worth it to fine-tune it in full FP32. By default, generators cut FP32 models down to FP16 when loading them, so all my tuning decisions have been made in FP16. I just maintain the models in FP32 to keep unnecessary rounding errors from creeping in, but it might be a good idea to start loading them as FP32 during testing. Maybe slightly better results could be reached that way, at least for those who also load them as FP32 (which is basically no one!).
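To illustrate the rounding concern, here is a minimal sketch using only the Python standard library. It is not how any generator or trainer actually handles weights; the weight value and the update size are hypothetical numbers chosen for illustration, and `to_fp16` / `to_fp32` are helper names made up for this example. It shows that a small adjustment which survives in FP32 can be rounded away entirely in FP16.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision (FP32) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Hypothetical numbers: one weight and one small fine-tuning adjustment.
weight = 1.0
update = 1e-4

# Apply the update and store the result at each precision.
fp32_result = to_fp32(to_fp32(weight) + to_fp32(update))
fp16_result = to_fp16(to_fp16(weight) + to_fp16(update))

# Near 1.0, adjacent FP16 values are 2**-10 (about 0.00098) apart,
# so an update of 1e-4 is lost when the sum is rounded to FP16.
print("FP32 keeps the update:", fp32_result != weight)
print("FP16 loses the update:", fp16_result == weight)

# Rounding error when a single value is stored at each precision.
value = 0.1
print("FP16 error:", abs(to_fp16(value) - value))
print("FP32 error:", abs(to_fp32(value) - value))
```

This is the reason for keeping the master copy in FP32: each save at FP16 would discard adjustments smaller than the FP16 spacing, while an FP32 copy can still be cast down once at load time.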
@contrarian Most SD 1.5 models do just fine at nearly 2K with a tiled VAE. XL does even better. I enjoy 1.5 models for how extraordinarily well they do with ControlNets compared to XL. Flux does OK if you have the RAM for it. There are certain things that can only happen when an image is passed through a 1.5 model before a touch of Flux is applied. 1.5 is unique as it shares a CLIP with Flux. Using CLIP interrogators and conditioning deltas yields results that neither 1.5 nor Flux can reach otherwise. Somehow 1.5 models can represent concepts that neither XL nor Flux can grasp in some images, like geometry from old games. The same applies when transferring images from Flux to XL: a little 1.5 is perfect for those hard-to-reach places.
@SkoomaCatJesus Interesting. Maybe SD still has more relevance than I thought!
Personally, I just prefer the way it responds to prompts. It's bad at doing exactly what you specify, but it excels at drawing relevant associations from the prompt and cooking up something creative. SD is just more intuitive and artistic, and prompting it is more an art form than engineering. Flux does what you say, but the downside is that if you don't say much it won't do much either. And it only understands basic things well, and only in the most literal way possible, whereas SD has some vague understanding of just about anything you throw at it. Give SD the name of a country or a city and it will apply the cultural aspects it associates with that place in a creative and interesting way. Give it a girl's name, and it will have some idea of what personality and appearance a girl with that name might have. This randomly biased opinion about all things is a lot of fun to exploit! I don't see XL or Flux doing this to the same extent. You need to be more explicit with them, and then they just do what you explicitly told them to do. Boring!
Anyway, the more artistic nature of SD is the second most important reason I've stuck with it so far. The most important reason is of course that it's just so much better at making sexy maidens! XL and Flux were carefully designed to be SFW, and this inherent anti-sexiness bias has to be overcome by fine tunes, whereas SD was completely uncensored from the beginning, and was just trained on whatever they could scrape from the internet, boobs and all! Awesome!