✨ UWU_XL: V.2 ✨
(This model is crafted with passion, merging unique aesthetics to bring your visions to life. It's built to impress, designed for creators who demand the extraordinary.)
🌟 WHAT'S NEW IN V.2?🌟
+ Realism! (dpmpp_2m_sde_gpu seems to be best for realism)
+ Enhanced Img2Img Support!
+ Seamless Natural Language Integration!
+ More Custom Characters!
+ Unique Stylistic Horizons! (Euler_Ancestral can guide us towards stunning 2D+3D fusions)
+ Illustrious & Pony LoRA Support! (Work in Progress – some connections are stronger than others.)
⚠️ IMPORTANT!⚠️
This version fixes a lot of broken things in V1 but also suffers from keeping generations with very similar elements. If you don't change up how you prompt something, you'll tend to get very similar results (which can be good and bad).
💡OPTIMIZING YOUR GENERATIONS💡
CFG Guidance:
To counter the repetitive results, vary your CFG occasionally.
* 3-7 is generally the sweet spot.
* 3-4 for initial generations.
* 5-7 for upscaling or refiners.
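The CFG guidance above can be sketched as a small helper; the function name and stage labels are my own, not part of any real API:

```python
def recommended_cfg(stage: str) -> tuple[float, float]:
    """Return a (low, high) CFG range for a pipeline stage,
    per the guidance above: 3-4 for first passes, 5-7 for
    upscaling or refiner passes."""
    ranges = {
        "initial": (3.0, 4.0),  # first-pass generations
        "refine": (5.0, 7.0),   # upscaling or refiners
    }
    if stage not in ranges:
        raise ValueError(f"unknown stage: {stage!r}")
    return ranges[stage]

low, high = recommended_cfg("initial")
print(low, high)  # 3.0 4.0
```

In practice you would feed the chosen value into whatever CFG/guidance-scale field your UI exposes.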
Prompting Strategy:
Tags work wonderfully, but try phrasings outside the usual; the more unique, the better. During training, I used a combination of natural language and tags, applying three methods: AI-generated descriptions, traditional tagging, and describing scenes in my own words.
Structure your prompts for optimal results:
1. Start with your subject description.
2. Describe the scene or background.
3. Finally, apply any specific tags.
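The three-step structure above can be assembled like this; the helper name and example strings are my own, not from the model card:

```python
def build_prompt(subject: str, scene: str, tags: list[str]) -> str:
    """Join the pieces in the recommended order:
    subject first, then scene/background, then tags."""
    parts = [subject, scene, *tags]
    return ", ".join(p.strip() for p in parts if p.strip())

prompt = build_prompt(
    "a pastel-goth woman with silver jewelry",
    "leaning against a neon-lit wall at night",
    ["film grain", "low contrast", "candid photography"],
)
print(prompt)
```

Empty pieces are skipped, so the same helper works whether or not you add tags.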
Avoiding the same person/people:
1. Add an ethnicity.
2. Describe the face "X shaped eyes, narrow nose, etc."
3. Avoid vague descriptions like "handsome" or "pretty".
4. Negatives come in clutch here: try adding character or person names. This creates a strong negative bias, and the model will push toward different-looking people.
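The four steps above can be combined into a positive/negative prompt pair; this is a minimal sketch, and the function name and example values are hypothetical:

```python
def vary_face(base: str, ethnicity: str, face: str,
              negative_names: list[str]) -> tuple[str, str]:
    """Build (positive, negative) prompts that bias away from
    the model's default look: add an ethnicity and concrete
    facial features to the positive, and push known names
    into the negative."""
    positive = ", ".join([f"{ethnicity} {base}", face])
    negative = ", ".join(negative_names)
    return positive, negative

pos, neg = vary_face(
    "woman in a leather jacket",
    "Korean",
    "almond-shaped eyes, narrow nose, freckles",
    ["Emma Watson", "Hatsune Miku"],
)
print(pos)
print(neg)
```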
😈 CONTENT & CAPABILITIES 😈
* This model can and will generate nudity. If you don't specify the subject's attire, they will default to wearing nothing.
* Explicit scenes are something this model does struggle with; it works sometimes but not reliably.
* Solo subjects work really well, duo encounters are iffy.
* I strongly encourage using your favorite LoRAs to depict any acts you wish to see. (However, if you're shooting for realism, things might not go as expected; if you use a stylized LoRA, stick to a stylized prompt, and vice versa.)
🧬 MODEL DNA & TRAINING (UWU_XL's Core) 🧬
This model is named UWU_XL because it was heavily trained on a diverse dataset of goth variants (pastel, cyber, etc.) and egirl aesthetics, among others.
The dataset contains:
* Publicly available images.
* My personal generations.
* My own photographs.
* My own art.
This model also features base model merges of SDXL, Pony, and Illustrious, leveraging their unique tags and dataset dimensions for enhanced results.
💖 SUGGESTIONS & FEEDBACK💖
If you'd like to suggest styles or datasets, please message me, and I will try to add features or subjects to the dataset.
✨ V.1 (The Foundation)✨
My first XL checkpoint, used for all my initial generations. When paired with my LoRAs at low strengths, it generally yields something resembling my stylistic examples.
It can do both realistic and semi-realistic generations, but by default might yield Pony-like results (it was trained on real and animated images). Due to this, some tags will lean certain directions. Natural language and tags both seem to work well in most cases.
V.1 Realism Tags (Still Useful for V.2!)
in the year 1995, analog film, kodak snapshot, candid photography, high resolution, 100mp, filmic, natural face, proportional facial features, imperfect skin, low contrast, modern color grade, aged, film grain, unique angle, instagram, profile pic, screen grab, captured in 4k, photo of (insert subject), (fake name of a person), technicolor, digital cinema, movie-like
Comments
So the block on UK users using CivitAI will come into effect today, so I thought I'd get some more images posted before I'm unable to sign in any more.
I've enjoyed using your model. It's produced some great results with little to no refinement needed. Wish I was able to give you some buzz, but I doubt I'll be able to buy any even if I am able to find a way to still sign in. This Online Safety Act sucks.
I'm glad you like it! That really means a lot to me! I'm working on a new helper LoRa that should help with clarity and styles. It should also help with prompt adherence and getting people to do really specific things without the need for any controlnet models etc. If you end up being able to get back on hopefully it can make the model even more enjoyable to use!
If I don't end up hearing from you again, just know I really appreciate the feedback!
Uwu v1 really has a specific use-case where it excels: you can use it as what I'd call a "big refiner" (like 50%) on initial generation (or as some sort of 2nd, pre-upscale pass in Comfy) and it adds a LOT of good stuff to an image. Alone in generation it's not quite so spectacular, and it just straight up doesn't work as a detailer or in a hires upscale pass. But that one niche function it excels in is REALLY awesome.
v1 has its quirks for sure; you can use it in a lot of different ways, but overall I wasn't happy with its adherence. It more or less got me in the ballpark of the style I wanted. My recent version was much closer, and the version I currently use (unreleased) performs much better across the board. I'm working out some kinks, but my next release should be capable of much more. Hopefully it'll be one you use over some of the more popular base models!
@Destinyfaux I hope you continue working on this series, it is certainly unique. Another spot where I've found both v1 and v2.4 excel is the body shape of subjects when the image is in a portrait orientation. Most SDXL models produce subjects that are super-tall or super-lanky or have a stretched torso, but your model here tends to pull back toward a more natural, shorter, almost stockier build. Many Pony or Illustrious models can do that (think of the danbooru tag "shortstack"), but it's pretty rare to find a base SDXL checkpoint that can produce vertically-challenged subjects.
@shapeshifter83 I have newer versions (v2.5 and v2.8) that I am generally pretty happy with and cycle between quite often. However, I've lost some styles, particularly anime-style images; they require heavy-duty negatives and lots of prompting to even get close to an anime look. The goal is a generalist model that does Illustrious + Pony styles, photorealism, and anime. I train almost exclusively on portrait resolutions. I am slowly getting anime to prompt in much easier, and am also focusing on widescreen resolutions.
I am currently creating a z-image lora based on the images from my UwU model and creating a new synthetic dataset to stay competitive with the z-image model.
I've had to retag all of my images with a more manual approach to try to retain styles. I've added anime_girl, illustrious_girl, real_girl (and vice versa), along with anime_background, illustrious_background, etc., plus some style tags: "in the style of pony sdxl", "in the style of illustrious sdxl", and now "in the style of z_image".
Additionally, I've added tons of NSFW positions and concepts, although right now the models tend to lean heavily into them and sometimes get confused.
I am separating the text encoders and training things separately now which requires a new workflow. Instead of adding the same prompt into G and L, you get cleaner results by prompting separately. Basic concepts into L and actions + complex prompts into G.
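The split-encoder workflow described here could look like this in sketch form; the dictionary keys and function name are my own, and only the L-vs-G routing mirrors the comment:

```python
def split_sdxl_prompt(basic_concepts: list[str],
                      complex_actions: list[str]) -> dict[str, str]:
    """Route basic concepts to CLIP-L and actions/complex
    phrasing to CLIP-G, instead of feeding the same prompt
    to both SDXL text encoders."""
    return {
        "clip_l": ", ".join(basic_concepts),
        "clip_g": ", ".join(complex_actions),
    }

cond = split_sdxl_prompt(
    ["1girl", "goth", "night city"],
    ["leaning over a railing while laughing at the camera"],
)
print(cond["clip_l"])
print(cond["clip_g"])
```

In ComfyUI these two strings would go into the separate G and L text fields of an SDXL dual-prompt encode node rather than one shared prompt box.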
@Destinyfaux yea when I use comfy and I'm trying to be super serious, I use the two prompt method with SDXL. I'm surprised the separate prompting of the two layers wasn't the default method in SDXL to begin with. It just gives so much more control and most people don't even know it's a thing.
I'm waiting with the z-image to see how big the non-distilled model is, but it'll probably be beyond my VRAM capability. I've got the z-image turbo stuff downloaded but haven't touched it yet. Pretty sure my 16GB VRAM can handle the turbo model, at least.
But anyway, most of the time lately I'm just casually playing around with SDXL using ForgeUI, and your UwU models are both in my rotation. They really are quite unique. Are you planning to upload these v2.5 and v2.8 models?
@shapeshifter83 I've been thinking of uploading my more recent versions, but I've been chasing a vision I haven't quite captured yet. I'm waiting to finish my v3 since it'll be a big jump in quality and adherence to prompts. I'll probably upload my Z-Image Turbo LoRa first in the meantime. The plan is to use my new LoRa to make new data to train my SDXL model on. That way I can re-train the text encoder 1 with natural language and then train the other text encoder on tags.
Yeah idk why really either, you get way better results. I guess it's just more manual and involved.
For sure. I have an RTX 3080 12GB card and it runs fine; I also have an NVIDIA Quadro P1000 4GB card that I use to load ControlNet models, upscalers, and text encoders. That lets me throw more stuff into my workflow without running into OOM errors.
When I upload v3 I'll send you a PM. In the meantime, hopefully my Z-Image LoRa can hold you over and let you mess around with a new model! If you like my UwU models, you can expect similar results from this new LoRa. I'm 9k steps in out of 10k. The dataset is about 3k images (the same one I used to train UwU).