I try to pin images that stand out or that I personally like every 1-3 weeks.
So if I unpin yours, it doesn't mean that I don't like it anymore; I'm just trying to keep a fresh rotation of images without completely filling the start page with pins <3
Join the Discord for questions / sharing pictures :) : https://discord.gg/gcJqAKQ5Af
If you enjoy my models and want to support me, you can now do so on Ko-fi
(since my Patreon got disabled about a year ago and Buy Me a Coffee banned me in 4.2026)
🛠️ Recommended Settings:
Resolution: best results with 832x1216 / 960x1440 / 1024x1024 / 1024x1536
CFG Scale: 5–8
Sampling Method: Euler a or DPM++ 2M Karras
Steps: 20–50
Description
Full circle back to the style of v3 with some small improvements.
Comments (20)
V5 was probably my favorite SD model. V6 doesn't seem to deliver nearly as consistent results, and the generations seem to take 5x-10x as long. Maybe I'm doing something wrong, but I'm using the same prompt for V5 and V6: V5 looked great and took my PC 1 minute to generate, whereas V6 was blurry, lacking a lot of detail and lighting, and took 9 minutes.
there is 100% something wrong if it takes u 9 minutes
How is this even possible if you use the same generator... V6 and V5 both use the identical SDXL architecture
@Desync That's what I was thinking, but I would take the prompt from v5, randomize the seed, use the same steps and sampler, and it would still take so much longer. I realize it's probably something on my end, I'm just not sure what it might be.
I feel v4 was the peak: the generation with the sharpest image quality and the best detail.
I wanted a model that would make nice images of Taimanin Ingrid. This one works for that end. Still doing some tweaking mostly for my tastes, but results are promising.
using this, and for some reason, no matter what i put in the positive or negative prompt, it's always generating two characters / people and I can't get it to stop doing that?
Can u show an example? i havent had that issue a single time
It does the same for me
V6 is really amazing
After playing with V.5 I can really say it's the new peak for me, even though I wasn't sure at first. Hope you can still keep it activated if you can please :D
Already voted. Since there are 300 spots in the auction again, I will keep v3 and up always active for generations, if I don't mess it up again ^^
Brilliant. makes it much easier without the need of so many negative prompts
Which one of your checkpoints would you recommend for generating Western‑style comic art? V5 looks promising at first glance, but maybe you have a better suggestion based on your own experience using your checkpoints.
Also, I was looking through your prompts and something caught me off guard. You’re using a lot of tags that aren’t Danbooru tags, like “realistic lighting,” for example. Are these just random tags you picked up somewhere and decided to try out?
I’m asking because if those tags aren’t part of the checkpoint’s training data, then they’re only affecting the token weighting in the prompt. And since your checkpoints aren’t trained ones but merges, that makes things even stranger. If that’s the case, any changes in the generated image wouldn’t necessarily come from the clip/checkpoint actually understanding those tags, but simply from the shift in token weights caused by adding them.
Sorry for the text wall but i like to understand what is behind checkpoints that i use. 👍
Hey, i would prolly suggest v3 / v6. i merge my models with loras i make, that's where the tags i use come from :)
tl;dr: _ isn’t literally ignored, but its contribution is tiny compared to real words. in practice the CLIP encoder treats "realistic_lighting" almost the same as "realistic lighting," and any image differences are more likely sampling noise or UNet bias than a real change in meaning.
mmm text walls. i'll bite.
you're making some incorrect assumptions about how a lot of this works.

first off, just because it's a merge doesn't tell you how the text encoder was changed. you can merge anywhere from 0-100% and do all sorts of silly math stuff. this can let a model keep its prompting but change details, like overfitting an aesthetic style.

second, just because it's a danbooru tag doesn't mean it's tokenized and then encoded solely as the danbooru tag. for example, realistic_lighting can still be broken out into real + istic + _ + light + ing. dropping the "_" will still land you in the same vector space because the semantic difference between _ and " " can be basically nothing. however, it helps to use the "more accurate" tag since it leaves less room for ambiguity if you're going for a niche tag. the split is still useful because it lets the model learn "realistic" and "lighting" as separate concepts even when they are presented together with a _ between them, so long as there is other training data with other "realistic *" and "* lighting" tags, regardless of whether _ or a space is used.

in essence, CLIP is not a lookup table and underscores aren't magic.
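To make the "a merge doesn't tell you how the text encoder was changed" point concrete, here is a minimal toy sketch of a weighted merge with a separate ratio per component. Everything in it is an assumption for illustration: the checkpoints are plain dicts of short float lists, the key prefixes are stand-ins for how a checkpoint might group text-encoder and UNet weights, and merge_checkpoints is a hypothetical helper, not a function from any merge tool.

```python
# Toy weighted merge: each "checkpoint" is a dict of name -> list of floats.
# The key prefixes below are assumptions for illustration only.
TEXT_ENCODER_PREFIX = "conditioner.embedders."
UNET_PREFIX = "model.diffusion_model."


def merge_checkpoints(base, other, unet_ratio, text_encoder_ratio):
    """Blend `other` into `base` with a separate ratio per component.

    A ratio of 0.0 keeps the base weights, 1.0 takes the other model's.
    """
    merged = {}
    for name, base_weights in base.items():
        if name.startswith(TEXT_ENCODER_PREFIX):
            ratio = text_encoder_ratio
        elif name.startswith(UNET_PREFIX):
            ratio = unet_ratio
        else:
            ratio = 0.0  # leave anything else untouched
        other_weights = other[name]
        merged[name] = [
            (1.0 - ratio) * b + ratio * o
            for b, o in zip(base_weights, other_weights)
        ]
    return merged


# Made-up two-element "tensors" so the arithmetic is easy to follow.
base = {
    "conditioner.embedders.0.weight": [1.0, 1.0],
    "model.diffusion_model.weight": [1.0, 1.0],
}
style = {
    "conditioner.embedders.0.weight": [3.0, 3.0],
    "model.diffusion_model.weight": [3.0, 3.0],
}

# Keep the base text encoder (prompting behaves the same) but pull the
# UNet halfway toward the style model.
merged = merge_checkpoints(base, style, unet_ratio=0.5, text_encoder_ratio=0.0)
print(merged)
```

With these ratios the text-encoder entry stays at the base values while the UNet entry moves halfway, which is one way a merge can change the look without changing how prompts are read.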
i guess im really bored because i made examples. i tried to use deterministic/convergent methods to demonstrate (same seed, DDIM, Karras).
top left A - underscore tag syntax
top right B – space-separated tags
bottom left C – hybrid descriptive phrases
bottom right D – mostly natural prose
aside from some minor differences, they are basically the same image despite the vastly different prompt styles.
Illustrious didn't start from scratch, so ~90% of the work was already done in the SDXL base model, and then the fine tune further improves the model's semantic understanding of Danbooru2023 tags. the end result? they basically taught the model that _ is almost the same as " ".
the tokenizer actually lives outside the model. example: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/raw/main/tokenizer/vocab.json
in A1111/Forge, you can run the script from "backend\huggingface\stabilityai\stable-diffusion-xl-base-1.0" to see how a prompt gets tokenized. i had chatgpt write a simple script on the prompt used above:
B:\Stable Diffusion WebUI Forge\backend\huggingface\stabilityai\stable-diffusion-xl-base-1.0>python tokenize_test.py "darling_in_the_franxx, zero_two_\(darling_in_the_franxx\), solo, 1girl, sitting crossed_legs on empty locker_room bench, looking_at_viewer, dutch_angle, dynamic_pose, Franxxsuit, red bodysuit, pilot_suit, skin_tight, z3r0tw0, pink_hair, straight_hair, horns, white_hairband, bangs, very_long_hair, green_eyes, subtle smirk, horns"
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
============================================================
Tokenizer 1 – CLIP-L
------------------------------------------------------------
INPUT : darling_in_the_franxx, zero_two_(darling_in_the_franxx), solo, 1girl, sitting crossed_legs on empty locker_room bench, looking_at_viewer, dutch_angle, dynamic_pose, Franxxsuit, red bodysuit, pilot_suit, skin_tight, z3r0tw0, pink_hair, straight_hair, horns, white_hairband, bangs, very_long_hair, green_eyes, subtle smirk, horns
TOKENS: ['darling', '_', 'in', '_', 'the', '_', 'fran', 'xx', ',', 'zero', '_', 'two', '_(', 'darling', '_', 'in', '_', 'the', '_', 'fran', 'xx', '),', 'solo', ',', '1', 'girl', ',', 'sitting', 'crossed', '_', 'legs', 'on', 'empty', 'locker', '_', 'room', 'bench', ',', 'looking', '_', 'at', '_', 'viewer', ',', 'dutch', '_', 'angle', ',', 'dynamic', '_', 'pose', ',', 'fran', 'xx', 'suit', ',', 'red', 'body', 'suit', ',', 'pilot', '_', 'suit', ',', 'skin', '_', 'tight', ',', 'z', '3', 'r', '0', 'tw', '0', ',', 'pink', '_', 'hair', ',', 'straight', '_', 'hair', ',', 'horns', ',', 'white', '_', 'hair', 'band', ',', 'bangs', ',', 'very', '_', 'long', '_', 'hair', ',', 'green', '_', 'eyes', ',', 'subtle', 'smir', 'k', ',', 'horns']
COUNT : 107
============================================================
Tokenizer 2 – OpenCLIP-G
------------------------------------------------------------
INPUT : darling_in_the_franxx, zero_two_(darling_in_the_franxx), solo, 1girl, sitting crossed_legs on empty locker_room bench, looking_at_viewer, dutch_angle, dynamic_pose, Franxxsuit, red bodysuit, pilot_suit, skin_tight, z3r0tw0, pink_hair, straight_hair, horns, white_hairband, bangs, very_long_hair, green_eyes, subtle smirk, horns
TOKENS: ['darling', '_', 'in', '_', 'the', '_', 'fran', 'xx', ',', 'zero', '_', 'two', '_(', 'darling', '_', 'in', '_', 'the', '_', 'fran', 'xx', '),', 'solo', ',', '1', 'girl', ',', 'sitting', 'crossed', '_', 'legs', 'on', 'empty', 'locker', '_', 'room', 'bench', ',', 'looking', '_', 'at', '_', 'viewer', ',', 'dutch', '_', 'angle', ',', 'dynamic', '_', 'pose', ',', 'fran', 'xx', 'suit', ',', 'red', 'body', 'suit', ',', 'pilot', '_', 'suit', ',', 'skin', '_', 'tight', ',', 'z', '3', 'r', '0', 'tw', '0', ',', 'pink', '_', 'hair', ',', 'straight', '_', 'hair', ',', 'horns', ',', 'white', '_', 'hair', 'band', ',', 'bangs', ',', 'very', '_', 'long', '_', 'hair', ',', 'green', '_', 'eyes', ',', 'subtle', 'smir', 'k', ',', 'horns']
COUNT : 107

my script has issues but it demonstrates how it breaks up the tokens. basically it’s stuffing a bunch of individual pieces between regular words, which ends up acting almost the same as a space. i'm still bored so let's adjust a bit.
valid nuance my script missed: it’s actually more complex because the tokenizer also tracks </w> to mark word boundaries. that means there is technically a difference between darling</w> (a finished word) and darling (which could be part of darlings or darlingly etc). this is kind of like how most LLM tokenizers don’t treat space as its own token. the space gets glued to the end of the previous word like "word " instead.
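A rough way to see the word-boundary idea is a simplified stand-in for the splitting step that happens before the BPE merges. This sketch is an assumption-heavy approximation: it only splits on runs of ASCII letters, single digits, and runs of other non-space characters, then appends </w> to each piece. It does not load the real vocab or apply any merges, and pre_tokenize is a hypothetical helper name.

```python
import re

# Simplified stand-in for the splitting step: runs of letters, single
# digits, and runs of any other non-space characters. The real tokenizer
# then applies BPE merges from its vocab; that step is not reproduced here.
PIECE_PATTERN = re.compile(r"[a-z]+|[0-9]|[^\sa-z0-9]+")


def pre_tokenize(prompt):
    """Lowercase, split into pieces, and mark the end of each piece with </w>."""
    return [piece + "</w>" for piece in PIECE_PATTERN.findall(prompt.lower())]


underscore = pre_tokenize("realistic_lighting")
spaced = pre_tokenize("realistic lighting")

print(underscore)
print(spaced)
```

In this approximation the only difference between the two prompts is one extra underscore piece sitting between the same two word pieces, which lines up with the token printout above.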
to see if that nuance actually matters in the real model and not just in token debug land, i compared the embeddings directly instead of guessing from the token printout. using the SDXL CLIP encoder, cosine similarity came out like this:
"realistic lighting" vs "realistic_lighting" → 0.95
"cat dog" vs "cat_dog" → 0.93
"cat" vs "dog" → 0.79
"cat" vs "airplane" → 0.32

so the underscore versions are way closer to their space equivalents than even two related animals are to each other. token-level checks also show the underscore token has very low semantic weight, more like punctuation than an actual concept.
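For anyone unfamiliar with the metric, here is a minimal sketch of how a cosine similarity is computed. The three vectors are made-up 3-d stand-ins, not real CLIP embeddings, so the printed numbers will not match the figures above; getting the actual embeddings requires running the text encoder, which this sketch does not do.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for prompt embeddings.
spaced = [1.0, 2.0, 2.0]
underscored = [1.0, 2.0, 1.5]   # points in nearly the same direction
unrelated = [2.0, -1.0, 0.0]    # perpendicular to `spaced`

print(cosine_similarity(spaced, underscored))
print(cosine_similarity(spaced, unrelated))
```

A value near 1 means the two vectors point almost the same way, and a value near 0 means they are close to unrelated, which is how the numbers above should be read.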
more nerd shit if you really care specific to illustrious and SDXL (specifically page 4 under 2.3.2 Text Encoder): https://arxiv.org/pdf/2409.19946
Which VAE is best to use for this model?
it has vae baked in dont rly need one
Why are the images generated by v6 always colorful, no matter what prompts I use?
too much cfg.
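A small sketch of why the CFG value matters, using the standard classifier-free guidance formula (unconditional prediction plus scale times the difference between conditional and unconditional). The two-element lists are made-up numbers standing in for noise predictions, and apply_cfg is a hypothetical helper name, not a function from any UI.

```python
def apply_cfg(uncond, cond, cfg_scale):
    """Classifier-free guidance: push the prediction away from the
    unconditional result and toward the prompt-conditioned one."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# Made-up stand-ins for the two noise predictions at one sampling step.
uncond = [0.10, 0.20]
cond = [0.30, 0.10]

for scale in (1.0, 7.0, 15.0):
    print(scale, apply_cfg(uncond, cond, scale))
```

The gap between the two predictions is multiplied by the scale, so a high CFG exaggerates whatever the prompt pushes toward. Dropping into the recommended 5–8 range is a reasonable first thing to try.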

