This is an experimental checkpoint that combines some of the best realistic Illustrious models with bigASP 2.5. It worked.
BTW, v2.4 is now available here! Give it a try!
❤️ If you like Snakebite, you can help offset the cost of training.
Why it matters
bigASP has great prompt adherence, but it's wildly inconsistent when it comes to style and composition. It feels like a base model with untapped potential. Dialing in the right settings is like trying to solve a Rubik's Cube.
Illustrious models, on the other hand, lose many concepts when going 3D. They're hit-or-miss even with fairly popular booru tags. But the lighting and composition of these weights are still 👌
I wanted to see if we could get the best of both worlds, and it turns out we kinda can! Careful block merging is the key. We can inject bigASP's output_blocks.0 to acquire much of its conceptual knowledge. Adding its middle_block.2 seems to reduce anatomical issues (otherwise, you'll get a lot of extra arms and fingers).
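If you want to try this kind of merge yourself, the core of it is just a key-wise copy between state dicts. A minimal sketch in Python - the file names and the 100% injection ratio are illustrative, not the exact recipe:

from safetensors.torch import load_file, save_file

il = load_file("illustrious_realistic.safetensors")  # IL-based realistic merge (illustrative name)
asp = load_file("bigasp_v25.safetensors")            # bigASP 2.5 (illustrative name)

merged = dict(il)
for key, tensor in asp.items():
    # Inject bigASP's output_blocks.0 (concepts) and middle_block.2 (anatomy).
    if ".output_blocks.0." in key or ".middle_block.2." in key:
        if key in merged and merged[key].shape == tensor.shape:
            merged[key] = tensor  # full injection; a weighted blend also works

save_file(merged, "snakebite_sketch.safetensors")

ComfyUI's per-block merge nodes will give you the same result without scripting.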
First impressions
The best prompt adherence I've ever seen from a realistic SDXL model. And no, not just for smut. (But especially for smut.)
Compatible with booru tags as well as natural language. I think a mixed prompt approach is best: I use about 70% tags (no underscores) and 30% natural language in a prompt (see the example after this list).
Understands bigASP's style tags to an extent, including stuff like masterpiece quality and 35mm.
It *feels* like something new and worth exploring. Don't sleep on bigASP!
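For example, a mixed prompt in that 70/30 spirit might look like this (purely illustrative, not from my test suite):

masterpiece quality, 35mm, 1girl, sundress, city street, golden hour, looking at viewer, shallow depth of field. She pauses at the crosswalk while the evening light catches her hair.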
Drawbacks
Illustrious 2.0 models support resolutions up to 1024x1440 or 1024x1536 without horrific stretching of anatomy (e.g. longcat-type torsos), but bigASP's optimal resolution is only 832x1216... and I don't recommend going above that in Snakebite. If you do, the anatomy will be mostly okay (which is surprising), but image composition becomes very odd and unpleasant.
Since we're in a strange new latent space, your existing LoRAs won't work very well. But they're worth retraining.
Recommended Settings
The Turbo variant is better for inference. It's super fast and has slightly improved aesthetics. The non-turbo version is useful for finetuning, and it can produce nice textures if you don't mind waiting 25+ steps.
Turbo
8 or 9 steps
LCM sampler
CFG 1
Custom sigmas below, or simple
Full
20 to 28 steps
Euler ancestral sampler
CFG 3 to 4
Custom sigmas below, or simple
Custom sigma curve (you can use comfyui-kjnodes to apply it):
15, 8, 4, 2, 2, 1, 0.4, 0.2, 0
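Outside ComfyUI, any sampler that accepts raw sigmas can take the same curve. A rough k-diffusion sketch - the denoiser wrapper for your checkpoint is assumed here, not shown:

import torch
from k_diffusion.sampling import sample_euler_ancestral

sigmas = torch.tensor([15.0, 8.0, 4.0, 2.0, 2.0, 1.0, 0.4, 0.2, 0.0])

def sample(denoiser):  # denoiser: a k-diffusion CompVisDenoiser-style wrapper
    # 832x1216 image -> (1, 4, 152, 104) latent, scaled to the first sigma
    x = torch.randn(1, 4, 152, 104) * sigmas[0]
    return sample_euler_ancestral(denoiser, x, sigmas)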
If you're getting mangled limbs, you can often salvage the image by adjusting the first few values of your sigma curve. Here's one that is more stable for certain prompts:
14, 5, 2, 2, 2, 1, 0.4, 0.2, 0
If you're still getting body horror, you can try the following quality tags (you'll need ComfyUI-ppm to apply negative weights to your positive prompt):
masterpiece quality, realistic photo, (worst quality,:-1) (mutated,:-1)
Snakebite is very responsive to stylistic terms, especially by IL standards. Keep the extra "fluff" to a minimum - almost every token I've tried has a significant impact on the picture.
Finally, I suggest trying the CLIPAttentionMultiply node. If you boost the q and v parameters, it will effectively cause your image to become more "Illustrious-like": cleaner, more stable, but less realistic and (usually) less adherent to the prompt. Set both values to 3 for a very clean image.
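Outside ComfyUI, you can approximate what that node does by scaling the q/v projection weights of the text encoder's attention layers. A minimal sketch against HF transformers' CLIPTextModel (the function name is mine, not a real API):

import torch

def boost_clip_qv(text_encoder, q=3.0, v=3.0):
    # Scales query/value projections in every self-attention layer,
    # mimicking CLIPAttentionMultiply's q/v knobs (set both to 3 for
    # the "very clean" look). Modifies weights in place - keep a copy.
    with torch.no_grad():
        for layer in text_encoder.text_model.encoder.layers:
            layer.self_attn.q_proj.weight.mul_(q)
            layer.self_attn.q_proj.bias.mul_(q)
            layer.self_attn.v_proj.weight.mul_(v)
            layer.self_attn.v_proj.bias.mul_(v)

Note that SDXL has two text encoders, so apply it to both (e.g. pipe.text_encoder and pipe.text_encoder_2) for the full effect.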
BIGASP'S CLIP IS NOW IN PLAY!
In versions 1.3 and up, Snakebite includes a little of bigASP's CLIP, which means you can take advantage of more style prompts. Experiment with different terms to see what works. Personally, I keep it simple - this will usually improve your image without any side effects:
high quality, sharp focus
Which version is for me?
If you're wondering which version of the model to use, here's a TL;DR:
v1.4 = next-level realism, jaw-dropping textures, very stable, slightly less vibrant than previous versions and less capable of non-photographic images
v1.3 = good anatomy, good backgrounds, good coherence
v1.2 = best punchy colors
v1.1 = most influence from bigASP (excluding CLIP), dull colors, a failed experiment TBH
v1.0 = impressively creative but very unstable
If you like the model or use it for further finetuning, please let me know! I'd love to see the results. 💪
Description
Full version of 1.2 with acceleration techniques disabled. It still uses a little DPO to help resolve clashes between two very different latent worlds (bigASP and IL).
Useful for finetuning. Not recommended for inference. While it's true that accelerators like DMD2 harm a model's "creativity" to an extent, I find that the full version is honestly too creative for its own good - the accelerators improve the success rate of coherent images.
Recommended settings:
Sampler: Euler ancestral
Scheduler: Beta
Steps: 27
CFG: 3.5
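For diffusers users, those settings translate roughly to the following sketch. The file path is a placeholder, and use_beta_sigmas needs a fairly recent diffusers build - drop that flag if your version lacks it:

import torch
from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler

pipe = StableDiffusionXLPipeline.from_single_file(
    "snakebite_full.safetensors", torch_dtype=torch.float16  # placeholder path
).to("cuda")
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(
    pipe.scheduler.config, use_beta_sigmas=True  # Beta scheduler, if supported
)
image = pipe(
    "masterpiece quality, realistic photo, ...",  # your prompt here
    num_inference_steps=27,
    guidance_scale=3.5,
    width=832, height=1216,
).images[0]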
Comments (26)
Any way to run this on any of the A1111/Forge based UIs? Or will it require some tweaks that are only available in Comfy nodes?
I think it should work fine in A1111. No special loaders are required (although it helps if you can edit your sigma curve.) Are you getting an error in those apps?
@liftweights Did a couple tests in reForge, and getting very similar results to your example images, without trying to fiddle with the sigmas, or anything - so seems fine, after all. I recall doing a couple tests with bigASP v2.5 and getting garbage, but maybe I messed something up, then.
@gonanliek64 bigASP 2.5 was trained on "Flow Matching," and it does require a special node to use in ComfyUI (ModelSamplingSD3.) I'm not sure if bigASP is properly supported in other UIs. But Snakebite avoids this requirement due to the blocks I've selected to merge.
Can you share a workflow with us, for testing?
I like the output; it's creative by itself. However, it'll probably benefit from using a different model for upscaling and img2img refinement. My initial batch will be only this model, then I'll generate another using a different model for the img2img part.
Thanks for submitting so many great pictures! 🙂 These will help me figure out where Snakebite's weaknesses are. Looks like weapons are pretty scuffed. Then again, maybe that applies to every SDXL model. I'll have to check if vanilla bigASP renders them any better...
Very interesting model. I expected adding bigAsp into output_0 would hurt character knowledge from Illus, but you managed to preserve it. The CLIP seems to be mostly (entirely?) Illus? Quite possibly the most realistic Illus model I've seen. BigAsp always looked too intimidating to me, but this model is very coherent and easy to use!
Thanks! Yes, the CLIP is entirely Illustrious. It is definitely surprising that we can retain so many concepts/characters/perspectives from each architecture... especially considering one was trained on FlowMatch and the other was not.
BTW, you can swap out the CLIP with bigASP's if you want to push the realism even further. But it was less stable and less aesthetically pleasing in my tests. I've yet to find a CLIP merge that surpasses the IL CLIP alone.
@liftweights
TLDR: the noob clip...
Long story on how I came to the conclusion.
for a personal experiment, I wanted to train a stubborn realistic lora for illustrious checkpoints I get from civit.
I did not want to redo my dataset and thought of ways to make the model actually learn from base realistic noise. so I laid out all the base models I thought I could use.
IL, nai, Pony, SDXL1.0.
pony was a nonstarter for clip but superior for realistic loras (for whatever reason I can't explain, but it is for me)
IL and noob for CLIP - these as the base were bad at learning photos - understandable, their blocks are filled with anime anyway.
and SDXL - last resort for extra realistic noise.
illustrious clip worked well with pony and sdxl when I swapped the blocks: injected 100% of the clip and the first input and last output blocks from pony and sdxl
but it wasn't right. a lot of detail loss.
the loras worked with checkpoints like cyberillustrious but failed on things like wai. and the detail loss was too much.
then I thought, let's do noob. noob fared muuuuch better, and prompt adherence with the noob clip worked better. idk what the difference is, but it works. it's just that during training, you sometimes don't even need 10 epochs for a lora. 4-5 works. minimal detail loss. but it can overtrain quick, even with LR 0.0002/0.00002
noob clip merged with pony, sdxl, and believe it or not, illustrious, makes all the other models better - at least for lora training. since I'm not using base models for inference
suffice to say there's still things I noticed work better, like doing a ridiculous 1 epoch 800 step undertrain with the same dataset (which takes like 10-15 minutes anyway) on sdxl for realistic and animagine xl (not even noob or illustrious) for anime, then merging it with my noob clip pony lora - it works better on illustrious 2.0 based finetune anime checkpoints. beats me why. but it works.
@A_rdatyaksh_I did you just grab the clip from the base Noob epsilon or a specific checkpoint?
@nickname45 Epsilon noob base. thought it would be the cleanest to start merging up from, to avoid incompatibilities with the other sdxl-based unets
@liftweights I think 1 of the LoRAs incorporated in 1.4 might have been overtrained, one face seems to be kinda common. Resolution seems to have been improved tho
Can we please get non-turbo versions of newer checkpoints? DMD2 reduces the creativity of the model and, if needed, it can still be used as a lora.
EDIT: Also, something interesting: you can use a DMD2 lora on the turbo model at -1 strength to cancel that effect.
😧 COOOOLLL
*the Turbo with a negative strength like -1; then you may use dmd2 with strength 1 if needed - dmd2 isn't Turbo
"DMD2 reduces the creativity" Doesn't, just forces you to use LCM Karras, nothing else if it's compatible with a checkpoint
Sure thing - I'm preparing the full version of 1.2 now. Should be up in a couple hours.
For what it's worth, the Turbo version uses a few acceleration techniques injected into specific blocks, so the hit to creativity is not as bad as activating DMD2 at 100% strength. 🙂
@liftweights I tried it on a model I liked but was having difficulty with in terms of creative freedom; I had to use the lora at strength -0.9. If I'd known this sooner, I wouldn't have skipped so many cool-looking models.
ps: to whoever gave me a thumbs down, I was literally amazed - and I wish I knew this sooner. here it is back at you 👎
Acceleration does reduce creativity; it basically throws it out of the window. I used the tip from this thread to subtract the DMD2 LoRA from the checkpoint, and got infinitely more variety, specifically in faces, using regular DPM++ samplers instead of LCM. I then tried the same trick on the MoP checkpoint (only available as DMD2), and that also worked: much greater variability in styles and faces. So, DMD2 is... well, I don't really know why anyone would use it unless they are extremely GPU constrained.
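(For reference, applying a LoRA at negative strength just subtracts its weight delta from the base model: W' = W + s * (alpha/rank) * (up @ down), with s = -1. A minimal sketch of the core math - the key mapping between LoRA and UNet naming varies by trainer and is left out:)

import torch

def apply_lora_delta(weight, up, down, alpha, strength=-1.0):
    # strength=-1 cancels a baked-in LoRA; conv layers need a reshape first.
    rank = down.shape[0]
    delta = (up.float() @ down.float()) * (alpha / rank)
    return (weight.float() + strength * delta).to(weight.dtype)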
@civit77899 that's the same checkpoint I used. also, using dmd loras reduces the ability to prompt. your prompt has to be shorter - and here I am complaining about short prompts - it's unpredictable. especially with me: I have chants (lines of tags with weights, presaved, which give me what I want when I want - https://github.com/DominikDoom/a1111-sd-webui-tagcomplete). using even the lowest-weight chants gives overcooked results with dmd, even if cfg is like 0.75
@A_rdatyaksh_I That's because CFG=1 doesn't support weights and negs.
@civit77899 yes, but I remove my weights for cfg 1 (the lowest-weight chants are usually just word tokens). even still, adding compounded tags or extra context words overcooks
Remove the lowres dupe https://civitai.com/images/106372314

