Anime base model trained from SDXL-base on a dataset of 1.8M anime pictures. Cute, smart, flexible, yours!
Yes, this is a new SDXL anime base model
Outperforms every other non-Pony anime model in anatomy
Outperforms Pony and NAI3 in general knowledge and SFW
8k+ artist styles (wildcard), a few general styles out of the box
Full color palette, full brightness range, great base aesthetics
Knowledge from original SDXL, no lobotomy
Unique experience that you have been missing (probably)
Since I got some GPU hours and a decent dataset, it became interesting whether it's possible to train an anime model that has vast knowledge, especially of SFW/NSFW anime concepts, while preventing it from lobotomizing everything from SDXL like we've seen before in Pony and others. This checkpoint is the answer and proof of concept. It is quite experimental and a lot of things still need to be done or fixed, but it's already usable, fine in many ways, and has features missing in other open-source checkpoints.
Tofu has (almost) the same dataset as 4th tail, which allows it to generate popular characters, mimic artist styles, and recognize the majority of booru tags and concepts. All the same features, with natural-text mixed captioning and the same unique training techniques.
Small details like fingers are nice. Backgrounds with popular real-world locations (inherited from SDXL-base) or just pretty landscapes/cityscapes are available.
Posing and NSFW are okay, but do not expect it to be as good as Pony. Well, compared with vanilla Pony it's actually not much worse, though the best PD tunes/mixes are better. Still, Tofu surpasses anything else and should satisfy most. If you are looking for something spicier, use 4th tail; the transition is close to seamless. Styles look good, better than with the Pony base, and there are no issues or conflicts with a broken TE.
Yes, it can generate text, but performance is very weak, especially in comparison with SD3/FLUX, just like SDXL-base. At least it's something.
It is compatible with most SDXL loras and some loras for Animagine/other checkpoints, but it varies. Loras from Pony: no way. Some style or concept loras may work, but performance varies. Most importantly, ControlNet from SDXL works fine. Anytest (with suffix AM, not PD) also gives decent results.
Features and prompting:
Basic:
Same as for all SDXL: ~1 megapixel for txt2img, any aspect ratio with resolutions that are multiples of 64 (1024x1024, 1152x, 1216x832, ...). Euler a and CFG 4..9 (6-7 is best). Highres fix: any GAN/DAT upscaler, x1.5-1.6, denoise 0.5; upscaling works best with a single-tile resolution of no more than 3 Mpx. Highres fix and further upscaling will significantly improve quality, details, eyes, hands, feet, etc.
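The "~1 megapixel, every side a multiple of 64" rule can be sketched as a small helper. This is a hypothetical utility, not part of any generation tool; it just picks the nearest valid width/height pair for a desired aspect ratio:

```python
import math

def snap_resolution(aspect_ratio: float, target_mpx: float = 1.05) -> tuple[int, int]:
    """Pick a txt2img resolution near ~1 megapixel with both sides
    a multiple of 64. aspect_ratio = width / height, e.g. 1216 / 832."""
    target_px = target_mpx * 1_000_000
    # From w * h = target and w = ar * h, the ideal height is sqrt(target / ar).
    height = math.sqrt(target_px / aspect_ratio)
    width = height * aspect_ratio

    def snap(v: float) -> int:
        # Round each side to the nearest multiple of 64.
        return max(64, round(v / 64) * 64)

    return snap(width), snap(height)

print(snap_resolution(1.0))          # (1024, 1024)
print(snap_resolution(1216 / 832))   # (1216, 832)
```

Handy when scripting batches at arbitrary aspect ratios instead of memorizing the bucket list.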
Set Emphasis: No norm in the settings of your generation tool if you are getting strange blobs or distortion.
If LCM/PCM accelerators are applied, use Euler/Euler a samplers; DDIM gives a lot of mess and abominations.
No Clip Skip, just forget this meme.
Use an external SDXL VAE, like fp16-fix; the VAE baked into the model may be outdated.
Quality classification:
"masterpiece, best quality" for positive,
"low quality, worst quality" for negative. That's all.
No BS like score_x, source_x and the rest; don't put it in the prompt, or all you will get is that text rendered on the picture.
Negative prompt:
"(worst quality, low quality:1.1), error, bad hands, watermark, distorted". Adjust it according to your preferences; just keep it as clean as possible.
Do not put tags like greyscale, monochrome, yellow background in the negative; this is not Pony, and you will only get oversaturated, burned images.
To improve backgrounds, add to negative:
"simple background, blurry background, abstract background", but do not forget to remove it if you are prompting something that is meant to be simple.
Artist styles:
Grids with examples
Used with "by "; multiple artists give very interesting results and can be controlled with prompt weights.
by ARTISTNAME1, [by ARTISTNAME2, (by ARTISTNAME3:0.8), ...] and/or
[by ARTISTNAME1|by ARTISTNAME2|by ARTISTNAME3|...]
Works best at the very beginning of the prompt. Can be used as a wildcard. For the majority of artists, highres fix/upscale improves quality and recognizability a lot.
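For scripted generation, the two mixing patterns above can be sketched as small string builders. These are hypothetical helpers (the ARTISTNAME placeholders are the same as in the examples), not part of any tool:

```python
def artist_mix(artists):
    """Comma-joined mix. `artists` is a list of (name, weight-or-None) pairs;
    a weighted entry becomes '(by NAME:WEIGHT)', an unweighted one 'by NAME'."""
    parts = [
        f"(by {name}:{weight})" if weight is not None else f"by {name}"
        for name, weight in artists
    ]
    return ", ".join(parts)

def artist_alternate(names):
    """Per-step alternation syntax: [by A|by B|...]."""
    return "[" + "|".join(f"by {n}" for n in names) + "]"

print(artist_mix([("ARTISTNAME1", None), ("ARTISTNAME2", None), ("ARTISTNAME3", 0.8)]))
# by ARTISTNAME1, by ARTISTNAME2, (by ARTISTNAME3:0.8)
print(artist_alternate(["ARTISTNAME1", "ARTISTNAME2"]))
# [by ARTISTNAME1|by ARTISTNAME2]
```

Feeding either output into the start of the prompt, optionally from a shuffled artist list, reproduces the wildcard usage described above.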
General styles:
"2.5d, bold line, smooth shading, flat colors, minimalistic, cgi, digital painting, ink style, oil style, pastel style" can be used in combinations (with artists too), with weights, both in positive and negative prompts. More will be added in the future.
Natural text:
Use it in combination with booru tags (works great), use only natural text after the style and quality tags, or use just booru tags and forget about it; it's all up to you.
Unlike Pony, natural text is more functional here: IRL concepts, cars and mechanisms, other references all work. But don't expect it to be close to FLUX; size and architecture are incomparable.
Tail/Ears-related concepts:
Well, it works, kind of, but not as well as it should.
tail censor, holding own tail, hugging own tail, holding another's tail, tail grab, tail raised, tail down, ears down, hand on own ear, tail around own leg, tail around penis, tail through clothes, tail under clothes, lifted by tail, tail biting, ...
Brightness/contrast:
You can just prompt what you want with tags or natural text and it should work: dark night, dusk, bright sun, etc. Black/white background works, but often it gives not quite 0,0,0 or 255,255,255 as it should. Most of this is prompt-related; just check which pictures on the boorus are tagged with it.
Fortunately, using natural phrases like "cute girl in front of completely black background" fixes it. Anyway, you shouldn't meet any issues in general use; it works just like NAI3, often even better.
Known issues:
Struggles in complex poses and scenes, more training is needed
Biases may be present
Ciloranko is actually an opossum LMAO (error in one of the cherry-picked datasets)
To be discovered, WIP, very experimental, first of a kind, etc.
Requests for artists/characters in future models are open. If you find an artist/character/concept that performs weakly, is inaccurate, or has a strong watermark, please report it and I will add them explicitly. Follow for new versions.
Leave your feedback, it's very valuable and important.
License:
He he~
Since no horses were harmed, it's the same as in original SDXL. Derivatives, commercial use, whatever (there are some limitations, check the original text, and don't break the laws of your country). Just don't claim authorship of the base, it's very recognizable.
Thanks:
Artists who wish to remain anonymous, for sharing private works; Soviet Cat for GPU sponsoring; Sv1. for LLM access, captioning, code; K. for training code; Bakariso for datasets, testing, advice, insights; NeuroSenko for donations, testing, code; dga, Fi., ello for donations; and the other fellow brothers that helped. Love you so much ❤️.
And of course everyone who gave feedback and requests, it's really valuable.
Donate
AI is my hobby, I'm wasting money on it and not begging for donations. If you want to support - share my models, leave feedback, make a cute picture with kemonomimi-girl. And of course, support original artists.
However, your money will accelerate further training and research
(Just keep in mind that I can waste it on alcohol or cosplay girls)
BTC: bc1qwv83ggq8rvv07uk6dv4njs0j3yygj3aax4wg6c
ETH/USDT(e): 0x04C8a749F49aE8a56CB84cF0C99CD9E92eDB17db
If you can offer GPU time (A100+), PM me.
Comments (26)
Based
I know it's a new base you trained, but ArtiWaifu is already a good base; trained on it, this model could probably surpass Pony.
Well, ArtiWaifu is a nice model and I wish it becomes even better in the future. But the current version is quite unstable, struggles with anatomy, and has already forgotten some data from vanilla SDXL (or maybe it's all just because it was taken from the middle of the training process without smoothing; it's too early to judge at its current state).
And, of course, it's not polite to take away from the original author the opportunity to complete his work and get great results on his own. I can offer some assistance or collaboration if asked.
Based and funny)
What would be the differences between using 4th tail, or this model?
4th tail is trained from Pony Diffusion 6, Tofu from SDXL-base. Although they can be considered cousins, their behaviour and capabilities are different. 4th tail (like most Pony derivatives) shines in anatomy and NSFW yet is quite poor at complex SFW. Tofu is good for more general things and has better-working styles, while still having okay anatomy and NSFW knowledge.
@Minthybasis Ohhh, I seee, makes sense! I hadn't noticed the different base model! Thank you for the explanations!
Yo! I think the model shipped with the broken VAE (the one that gives weird scanline-like artifacting).
It should use standard SDXL VAE like any other model, could you please show the example?
Dm'ed you. Site of course went down right as i did lol.
Strong attempt, please keep it up!
I prefer this one over 4th Tail in spite of the anatomy, simply because the dataset seems less diluted by Pony (in that style and character tags seem stronger here), but this looks like it has potential, so I'm looking forward to version 2.
If you want some datasets as well let me know.
You just can't stop outdoing yourself, can you? For an SDXL anime finetune, it works very well; I wonder how much better it will get with more GPU hours. I've tried lots of SDXL models (base Pony, Kohaku-XL, autismmix, AnythingXL, AnyorangemixxlAnything, and many more), yet I always come back to your models; none come close: great to prompt without artist tags, great base art, tails work very well.
As a personal request, I would like to see:
- hair_on_horn (Shimanto and Aegir from Azur Lane are great examples)
- long_bangs + hair_between_eyes (Manhattan Cafe, Implacable/Kearsarge/L' Audacieux from Azur Lane)
- sidelocks_tied_back (Yang Guifei from F/GO, Springfield from Girls' Frontline)
Oh I see you're a man of culture~ Good concepts, will add them to direct list.
@Minthybasis Thanks a lot, and do keep your models coming. If you need someone to help you test epochs, you can count on me; I always do lots of gens per day with lots of different concepts, so I can help you test limits/broken things/etc. Again, excellent work, and if you could, I would like to see these new concepts on your 4th tail model. I have some other concepts I would like, but I am asking for things without even contributing at least a small dataset.
@Minthybasis Could you tell me if "dark-skinned_female" works correctly? It seems the model has a bias for light skin tones, I have to put "white_skin" on negative for it to work 90% of the time
@blackfuture82729 You are right, there is a noticeable bias here. It can generate proper ones if a character/style is prompted or with negatives, but with the simple tag, only a slightly darker tone. Probably this is because the tag averages over pictures like this and mixes with the base bright style. Will try to improve.
@Minthybasis Good! Again, bad base prompts coming from autotaggers, which are trained on danbooru images that already lack a proper tag for the different types of dark_skin, the same happened with braids until they fixed their tags for that. Thanks for reading my comments and improving your model!
Does your dataset consist of both uncensored and censored art? And was the model trained on those tags?
Yes, there are both censored and uncensored images with corresponding tags. By default it should generate uncensored art, but in rare cases with some artists/concepts censoring may appear. In such cases, just add the unwanted tag to the negative prompt. Better not to leave them in for permanent use, because they will add a significant bias to NSFW.
@Minthybasis I see. Thank you
As someone who's been using 5th Tail (yeah, not 4th) hoping for updates someday, this is really nice.
Pointing at a single artist with so many present seems a bit silly, but I've noticed that "kaedeko \(kaedelic\)" is biased towards their older art style, rather than the current one.
Not that I'd know what to do against that apart from manual intervention (and then it's not like the current state is wrong). Even Animagine-like year tag ranges probably wouldn't be able to steer it towards current in this case as older style still falls into the edge of the "current" range used there.
Art style seems to be somewhat random if nothing is specified (minor variance on seeds, but strong with different prompts), but with plenty to choose from I guess this isn't exactly an issue.
Since requests are open, here's a few artists that would be neat imo: inuarashi, konnyaku \(kk-monmon\), yukiu con, sl8-all, ponsuke \(pon00000\), fujisaka lyric, haga yui, hepari, unagiman, mankai kaika, nakta
As the current list covers many bases already, this is just a bunch I was wondering if they were on the list and found they weren't (konnyaku had a takedown request on danbooru, hope that's not your only source).
As for characters, fluffy ones seem well covered. Personally would've liked the Love Live cast to be recognized better, especially Nijigasaki (5th Tail does better here).
Otherwise I found it to perform very well (there's always limits ofc but works as described). Thanks and keep up the good work!
As for artist watermarks, so far I've noticed signatures appearing with "by unsfrau". They do have a faint signature on basically every one of their images, so it's not surprising.
"signature" and "artist name" is tagged on them on the boorus, but negative prompting that into oblivion doesn't seem to work.
Great, will add them to the dataset. Thanks for the feedback, it's appreciated.
eh getting scan lines and blobs in background, even with external vae and right settings/resolution. style and quality look great of the character, but that background shit just ruins it. bleh.
Could you share an example of bad gens?