Anima is a 2 billion parameter text-to-image model created via a collaboration between CircleStone Labs and Comfy Org. It is focused mainly on anime concepts, characters, and styles, but is also capable of generating a wide variety of other non-photorealistic content. The model is designed for making illustrations and artistic images, and will not work well at realism.
It is trained on several million anime images and about 800k non-anime artistic images. No synthetic data was used for training. The knowledge cut-off date for the anime training data is September 2025.
NEW: Try the Turbo LoRA for better stability and much faster generations.
Versions
Anima-Base
The pretrained, unrefined base model. Maximum flexibility, diversity, and style adherence.
Anima-Turbo
Coming soon.
Installing and running
Get the text encoder and VAE from the Hugging Face page.
The model is natively supported in ComfyUI. The model files go in their respective folders inside your model directory:
anima-base-v1.0.safetensors goes in ComfyUI/models/diffusion_models
qwen_3_06b_base.safetensors goes in ComfyUI/models/text_encoders
qwen_image_vae.safetensors goes in ComfyUI/models/vae (this is the Qwen-Image VAE, you might already have it)
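A quick way to confirm the three files landed in the right folders is a small check script. This is only a sketch: the file names and subfolders come from the list above, while the `ComfyUI` root path and the helper name `missing_model_files` are assumptions for illustration.

```python
from pathlib import Path

# File name -> subfolder under ComfyUI/models, as listed above.
EXPECTED_FILES = {
    "anima-base-v1.0.safetensors": "diffusion_models",
    "qwen_3_06b_base.safetensors": "text_encoders",
    "qwen_image_vae.safetensors": "vae",
}


def missing_model_files(comfy_root):
    """Return the expected paths that do not exist under comfy_root/models."""
    models_dir = Path(comfy_root) / "models"
    missing = []
    for filename, subfolder in EXPECTED_FILES.items():
        path = models_dir / subfolder / filename
        if not path.is_file():
            missing.append(path)
    return missing


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; point it at your own.
    for path in missing_model_files("ComfyUI"):
        print(f"missing: {path}")
```

If the script prints nothing, all three files are where ComfyUI expects them.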
Generation settings
Works at resolutions between 512^2 and 1536^2 pixels.
30-50 steps, CFG 4-6.
A variety of samplers work. Some of my favorites:
er_sde: neutral style, flat colors, sharp lines. I use this as a reasonable default.
euler_a: Softer, thinner lines. Can sometimes tend towards a 2.5D look. CFG can be pushed a bit higher than other samplers without burning the image.
dpmpp_2m_sde_gpu: similar in style to er_sde but can produce more variety and be more "creative". Depending on the prompt it can get too wild sometimes.
If going for a more realistic / painterly look, the beta57 scheduler (ComfyUI RES4LYF custom node pack) can help make better textures, since it puts more emphasis on low-noise timesteps.
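The recommended ranges above can be written down as a small sanity check. This is a sketch, not part of any official tooling: it assumes "between 512^2 and 1536^2 pixels" refers to total pixel count (so non-square sizes qualify), and the function name `check_settings` is made up for illustration. The warnings are advisory; for example, euler_a may tolerate a CFG above 6.

```python
MIN_PIXELS = 512 ** 2
MAX_PIXELS = 1536 ** 2


def check_settings(width, height, steps, cfg):
    """Return a list of warnings for settings outside the recommended ranges."""
    warnings = []
    pixels = width * height
    # Assumption: the resolution range is about total pixel area, not side length.
    if not MIN_PIXELS <= pixels <= MAX_PIXELS:
        warnings.append(f"{width}x{height} is outside 512^2..1536^2 pixels")
    if not 30 <= steps <= 50:
        warnings.append(f"{steps} steps is outside 30-50")
    if not 4 <= cfg <= 6:
        warnings.append(f"CFG {cfg} is outside 4-6")
    return warnings


print(check_settings(1024, 1024, 30, 4.5))  # inside every range
print(check_settings(2048, 2048, 20, 4.5))  # too large, too few steps
```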
Prompting
The model is trained on Danbooru-style tags, natural language captions, and combinations of tags and captions.
Use lowercase for tags, and spaces instead of underscores. Score tags are the only tags that use underscores.
Recommended positive prefix: "masterpiece, best quality, score_7, safe, "
Recommended negative: "worst quality, low quality, score_1, score_2, score_3, artist name"
When using a tag that is different between Danbooru and Gelbooru, prefer the Gelbooru version.
Prompt weighting works, but needs a weight higher than typically used for SDXL. Example: "(chibi:2)"
Tag order
[quality/meta/year/safety tags] [1girl/1boy/1other etc] [character] [series] [artist] [general tags]
Within each tag section, the tags can be in arbitrary order.
Quality tags
Human score based: masterpiece, best quality, good quality, normal quality, low quality, worst quality
PonyV7 aesthetic model based: score_9, score_8, ..., score_1
You can use either the human score quality tags, the aesthetic model tags, both together, or neither. All combinations work.
Time period tags
Specific year: year 2025, year 2024, ...
Period: newest, recent, mid, early, old
Meta tags
highres, absurdres, anime screenshot, jpeg artifacts, official art, etc
Safety tags
safe, sensitive, nsfw, explicit
Artist tags
Prefix artist with @. E.g. "@big chungus". You must put @ in front of the artist. The effect will be very weak if you don't.
Full tag example
year 2025, newest, normal quality, score_5, highres, safe, 1girl, oomuro sakurako, yuru yuri, @nnn yryr, smile, brown hair, hat, solo, fur-trimmed gloves, open mouth, long hair, gift box, fang, skirt, red gloves, blunt bangs, gloves, one eye closed, shirt, brown eyes, santa costume, red hat, skin fang, twitter username, white background, holding bag, fur trim, simple background, brown skirt, bag, gift bag, looking at viewer, santa hat, ;d, red shirt, box, gift, fur-trimmed headwear, holding, red capelet, holding box, capelet
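The tag rules above (lowercase, spaces instead of underscores except in score tags, the @ artist prefix, and the section order) can be sketched as a small prompt builder. The function names are hypothetical, and the sketch does not special-case emoticon-style tags that legitimately contain underscores.

```python
import re

# Score tags (score_1 .. score_9) are the only tags that keep their underscore.
SCORE_TAG = re.compile(r"^score_\d$")


def normalize_tag(tag):
    """Lowercase a tag and swap underscores for spaces, except in score tags."""
    tag = tag.strip().lower()
    if SCORE_TAG.match(tag):
        return tag
    return tag.replace("_", " ")


def build_tag_prompt(prefix, subject, character="", series="", artist="", general=()):
    """Join tags in the order: prefix, subject, character, series, artist, general."""
    parts = [normalize_tag(t) for t in prefix]
    parts.append(normalize_tag(subject))
    if character:
        parts.append(normalize_tag(character))
    if series:
        parts.append(normalize_tag(series))
    if artist:
        # Artist tags need a leading "@"; add it only if it is missing.
        parts.append("@" + normalize_tag(artist).lstrip("@"))
    parts.extend(normalize_tag(t) for t in general)
    return ", ".join(parts)


prompt = build_tag_prompt(
    prefix=["masterpiece", "best quality", "score_7", "safe"],
    subject="1girl",
    character="oomuro_sakurako",
    series="yuru_yuri",
    artist="nnn_yryr",
    general=["smile", "santa_hat", "gift_box"],
)
print(prompt)
```

Tags within the general section are emitted in the order given, which is fine since order inside a section is arbitrary.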
Tag dropout
The model was trained with random tag dropout. You don't need to include every single relevant tag for the image.
Dataset tags
To improve style and content diversity, the model was additionally trained on two non-anime datasets: LAION-POP (specifically the ye-pop version) and DeviantArt. Both were filtered to exclude photos. Because these datasets are qualitatively different from anime datasets, captions from them have been labeled with a "dataset tag". This occurs at the very beginning of a prompt followed by a newline. Optionally, the second line can contain either the image alt-text (ye-pop) or the title of the work (DeviantArt). Examples:
ye-pop
For Sale: Others by Arun Prem
Abstract, oil painting of three faceless, blue-skinned figures. Left: white, draped figure; center: yellow-shirted, dark-haired figure; right: red-veiled, dark-haired figure carrying another. Bold, textured colors, minimalist style.
deviantart
Flame
Digital painting of a fiery dragon with glowing yellow eyes, black horns, and a long, sinuous tail, perched on a glowing, molten rock formation. The background is a gradient of dark purple to orange.
Natural language prompting tips
Follow standard English capitalization rules for character and series names.
If using pure natural language, more descriptive is better. Aim for at least 2 sentences. Extremely short prompts can give unexpected results.
You can mix tags and natural language in arbitrary order.
You can put quality / artist tags at the beginning of a natural language prompt.
"masterpiece, best quality, @big chungus. An anime girl with medium-length blonde hair is..."
Name a character, then describe their basic appearance.
"Digital artwork of Fern from Sousou no Frieren, with long purple hair and purple eyes, wearing a black coat over a white dress with puffy sleeves..."
This is extra important when prompting for multiple characters. If you just list off character names with no description of appearance, the model can get confused.
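The dataset-tag layout described earlier (dataset tag on the first line, optional title on the second, then the caption) and the tip above about leading a natural-language prompt with quality and artist tags can be sketched as two helpers. Both function names are hypothetical, and the caption in the usage lines is an invented placeholder.

```python
def build_dataset_prompt(dataset, caption, title=""):
    """Put the dataset tag on line 1, an optional title on line 2, then the caption."""
    if dataset not in ("ye-pop", "deviantart"):
        raise ValueError(f"unknown dataset tag: {dataset!r}")
    lines = [dataset]
    if title:
        lines.append(title)
    lines.append(caption)
    return "\n".join(lines)


def build_natural_prompt(caption, quality_tags=(), artist=""):
    """Prepend quality and artist tags to a natural-language caption."""
    lead = list(quality_tags)
    if artist:
        # Artist tags need a leading "@"; add it only if it is missing.
        lead.append("@" + artist.lstrip("@"))
    if not lead:
        return caption
    return ", ".join(lead) + ". " + caption


# Invented placeholder caption, two sentences long as the tips suggest.
caption = "An anime girl with short silver hair sits by a window. Rain streaks the glass behind her."
print(build_natural_prompt(caption, ["masterpiece", "best quality"], "big chungus"))
print(build_dataset_prompt("deviantart", "Digital painting of a fiery dragon.", title="Flame"))
```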
Limitations
The model doesn't do realism well. This is intended. It is an anime / illustration / art focused model.
The model may generate undesired content, especially if the prompt is short or lacking details.
Avoid this by using the appropriate safety tags in the positive and negative prompts, and by writing sufficiently detailed prompts.
The model isn't great at text rendering. It can generally do single words and sometimes short phrases, but lengthy text rendering won't work well.
The base version is a true base model. It hasn't been aesthetic tuned on a curated dataset. The default style is very plain and neutral, which is especially apparent if you don't use artist or quality tags.
Finetuning tips
Don't train the LLM adapter. My own training script, diffusion-pipe, lets you set llm_adapter_lr=0 to completely disable training it, and the example config has this as a default.
Other trainers like sd-scripts have similar options that should be used.
The LLM adapter processes the text embeddings before they get to the diffusion model, and therefore has an outsized influence on the generated images. The adapter itself contains a surprising amount of knowledge and is easy to degrade by training it.
Use a low learning rate. For a rank 32 LoRA, start with 2e-5 and adjust up or down from there.
As a base model, there is no aggressive aesthetic tuning or RLHF you need to overcome when finetuning.
The model has an extremely large and diverse amount of visual concepts baked in already. A light touch is all you need.
Example of a style LoRA, with dataset and configs shared.
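The first two tips amount to a learning-rate policy: zero for the LLM adapter, a small value for everything else. The plain-Python sketch below only illustrates that policy. It is not diffusion-pipe's code or config format, and the `llm_adapter` name prefix and the example parameter names are assumptions made up for illustration.

```python
LORA_LR = 2e-5  # suggested starting point for a rank 32 LoRA


def learning_rate_for(param_name, lora_lr=LORA_LR):
    """Return 0.0 for LLM adapter parameters and lora_lr for everything else."""
    # "llm_adapter" as a parameter-name prefix is an assumption for illustration.
    if param_name.startswith("llm_adapter"):
        return 0.0
    return lora_lr


def group_by_lr(param_names, lora_lr=LORA_LR):
    """Group parameter names by learning rate so each group can be configured once."""
    groups = {}
    for name in param_names:
        groups.setdefault(learning_rate_for(name, lora_lr), []).append(name)
    return groups


# Hypothetical parameter names, not taken from the real model.
names = ["llm_adapter.proj.weight", "blocks.0.attn.lora_A", "blocks.0.attn.lora_B"]
print(group_by_lr(names))
```

In a real trainer, prefer the built-in option (such as `llm_adapter_lr=0` in diffusion-pipe) over hand-rolled grouping.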
License
This model is licensed under the CircleStone Labs Non-Commercial License. The model and derivatives are only usable for non-commercial purposes. Additionally, this model constitutes a "Derivative Model" of Cosmos-Predict2-2B-Text2Image, and therefore is subject to the NVIDIA Open Model License Agreement insofar as it applies to Derivative Models.
If you would like a commercial license, please email [email protected]
Built on NVIDIA Cosmos.
Comments (200)
Hail to the king, baby
PSA: If you have "TAESD" set in ComfyUI preview and you get an error it might be due to having "lighttaew2_1.safetensors" file in your VAE_approx folder.
Seems like a really good upgrade over base, but maybe a side grade when compared to community models. I'm sure the community will finetune this model and make it more of a pure upgrade. So far the following seems improved:
- Image diversity (greatly improved)
- Art style knowledge
- Prompt adherence (in some cases)
Worse:
- Prompt adherence (in some cases again)
- Stability (in some cases)
I get the vibe that training was stopped and the checkpoint uploaded while the model was in the middle of trying to generalize in new ways, if that makes sense.
Obvious improvement in all ways I've tested, appreciate the work you're doing on this.
Who up circling they stone rn?
How many epochs will the final version be, and how many epochs do the current two preview versions correspond to?
Anima Style Explorer:
https://thetacursed.github.io/Anima-Style-Explorer/
(20k danbooru-tagged artist previews)
More like content for a negative prompt... Not many good artists among all 20k. No wonder an average result from this model doesn't look good without extra quality tags.
It would be interesting to rate them and train on the best (the worst for "negative" training)
@Shio_N I see plenty of good artists amongst those 20k, especially if you sort by unique.
Yes there's quite a few bad ones too, but if you can't find a good amount of good styles in there, either you have a very bad taste in styles or you're delusional, take your pick.
I'm really glad this list exists and wish it did for more models.
@upscaleanon537 I mean I have seen and tested a lot, up to #3800 by uniqueness. The problem is that after 3000 they have almost no effect on complex prompts. 85% of these artists are bad by my criteria. Also, the worst of them look like good material for a negative prompt. Honestly, using the negative prompt only for artists looks like a better idea, as you would get a big variety in output instead of a specific style, so seeking out artists for the negative prompt is not a bad idea.
Damn, this is a great site. It would be perfect to integrate it into Forge Neo.
Thank you very much for your tremendous efforts!
It seems that LoRA models trained in Preview 1 can also be used in Preview 2. That's good.
They can be, but they are less precise, as the base model has changed a lot.
pony/illustrious capabilities, with qwen prompt adherence, simply amazing!
@kugelkoter867 Compared to others, you are still more disgusting
I like it, but I feel like it's worse at hands than illustrious and some pony fine tunes, which is crazy considering they're not only old in the AI space, they're old in general
@Autokrator_Mechanicus it's better in hands than any illustrious I know. Try euler (default) + sgm uniform.
@Shio_N I use sa solver pece and beta57 (occasionally beta_1_1), which is far better than both. Not sure why people still use euler, it's so bad compared to the schedulers released the past year. Glad sgm uniform is getting hype nowadays though. (Yes, I've already tried the recommended samplers and schedulers, as anyone who installs a new checkpoint should; they were even worse.)
@Autokrator_Mechanicus I have tested all schedulers. Euler works better. Maybe because it was trained on euler (not sure about this)... Euler ancestral works badly. Samplers like er_sde that people recommend have some issues. Only euler has almost no problems.
@Autokrator_Mechanicus also, the sa_solver_pece you mentioned is 2x slower than euler. Euler just wins in all directions in my tests, as 40 steps of euler is much better than 20 steps of sa_solver_pece.
@Shio_N sa solver pece is a triple step scheduler, that is why, but it's calculated to only use double steps, so it's the most advanced. Not sure how you can say euler is better: when I check your images, the fingers are all terrible, which is a general euler issue, and the images are generally blurry. What sampler did you use? I admit I've only used multiple beta variations and exponential with sa solver pece, so it might be awful with other samplers.
@Autokrator_Mechanicus "all terrible"? There are no problems with fingers in my images. If you want better, just generate realistic. Realistic tags remove all the finger problems most of the time, together with the anime style. I have tested a lot of things... But your "it's so bad compared to the schedulers released the past year" says it all. You don't avoid the old one because it's bad, only because it's old.
and will final version use more than 0.6B encoder, guys?
No, using a different text encoder would destroy a lot of training progress and would make the final training time much longer for no reason. A small text encoder for a small Diffusion model. There are enough big models, this one is supposed to be lightweight.
Just try the model, it's amazing how good the natural language prompt adherence is despite the tiny 0.6B LLM TE.
@compgamer1337267 Hey, use this UI. Has native support for Anima and is very well optimized. - https://github.com/Haoming02/sd-webui-forge-classic/tree/neo
After testing, it appears the PW2 version is inferior for academic art. It is now much more difficult to achieve academic-level aesthetics, especially when approaching realism while maintaining a stylized appearance. Figure drawing (especially in a Western style), composition, and lighting are all much more difficult to control. I now constantly have to fight the model to keep the gens from looking like photo-medium/3D/ai-generated whenever I want them to look professional and academic. Could the influx of generic anime artists with poor figure drawing skills used to train the character knowledge in the dataset be the reason?
But isn't the whole point of this model to produce only anime?
Finetunes will come out soon. They will have realism ones.
@Dazrock As I wrote, it's not about realism. It's about the artists' technical skill and mastery, which the built-in aesthetic tags don't capture whatsoever. Professional-level figure drawing and composition are used everywhere, including anime and cartoons with flat coloring and cel shading. Danbooru is 99.9% low-skill garbage, and the dataset ripped from it may be skewed. Not toward anime but toward poor skill. If you don't have good taste or aren't a professional artist, you might not notice, but I do. Even anime artists need to learn the basics and study academic art, but most don't. Again, it's not about fancy rendering or realism.
In theory, I should be able to raise the technicality of my gens by using low-skilled artists in a negative prompt. However, the Anima model doesn't quite allow this for some reason. Perhaps it's the architecture or overfitting; the PW1 version isn't good at this either, unlike the SDXL models I use every day.
Honestly, I've yet to see a model that wasn't horribly poisoned by low-skilled artists. I wish the people training models were more technically proficient and aware of this problem.
And no, fine-tuning will not magically solve the problem. Finetuned models always involve tradeoffs.
@popim48846589 Do you know how SDXL looked when it first got launched as a base model?
I'd say this is ALREADY a good base model, and it's not even finished yet.
We will get finetunes later which can be more aesthetically focused later.
What the hell are you even talking about?
This is a free model, and it's a preview version. You have different requirements here and there, and it's your niche requirement. With so many requirements, why don't you train a model yourself
@popim48846589 Can you give some comparison examples? Not only for this model and previous preview, but overall, i am genuinely interested in what do you mean by that (and maybe what you consider artists suitable for neg prompt).
I of course have no idea about academic art, but maybe i'll learn something from examples xD
For now i just see a model that can do some things locally much better than ill/noob ai. But lacks in details, higher res, and realism, at least for now. But i am happy that i can finally more or less efficiently run model that semi-decently can place a text on image, and follows things like "place this in left-bottom corner"
@popim48846589 i also saw some comparison in one of AnimaPreview finetunes, someone compared Aivazovsky style there in base version and in finetune, and in the latter it completely lost the "painting" part of it. But i guess you are talking about something else here.
@popim48846589 Honestly you should train style loras for Chroma and roll with it. Chroma + style lora will give you what you need because of its extended realistic knowledge and its ability to generalize and apply its photographic perspectives and background quality to your style lora, which effectively turns your style into an academic art exercise. It's just slow and hard to prompt on the downside (the best technique in Chroma is to just caption images with Gemini and use those captions for vibe, atmosphere, theme etc., mixed with some booru tags)
Or, for that matter, wait for Lodestone's Flux Klein finetune with his Chroma dataset. It's small and fast and Chroma-Klein + Style LoRa will go hard. There is also a higher chance of people touching his klein chroma base for a more extended anime finetune than Chroma1-HD which is just too big to do so.
Whats special about this particular version?
I think it comes down to a matter of taste, but I personally prefer preview 2. It feels better at fine details (especially hands) and mixing styles.
if anyone ask v1 or v2 better, i choose v2
I choose Nano Banana Pro.
@kugelkoter867 no one ask for it bro :))
@kugelkoter867 you compared something as big as nano banana to a 2B model you can run on your own PC/laptop. Indeed, nano banana has better coherency, but this model has more possibilities. Not only nsfw, it also has better creativity with the right settings/prompts. Nano banana loves to generate very average looking images.
@kugelkoter867 what is this then? It's difficult to get something like this out of nano banana: https://civitai.com/posts/27218014
@Shio_N I'm not sure if it's just feeding a troll, but I might add my two cents: models like Nano Banana Pro are bound to strict safety filtering beyond just reinforcement learning & sanitized datasets, with no possibility of finetuning or of teaching them any more subjects than they already know. And with Anima being a model that is half the size of SDXL, I feel it's more developer friendly than it has ever been for people to create finetunes and LoRAs that aren't just for generalized aesthetics. And the blend of caption and Booru tag learning means there is no need to write essays for a prompt, making it a very easy to use model.
I made a custom ComfyUI node called Anima-Style-Explorer. https://github.com/fulletLab/comfyui-anima-style-nodes It lets you copy prompts and styles into your prompts directly inside ComfyUI without leaving your workflow, with batch loading while scrolling, preview support, favorites/integration features, and updated media handling. If you try it, feel free to share feedback or feature suggestions.
If some concepts don't get through the tags then adding natlang is recommended, yea. Also mixing styles works nicely with natlang. Like if you want to make aesthetic changes to the way an artist style draws images you can add natlang descriptions that oppose the usual style and get nice blendings that way.
I've found a good way is to describe the image composition using natural language and then add tags for stuff like outfits or lighting. Not sure if all natural language would be better but I CBA coming up with that
Some stuff works and some doesn't when it comes to NL. Tags are most reliable but most rigid. It's hard to say what works best for every scenario especially since it's an underbaked preview. Just gotta experiment.
Yes, it's definitely worth it. The longer the prompt the better, add in lots of tags and NL.
@deitychaser Thanks, and my main concern is whether we should avoid having duplicate elements between the natural language and the TAG, or if it doesn't matter
I made anima preview 2 available for free on mobians.ai for those interested.
9 minutes per generation? Go to hell!
@Chertilo It runs off my personal gaming computer, if I could make it faster I would
@metal079 I am not sure it is viable business strategy, when you have only a single gaming PC
@metal079 I have a cheap Intel Arc B580 GPU, but even so 30 steps generation 1024x1024 is under 20 seconds. So on my GPU 9 minutes is like 27 people are trying to generate something.
@Chertilo that's rude to write this to someone who is offering you something for free.
@Shio_N Its pretty busy usually with 100+ people in the queue,for the record I have a 5090, a 4090 and another 4090 running the website usually
@Monfor_Salentaiel I recently upgraded to a second PC, right now its 1 5090 and 2 4090s running the website. Obviously not enough to meet demand though of course
You can use TorchCompile, which will speed things up.
On my B580, a 32-step run at 1216x832 takes just 15 seconds.
Anatomy on fingers has degraded on v2
Hands/fingers are fine for me, what workflow/prompt are you using?
@SkibidiGeorgeDroyd It might depend on the artist though, but try for example the artist orie h. My friend showed me v1 and v2 comparisons, and v2 made all the fingers bad. It might also depend on the scheduler: on er_sde it's more obvious, on euler less obvious. It's just that we don't see such behavior on v1
@Araraya Just did some gens with that style and not noticing where all the fingers are bad or anything like that. I just used er_sde with sgm uniform scheduler and 30 steps. Negative prompt matters a lot, you could try the below that I use:
worst quality, low quality, score_1, score_2, score_3.
film grain, scan artifacts, jpeg artifacts, dithering, halftone, screentone.
cropped, signature, watermark, logo, text, english text, japanese text, sound effects, speech bubble, patreon username, web address, dated, artist name.
bad hands, missing finger, bad anatomy, fused fingers, extra arms, extra legs, disembodied limb, amputee, mutation.
muscular female, abs, ribs, crazy eyes, @_@, mismatched pupils.
But sometimes its just rng whether fingers are good or not, luckily inpainting and img2img/hr fix upscaling work w/o artifacts on this version as long as you dont go too high with the resolution (although multidiffusion/tiled upscaling is the best way imo).
@SkibidiGeorgeDroyd we are back to the SD 1.5 days of schizo negative prompts. Add elongated torso and elongated throat :D
@deitychaser XD
true, only annoying thing but i like the rest of the model, has super good artist coherency and the base res has better clarity than other 1024 models at the same time, not sure how they managed that but very cool
@Autokrator_Mechanicus So true. I dont even need nhentai or the sorts with this anymore
@Araraya yet anima is also pretty good with that too
@deitychaser except in this model they actually work as model can understand natural language and content better.
I have tested it on both models. v1 really does have more positive outcomes, BUT v1 tries to generate much simpler scenes, while v2 tries to do creative angles, which is more difficult. I think it may be fixed after further training, as it looks like the model is trying to learn rarer concepts.
@Shio_N I agree on simplified outputs on v1, v2 give out more artistic details
@SkibidiGeorgeDroyd cfg?
@low_channel_1503 4 cfg
@SkibidiGeorgeDroyd thanks. do you use any of the meta tags like highres, absurdres, or do they not really help? i find sometimes the images are better and sometimes worse
@low_channel_1503 Tags like highres/absurd res I don't really use, but I'm sure they have some kind of effect. I keep "masterpiece, best quality, newest" sometimes and year tags which do work well for artists. But really if your prompt has artist tag(s), style descriptor tags/NL, etc.. you dont really need any quality tags. There's no aesthetic finetuning so the model will be super creative, that's why the images are "better or worse" since the model isn't rigid. If you don't like that, making prompts detailed and not short helps to make it more "consistent".
true hands are really horrible most of the time. especially when its a bit more of a complex shot
@SkibidiGeorgeDroyd You got any example images to post?
@AzulAuthority Recent posts on my profile.
Stop complaining, dumbass people. We don't get things like this every day, only every year. Be grateful and support the guys who made it.
There's no point in just 'being grateful'. Feedback is VERY important, and listing problems with the model is how creators know what could be improved.
Complaining is literally essential for making those models better.
Not saying to not support those who made the models, bless them.
Ver 2 definitely seems smarter, but also harder to control style-wise.
Wow! Big improvement! Some prompts are still unstable, but I'm not sure if it's a prompt or model issue. Also, any changes for increased generation speed?
THANK YOU!!!!!!!!!!!!
It looks like there are damaged images in dataset. Some outputs can generate something that looks like corrupted png (missing part of an image). Is there any tag that describes it? (for negative prompt)
Any examples? Only other model I've seen with this kind of thing is NovelAI in rare circumstances, like it's trained on stitched panning shots and missing parts of the image, or some anime sources seem to be in wrong aspect ratio. Is it like that or garbled messes?
@xenexia it may be cropped image with black borders. It may be just black rectangles on an image, it may be only one side of image have black border. Sometimes this border is not straight - it looks like an average corrupted png, but exaggerated. This is the most obvious example - just random black rectangles: https://ibb.co/3yZXpHxd
@Shio_N oh yeah I see what you mean. Also Im training a lora right now and on the 20th image or so it generated, it gave me an incomplete image, only showing like 200px of the top, and majority of the image being white. So yeah something is definitely up.
Really happy with Preview2.
From my testing and training Loras, Preview2 is a direct upgrade with better prompt adherence and overall improved details.
I like that they are giving us intermediate checkpoints to work with. It's a good chance for developers to understand the model and the training process for loras and full finetunes, and to build toolsets & datasets now. When the full model releases, we'll have everything ready to go, which tightens the timeframe between model release and community derivatives being available. The risk is that some people will complain about the quality of the model and it gains a certain reputation before the final release, but it's well worth it. Also, they are basically distributing QA of the model by gathering community feedback
@xenexia I agree. I think the team was a bit apprehensive about releasing the training code, but if PonyV7 has taught us anything, it is that early training support is important for a 2D/anime model. Proving that the model can be easily trained is important for wider adoption.
@tosermepls Yeah that's true. Also it's never too late to update the dataset during training if it's going in the wrong direction, without needing to go back to step 1. Something like Pony V7 probably cost thousands or tens of thousands, and re-running training from the beginning after all that expense would be devastating. This model has a lot to like from the get-go though, especially the blend of caption and Booru tag prompts. It's what has kept me using NovelAI quite a bit, and this is a very attractive alternative.
Been following this one on HF for a while. I could see this being big, and V2 is a pretty noticeable improvement in flexibility.
Dumb question: is this a literal checkpoint in your training of the model, a fine-tune of the original, or another one trained on the same dataset from scratch? Also, thanks for the full output rights with the license.
This is a continuation of the anima preview with more training. You can read about the details on his huggingface page of the model.
I have been experiencing it for a few days already, states "No results found". Hopes on CivitAI to finally fix it
They said after a months 😭 💀💀 idk why
How is upscaling done with this model? I tried the regular hires fix and the final image comes out cooked
Keep the upscaling factor reasonable (~1.5x) and keep denoising moderate (0.3). Don't use latent upscale, but any pixel space upscaler of your choice. That works for me.
If you upscale too much it falls apart, but that is expected; most of the model training was done at 512x so far.
works fine for me at 1.5x and 0.6
It can do any size, just use tiled upscaling!
@yorgash Could you share how you do tiled upscaling, please.
@kitkalplus
https://civitai.com/images/124695779
@kitkalplus I made a simple workflow.
https://civitai.com/models/2478484/anima-tiled-segs-upscale?modelVersionId=2786588
@yorgash That's awesome. Left a comment
@AzulAuthority Thanks!
@yorgash Thank you!
Can someone pls do these style in this model https://x.com/PEAPEAFur
https://x.com/pegu2726
really great and promising model for anime. It adheres really well with natural language and with standard tags. Its also very flexible with characters and styles. It knows some of the artist styles to a better degree than illustrious and pony base models (without loras)
Will this be available to use on Civitai's generation UI? Asking as a mobile user.
No sorry but you can here tensor.art/models/964715998236132754
tensor.art/models/974860244985332606/Anima-Official-preview-2
Civitai would have to buy a license for that and being that the model is still in development that seems unlikely. Oh, and also update their backend, which sometimes it seems they don't know how to lol.
Can't we do the training online?
Other than style mixing becoming better, I hope later versions also improve character knowledge to be even better than NoobAI models, there are so many popular characters weirdly absent or barely recognized by Anima while many newer lesser popular characters work perfectly
Try describing the character with additional tags. Some natural language especially helps a lot, for example by captioning an image of the character via Gemini and then copying the description of its appearance.
@deitychaser It can help mildly but there's definitely something weird when Purah from TotK or Kawakami from Persona 5 are still noticeably off-model after filling their tags up but Hulkenberg from Metaphor is game-accurate with the name tag alone
@FunnyGrifter tried 1girl, kawakami sadayo, persona 5 & 1girl, purah , the legend of zelda and it looks pretty good to me.
ask him here https://huggingface.co/circlestone-labs/Anima/discussions
If you guys get any Question say it here https://huggingface.co/circlestone-labs/Anima/discussions
The level of prompt precision and the sheer range of tag recognition in this model is honestly staggering, as a base model. It picks up on nuances that most current bases struggle with. However, the anatomy—specifically the hands—still feels very "work in progress" and can be quite rough at times. Since this is an early development build, it's totally understandable. Overall, this has massive potential to become the next-gen universal base model. Really excited to see where this goes!
Apparently the full release version will be trained on higher resolution - this should hopefully resolve the finer anatomy issues as the current training resolution is quite small
I still have to git gud with the prompting but man, finally a fresh non-realistic good model, thank you.
Can't wait to see the final result :)
Hope we will get the full version soon.
Dear Illu, I am sorry I have to confess, I cheat u with Anima now. And I think it would be better for us both to break up!
Bit of an odd question but has anyone been able to use Anima to change the shape or style of a blue archive student's halo?
would love to test it, but there not being a webui version of it kinda ruins it for me and others
What do you mean? It works perfectly fine in Forge Neo, SwarmUI and Comfy.
@Big_Soda could be the version of forge im using (reforge), but even with all the files in the right spot it just doesnt work
@Rue_ It doesn't work on Reforge. You need the latest version of Forge Neo. https://github.com/Haoming02/sd-webui-forge-classic/tree/neo?tab=readme-ov-file
@Fish788 gotcha, thanks!
Could you merge this with Text Encoder? As someone who's very used to En fooocus, I can't use the model :D
I guess the text encoder is not the problem. The model is just not supported by Fooocus or its derivatives (I am not sure which version you are referring to). If you don't want to use ComfyUI, you can give Forge Neo a try:
https://civitai.com/articles/27039/setting-up-forge-neo-for-using-anima-with-stability-matrix
I need to ask: will we at some point get a training method like the one Civitai provides, so we can make our own LoRAs?
Here is a trainer with a user interface that uses sd-scripts for local Anima training:
https://github.com/citronlegacy/citron-anima-lora-trainer-ui
Honestly, good job. It's already way better than Noob/Illustrious, and it's not a big model that's too hard to run.
It is actually 5 GB if you combine it with the VAE and clip, and since the model is not fully released yet, I think it might go beyond 6 GB.
@monicalucci Nope, it won't. Model size does not depend on training steps/duration. It's determined by model architecture and network dimension, which have to be set at the beginning of training.
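To make the point in the reply above concrete, here is a rough back-of-the-envelope sketch. The 2 billion parameter count comes from the model description; storing weights in bf16 (2 bytes per parameter) is an assumption, and the text encoder and VAE are separate files that are not counted here.

```python
# Rough checkpoint size estimate: parameter count times bytes per parameter.
# The dtype the released file actually uses is an assumption here.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "fp8": 1}


def checkpoint_size_gib(num_params: int, dtype: str = "bf16") -> float:
    """Approximate on-disk size in GiB, ignoring file metadata overhead."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


size = checkpoint_size_gib(2_000_000_000, "bf16")
print(f"{size:.2f} GiB")  # prints "3.73 GiB"
```

Nothing in this estimate depends on how long the model was trained, so further training on the same architecture should not change the file size.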
Can we get this on the generator? This model has a lot of potential. Would love to see what people generate on the platform using it to see how well it holds up.
It's getting better and better <3
Absolutely amazing model, I tried and I can't believe how good the results are. I really hope this keeps getting more attention, it is undoubtedly a huge improvement since Noob and Illustrious. Keep going!
This model seems to have fairly strict requirements for the input format. When I inputted a very long text, the generated image's visuals completely broke down—sometimes it could only be described as barely humanoid, with extremely poor results. Reducing the text to half its original length by various means made it much better. Is there any documentation or summary on this aspect?
Does this work with Stable Diffusion?
No, Anima has nothing to do with Stable Diffusion. And if you meant the Stable Diffusion webui by A1111, then the answer is still no; that UI doesn't support Anima.
You can use Anima in the Forge fork called Forge Neo if you want an A1111 UI for it
How can I stabilise images? A single prompt generates completely different results. The style is recognisable, but it's always different.
This has the potential to become the next big thing imo (on Pony/Ill + Nai level).
There are problems (hands, instability, some weird connections like "This artist style doesnt work if you don't add linear hatching to prompt", no upscale), but some of them can be fixed with Ill/Nai upscale part + depth controlnet to preserve composition. Also hands are not that bad? Most of the time it works ok.
But the fact that it is a fast model that can be run on 8 GB, can even do text and control positioning, and knows LOTS of characters and styles is amazing. All the previous ones I tried were either slow, or too cleaned up of characters and styles, or whatever. This one balances all that.
I understand though that I use only a tiny fraction of what it is capable of xD
I am really waiting for a final version, but even now it is quite amazing.
Oh, also it can do quite semi-realistic images with simple lora addition, so that potential is also there.
Couldn't have said it better myself.
So looking forward to the base model.
How did you get depth controlnet to work? whenever I try it it's either a jumbled mess or just doesn't work at all. (A specific preprocessor and/or model would be greatly appreciated)
@Sfdwackys no, i meant anima -> depth + ill/nai, sorry for confusion
One of the pipelines I tried looks like this:
1) Generate a latent image with Anima
2) Decode it, then encode it with the SDXL VAE instead
3) Now use a standard SDXL ControlNet and latent upscale.
This image has the full pipeline: https://civitai.com/images/125815042
Though originally it is basically this workflow:
https://civitai.com/models/2481616/animatosdxlilponywithfacedetailer
so you can get the workflow from there
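The order of operations in that pipeline can be sketched as plain function composition. This is only an illustration: the four stage callables are hypothetical placeholders for whatever nodes or functions your UI provides, not a real ComfyUI API, and passing the decoded image on as the ControlNet source is an assumption.

```python
from typing import Any, Callable


def anima_to_sdxl(
    prompt: str,
    generate_anima: Callable[[str], Any],    # hypothetical: Anima sampler
    decode_anima: Callable[[Any], Any],      # hypothetical: Anima (Qwen-Image) VAE decode
    encode_sdxl: Callable[[Any], Any],       # hypothetical: SDXL VAE encode
    refine_sdxl: Callable[[Any, Any], Any],  # hypothetical: SDXL ControlNet + latent upscale
) -> Any:
    latent = generate_anima(prompt)   # 1) generate a latent with Anima
    image = decode_anima(latent)      # 2) decode to pixels with Anima's own VAE...
    sdxl_latent = encode_sdxl(image)  #    ...then re-encode with the SDXL VAE
    # 3) refine in SDXL; the decoded image doubles as the ControlNet source (assumption)
    return refine_sdxl(sdxl_latent, image)
```

The re-encode in step 2 is there because the two model families use different VAEs, so an Anima latent cannot be handed to an SDXL sampler directly.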
I am not sure why, but I am really struggling to get detailed faces with Anima. I am using SwarmUI and have made sure I have the VAE and T5 enabled, and I'm also using adetailer. No matter what though, the faces severely lack detail.
Anyone else have this issue or know how to fix it?
Perhaps because the model is mostly trained at 512px, so details may be lacking for small parts.
Good future, but I don't want to train much for now. The author says all LoRAs are likely to be *throwaway*, so I'll just wait for a future stable release!
High potential possible Illustrious replacement, and it's about time we moved away from SDXL which is 4 years old at this point.
how do you go about mixing artist tags with weighting? i tried the usual way in a few different configurations, but it seems that loras are the only way to get a consistent mixed style
Try the syntax [@Artist1,|@Artist2,|@Artist3,]
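If you mix artists often, a small helper can build that string for you. This is a minimal sketch: the function name is made up for illustration, and it only formats text. In A1111-style front ends such as Forge, a bracketed `|` list alternates between its entries on each sampling step; whether your UI applies that to Anima is an assumption you should verify.

```python
def alternate_artists(artists: list[str]) -> str:
    """Build an alternation prompt like [@A,|@B,|@C,] from artist names."""
    # Strip any leading "@" first so names are never double-prefixed.
    parts = [f"@{name.lstrip('@')}," for name in artists]
    return "[" + "|".join(parts) + "]"


print(alternate_artists(["Artist1", "Artist2", "Artist3"]))
# prints: [@Artist1,|@Artist2,|@Artist3,]
```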
this is 10x better than any illustrious or pony bullshit. this is like gemini tier work. i will start my own propaganda campaign online about this model. needs more recognition.
bitches, stop using that old-ass style from 2023. With this model you get 20k styles, no way
Haven't used illustrious since this came out lol, truly next gen thing
After testing, I've seen that the model doesn't do e621 stuff well, I think because not much of that data was used. The girls with, you know what, don't come out well either, probably also because of the e621 stuff.
That's actually a very controversial topic among the Anima community on Huggingface. There isn't any E621 in the dataset and people on the Anima discussion page either really want E621 in or consider E621 to be objectively poison.
Hopefully all things with the model improve for the final release.
@Big_Soda ty for info didnt know about that
This is incredibly good and the prompt coherence is just out of the world. Very excited for the full version!
Wow! Model actually draws what I have in mind consistently and without side effects.
I wasn't able to achieve this with the other 6-8 local models I tried, and I was about to give up on the idea of chibi anime character design with local models.
Compared to Illustrious, what is the main difference? Can it fix hands with more than 5 fingers?
It can generate much more complex details without needing a specific LoRA, and it works better when combined with natural language, but it's a bit slower than SDXL finetunes. There is almost no need for art-style LoRAs.
I was using the previous Anima model, a GGUF Q8, which works, but this one doesn't with the same VAE and text encoders. I don't know what's wrong. They both seem like the same file.
mat1 and mat2 must have the same dtype, but got Float and BFloat16
No idea what is wrong. Maybe the file is bad? The previous version works fine as GGUF but this one doesn't.
res_2m + beta57 can also give some unique results.
1990s (style) has no effect at all. Why? How do I get it to work?
Try 1990s \(style\)
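The likely reason the escaped form helps is that ComfyUI and A1111-style UIs read bare parentheses as emphasis syntax, so the literal "(style)" part of the tag never reaches the text encoder as written. A small sketch for escaping tags in bulk (the helper name is made up for illustration):

```python
import re


def escape_tag_parens(tag: str) -> str:
    """Backslash-escape literal parentheses, leaving already-escaped ones alone."""
    # (?<!\\) skips any parenthesis that already has a backslash in front of it.
    return re.sub(r"(?<!\\)([()])", r"\\\1", tag)


print(escape_tag_parens("1990s (style)"))  # prints: 1990s \(style\)
```

Only escape parentheses that belong to the tag name itself. Parentheses you add on purpose for weighting, such as (tag:1.1), should stay unescaped.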
You could try something longer and more detailed, like: (Late 80s and early 90s high-budget OVA style, retro, tactical realism, intricate details, matte cel-shading, muted industrial color palette, harsh fluorescent lighting, cinematic noir, high-fidelity hand-drawn tech:1.1)
So is this usable on any on-site generators? I'm pretty sure the answer is no but clarification would be nice.
@koloh65592999 does tensor have unrestricted generation? do i need to worry about it blocking anything i put into the prompt? (besides illegal stuff obviously)
@VYA No NSFW, but it's not very strict. Also no real people.
@VYA go to tensorhub. its for "experimental" generation
I love Illustrious but this is even more fantastic.
Does anyone know how to prompt a man trying to kiss a girl who doesn't want to be kissed and turns her face away from him? I find it hard to prompt.
You're probably better off making or finding a concept LoRA for hard poses.
tutorial how to use this model online 🙀: just go to tensor art the end
Compared to Illustrious, Anima doesn't seem able to generate a dog's J (its p_ _is). Will the creator update this?
Do you mean an art style? With Anima, you need to put a @ before the artist's name.
@Loraman no, J = cock
It's hard because no e6 training in the model
Try the tags: "animal penis, knotted penis, dog penis". I'm not sure how good Anima is at generating it; it may need a LoRA.
??????
I can't keep a straight face anymore
Really nice, I hope it gets more attention/momentum once the final version is released!
great model but has a serious issue making hands while the character is upside-down
How many characters does it know natively? i am asking because i have 3.5k loras for illustrations, and i dont want to download again for this T.T
It's unclear how many characters there are, but 20,000 artists have already been found. Maybe the strength here is in the artists, not the characters. You take a character from an old model, refinish it with a better model where the quality is higher and you don't need a heavy model.
The latest training data is from September 2025, so this model likely knows all characters up to that date. Too bad Illustrious LoRAs won't work with this model.
When a character has one arm behind them, the orientation of the hand is wrong. Hopefully new technology comes out and solves this.
Wait for the full base checkpoint and an on-site LoRA training guide.
The model seems to have trouble with many characters, especially those from before 2020, but overall it’s still pretty decent in my opinion. I combined some tags, and it surprisingly worked great.
post it here https://huggingface.co/circlestone-labs/Anima/discussions
Ready to get some sleep after a whole night of promp-ANIMA PREVIEW BASE 3???
gimmi that 💚
Just about to leave on vacation and preview3 drops! Will have to quickly grab this before the flight haha
Details
Files
anima_preview2.safetensors
Mirrors
anima-preview2.safetensors
Anima-PreviewV2.safetensors
animaOfficial_preview2.safetensors