Anima is a 2 billion parameter text-to-image model created through a collaboration between CircleStone Labs and Comfy Org. It is focused mainly on anime concepts, characters, and styles, but is also capable of generating a wide variety of other non-photorealistic content. The model is designed for making illustrations and artistic images, and will not work well for realism.
It is trained on several million anime images and about 800k non-anime artistic images. No synthetic data was used for training. The knowledge cut-off date for the anime training data is September 2025.
NEW: Try the Turbo LoRA for better stability and much faster generations.
Versions
Anima-Base
The pretrained, unrefined base model. Maximum flexibility, diversity, and style adherence.
Anima-Turbo
Coming soon.
Installing and running
Get the text encoder and VAE from the Hugging Face page.
The model is natively supported in ComfyUI. The model files go in their respective folders inside your model directory:
anima-base-v1.0.safetensors goes in ComfyUI/models/diffusion_models
qwen_3_06b_base.safetensors goes in ComfyUI/models/text_encoders
qwen_image_vae.safetensors goes in ComfyUI/models/vae (this is the Qwen-Image VAE, you might already have it)
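A quick way to confirm the three files landed in the right folders is a small check script. This is only a sketch: the file names and subfolders are the ones listed above, while the ComfyUI root path passed in (here "ComfyUI") is an assumption about where your install lives.

```python
from pathlib import Path

# File names and subfolders as listed above.
EXPECTED_FILES = {
    "diffusion_models": "anima-base-v1.0.safetensors",
    "text_encoders": "qwen_3_06b_base.safetensors",
    "vae": "qwen_image_vae.safetensors",
}


def missing_model_files(comfy_root):
    """Return the expected model paths that do not exist under comfy_root/models."""
    models_dir = Path(comfy_root) / "models"
    missing = []
    for subfolder, filename in EXPECTED_FILES.items():
        path = models_dir / subfolder / filename
        if not path.is_file():
            missing.append(path)
    return missing


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; change it to your own path.
    for path in missing_model_files("ComfyUI"):
        print(f"missing: {path}")
```

If the script prints nothing, all three files were found.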
Generation settings
Works at resolutions between 512^2 and 1536^2 pixels.
30-50 steps, CFG 4-6.
A variety of samplers work. Some of my favorites:
er_sde: neutral style, flat colors, sharp lines. I use this as a reasonable default.
euler_a: Softer, thinner lines. Can sometimes tend towards a 2.5D look. CFG can be pushed a bit higher than other samplers without burning the image.
dpmpp_2m_sde_gpu: similar in style to er_sde but can produce more variety and be more "creative". Depending on the prompt it can get too wild sometimes.
If going for a more realistic / painterly look, the beta57 scheduler (ComfyUI RES4LYF custom node pack) can help make better textures, since it puts more emphasis on low-noise timesteps.
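The resolution range above can be turned into a small helper that picks a width and height for a given aspect ratio. This is a rough sketch, not part of any official tooling: the function name, the definition of one megapixel as 1024x1024, and the rounding to multiples of 16 are all assumptions, and the default settings are just mid-range picks from the ranges stated above.

```python
import math

MIN_PIXELS = 512 * 512      # lower bound stated above
MAX_PIXELS = 1536 * 1536    # upper bound stated above

# Mid-range picks from "30-50 steps, CFG 4-6"; er_sde is the suggested default sampler.
DEFAULT_SETTINGS = {"steps": 40, "cfg": 5.0, "sampler": "er_sde"}


def pick_resolution(aspect_ratio, megapixels=1.0, multiple=16):
    """Return (width, height) close to the requested area and aspect ratio.

    aspect_ratio is width / height. The requested area is clamped to the
    supported range before rounding. Rounding to a multiple of 16 is an
    assumption, and it can move the final area slightly past the bounds.
    """
    target_pixels = megapixels * 1024 * 1024
    target_pixels = min(max(target_pixels, MIN_PIXELS), MAX_PIXELS)
    height = math.sqrt(target_pixels / aspect_ratio)
    width = height * aspect_ratio

    def snap(value):
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)


print(pick_resolution(1.0))                     # square, about 1 megapixel
print(pick_resolution(2 / 3, megapixels=1.5))   # portrait, about 1.5 megapixels
```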
Prompting
The model is trained on Danbooru-style tags, natural language captions, and combinations of tags and captions.
Use lowercase for tags, and spaces instead of underscores. Score tags are the only tags that use underscores.
Recommended positive prefix: "masterpiece, best quality, score_7, safe, "
Recommended negative: "worst quality, low quality, score_1, score_2, score_3, artist name"
When using a tag that is different between Danbooru and Gelbooru, prefer the Gelbooru version.
Prompt weighting works, but needs a weight higher than typically used for SDXL. Example: "(chibi:2)"
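If you copy tags straight from a booru site, the formatting rules above (lowercase, spaces instead of underscores, score tags untouched) are easy to apply automatically. The helper names below are made up for illustration, and the weighting helper only produces the "(tag:weight)" syntax shown in the example above.

```python
import re

# Score tags such as score_7 are the only tags that keep their underscore.
SCORE_TAG = re.compile(r"^score_\d+$")


def normalize_tag(tag):
    """Lowercase a booru tag and replace underscores with spaces, except score tags."""
    tag = tag.strip().lower()
    if SCORE_TAG.match(tag):
        return tag
    return tag.replace("_", " ")


def weighted(tag, weight):
    """Wrap a normalized tag in (tag:weight) prompt-weighting syntax."""
    return f"({normalize_tag(tag)}:{weight:g})"


print(normalize_tag("Fur-Trimmed_Gloves"))  # fur-trimmed gloves
print(normalize_tag("score_7"))             # score_7
print(weighted("chibi", 2))                 # (chibi:2)
```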
Tag order
[quality/meta/year/safety tags] [1girl/1boy/1other etc] [character] [series] [artist] [general tags]
Within each tag section, the tags can be in arbitrary order.
Quality tags
Human score based: masterpiece, best quality, good quality, normal quality, low quality, worst quality
PonyV7 aesthetic model based: score_9, score_8, ..., score_1
You can use either the human score quality tags, the aesthetic model tags, both together, or neither. All combinations work.
Time period tags
Specific year: year 2025, year 2024, ...
Period: newest, recent, mid, early, old
Meta tags
highres, absurdres, anime screenshot, jpeg artifacts, official art, etc
Safety tags
safe, sensitive, nsfw, explicit
Artist tags
Prefix artist with @. E.g. "@big chungus". You must put @ in front of the artist. The effect will be very weak if you don't.
Full tag example
year 2025, newest, normal quality, score_5, highres, safe, 1girl, oomuro sakurako, yuru yuri, @nnn yryr, smile, brown hair, hat, solo, fur-trimmed gloves, open mouth, long hair, gift box, fang, skirt, red gloves, blunt bangs, gloves, one eye closed, shirt, brown eyes, santa costume, red hat, skin fang, twitter username, white background, holding bag, fur trim, simple background, brown skirt, bag, gift bag, looking at viewer, santa hat, ;d, red shirt, box, gift, fur-trimmed headwear, holding, red capelet, holding box, capelet
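The tag order described above can be assembled programmatically, which helps when building prompts from wildcards. This is a minimal sketch with a hypothetical function name; it only joins the groups in the documented order and adds the @ prefix to artist names that lack it.

```python
def build_tag_prompt(quality=(), subject=(), character=(), series=(), artist=(), general=()):
    """Join tag groups in the documented order.

    Order: quality/meta/year/safety, subject count (1girl, 1boy, ...),
    character, series, artist, general. Tags inside a group keep the
    order they were given in, since order within a group is arbitrary.
    """
    artist_tags = [a if a.startswith("@") else f"@{a}" for a in artist]
    groups = [quality, subject, character, series, artist_tags, general]
    return ", ".join(tag for group in groups for tag in group)


prompt = build_tag_prompt(
    quality=["year 2025", "newest", "normal quality", "score_5", "highres", "safe"],
    subject=["1girl"],
    character=["oomuro sakurako"],
    series=["yuru yuri"],
    artist=["nnn yryr"],
    general=["smile", "brown hair", "santa hat"],
)
print(prompt)
```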
Tag dropout
The model was trained with random tag dropout. You don't need to include every single relevant tag for the image.
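For readers unfamiliar with the idea, tag dropout as a caption augmentation can be sketched in a few lines. The exact scheme used to train Anima is not described here, so the drop probability and the notion of "protected" tags below are purely illustrative assumptions.

```python
import random


def drop_tags(tags, drop_prob=0.3, protected=(), rng=None):
    """Return a copy of tags with each unprotected tag dropped with probability drop_prob.

    drop_prob=0.3 and the protected set are hypothetical values for
    illustration; they are not the settings used to train the model.
    """
    rng = rng or random.Random()
    return [t for t in tags if t in protected or rng.random() >= drop_prob]


tags = ["1girl", "smile", "brown hair", "santa hat", "gift box"]
print(drop_tags(tags, protected={"1girl"}))
```

Because the model saw captions with tags missing at random during training, a prompt that lists only some of the relevant tags is still in distribution.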
Dataset tags
To improve style and content diversity, the model was additionally trained on two non-anime datasets: LAION-POP (specifically the ye-pop version) and DeviantArt. Both were filtered to exclude photos. Because these datasets are qualitatively different from anime datasets, captions from them have been labeled with a "dataset tag". This occurs at the very beginning of a prompt followed by a newline. Optionally, the second line can contain either the image alt-text (ye-pop) or the title of the work (DeviantArt). Examples:
ye-pop
For Sale: Others by Arun Prem
Abstract, oil painting of three faceless, blue-skinned figures. Left: white, draped figure; center: yellow-shirted, dark-haired figure; right: red-veiled, dark-haired figure carrying another. Bold, textured colors, minimalist style.
deviantart
Flame
Digital painting of a fiery dragon with glowing yellow eyes, black horns, and a long, sinuous tail, perched on a glowing, molten rock formation. The background is a gradient of dark purple to orange.
Natural language prompting tips
Follow standard English capitalization rules for character and series names.
If using pure natural language, more descriptive is better. Aim for at least 2 sentences. Extremely short prompts can give unexpected results.
You can mix tags and natural language in arbitrary order.
You can put quality / artist tags at the beginning of a natural language prompt.
"masterpiece, best quality, @big chungus. An anime girl with medium-length blonde hair is..."
Name a character, then describe their basic appearance.
"Digital artwork of Fern from Sousou no Frieren, with long purple hair and purple eyes, wearing a black coat over a white dress with puffy sleeves..."
This is extra important when prompting for multiple characters. If you just list off character names with no description of appearance, the model can get confused.
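The two prompt layouts described above (a quality/artist prefix followed by a description, and a dataset tag on the first line followed by an optional title line) can be sketched as small string helpers. The function names are hypothetical, and the exact separator between the prefix and the description simply mirrors the example prompt above.

```python
def natural_language_prompt(description, quality_tags=(), artist=None):
    """Prefix a natural language description with optional quality tags and an artist tag."""
    prefix = list(quality_tags)
    if artist:
        prefix.append(artist if artist.startswith("@") else f"@{artist}")
    if not prefix:
        return description
    return ", ".join(prefix) + ". " + description


def dataset_prompt(dataset_tag, caption, title=None):
    """Put the dataset tag on the first line, an optional title or alt-text on the second."""
    lines = [dataset_tag]
    if title:
        lines.append(title)
    lines.append(caption)
    return "\n".join(lines)


print(natural_language_prompt(
    "An anime girl with medium-length blonde hair is reading a book in a library.",
    quality_tags=["masterpiece", "best quality"],
    artist="big chungus",
))
print(dataset_prompt("deviantart", "Digital painting of a fiery dragon.", title="Flame"))
```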
Limitations
The model doesn't do realism well. This is intended. It is an anime / illustration / art focused model.
The model may generate undesired content, especially if the prompt is short or lacking details.
Avoid this by using the appropriate safety tags in the positive and negative prompts, and by writing sufficiently detailed prompts.
The model isn't great at text rendering. It can generally do single words and sometimes short phrases, but lengthy text rendering won't work well.
The base version is a true base model. It hasn't been aesthetic tuned on a curated dataset. The default style is very plain and neutral, which is especially apparent if you don't use artist or quality tags.
Finetuning tips
Don't train the LLM adapter. My own training script, diffusion-pipe, lets you set llm_adapter_lr=0 to completely disable training it, and the example config has this as a default.
Other trainers like sd-scripts have similar options that should be used.
The LLM adapter processes the text embeddings before they get to the diffusion model, and therefore has an outsized influence on the generated images. The adapter itself contains a surprising amount of knowledge and is easy to degrade by training it.
Use a low learning rate. For a rank 32 LoRA, start with 2e-5 and adjust up or down from there.
As a base model, there is no aggressive aesthetic tuning or RLHF you need to overcome when finetuning.
The model has an extremely large and diverse amount of visual concepts baked in already. A light touch is all you need.
Example of a style LoRA, with dataset and configs shared.
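To illustrate what "don't train the LLM adapter" amounts to, the sketch below splits parameter names into a trainable group and a frozen group with learning rate 0. This is not diffusion-pipe's or sd-scripts' code: the function, the "llm_adapter" name substring, and the example parameter names are all hypothetical, and in a real trainer you would use its own option (such as llm_adapter_lr=0 in diffusion-pipe).

```python
def build_param_groups(param_names, base_lr=2e-5, adapter_key="llm_adapter"):
    """Split parameter names into a trainable group and a frozen LLM-adapter group.

    adapter_key is an assumed substring for spotting adapter parameters;
    check the real parameter names in your checkpoint before relying on it.
    base_lr=2e-5 is the suggested starting point for a rank 32 LoRA.
    """
    trainable = [n for n in param_names if adapter_key not in n]
    frozen = [n for n in param_names if adapter_key in n]
    return [
        {"names": trainable, "lr": base_lr},
        {"names": frozen, "lr": 0.0},  # learning rate 0 leaves the adapter untouched
    ]


# Hypothetical parameter names, for illustration only.
example_names = ["llm_adapter.proj.weight", "blocks.0.attn.weight", "blocks.0.mlp.weight"]
for group in build_param_groups(example_names):
    print(group)
```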
License
This model is licensed under the CircleStone Labs Non-Commercial License. The model and derivatives are only usable for non-commercial purposes. Additionally, this model constitutes a "Derivative Model" of Cosmos-Predict2-2B-Text2Image, and therefore is subject to the NVIDIA Open Model License Agreement insofar as it applies to Derivative Models.
If you would like a commercial license, please email [email protected]
Built on NVIDIA Cosmos.
Description
Highres training is in progress. Trained for much longer at 1024 resolution than preview2.
Expanded dataset to help learn less common artists (roughly 50-100 post count).
Comments (357)
Compared to Illustrious, it does a much better job of following prompts.
However, I think its main drawback is that it doesn’t capture the artist’s style very well.
For example, even if I write “@cyclone \(reizei\)”, it doesn’t capture the style nearly as well as Illustrious does.
I hope this gets improved in the final version.
Not saying this version will be great for this particular artist, but one thing to note: the backslashes before parentheses should no longer be necessary. "@cyclone (reizei)" should have a stronger effect. I'm definitely seeing strong effects from properly formatted artist tags, but haven't figured out how to make anything really high quality yet.
@levantescor Thank you for the helpful information!
However, I tried removing the backslash, but unfortunately, I didn’t notice much of a difference.
I’m hoping this will be improved in the final version of the model.
I see great potential in fine-tuning prompts using natural language, so I sincerely hope this model continues to evolve and become even better.
This happens to me with most models: artist styles either capture very well or don't work at all, regardless of how many images are available to put in the database. Sad, because the more niche styles are really cool and just never work even with high image counts.
Re-trained a Preview2 character Lora on Preview3 so my impressions are all based on how the character Loras perform side-by-side.
PROS:
- Preview3 is a slight quality bump with the character Lora having slightly better character likeness and slightly better quality (eyes/hands) compared to Preview2.
- Generating at higher res (above 1024x) appears to be a bit more stable.
CONS:
- Preview3 seems way more sensitive to CFG. CFG 3.0 gives me worse results than Preview2, but CFG 4.0+ gives better results.
- From my testing, text rendering/adherence is worse than Preview2.
- Sometimes I preferred the Preview2 output more.
Overall I do think Preview3 is another net improvement to Loras but not as drastic as AP1 => AP2. Will need more testing and a good finetune though.
I tested too. I'll say that I'll wait for a base, because the old prev 2 models barely work, with mixed results. I think this is a huge improvement, but making LoRAs with big databases makes it annoying to have to retrain.
I'd say really big changes will only come through community finetunes, simply because there are only so many possibilities to improve an already well-working model with just the Danbooru dataset, while finetuners have a more selected, targeted set, often with other sources, and can tackle issues directly.
Waiting for lora support for anima!
Same. You mean the OSG right.
there already one, sd scripts local
You could train loras since AP1
@tosermepls I mean like onsite Lora support, to use in the onsite Gen
@AndroidXL It's available now.
The comments are nice and all but where are all the workflows? Not even 1 img2img workflow?
buddy pal of mine they're embedded in the example images
and upscaling is the same as sdxl, though it has some issues right now.
@sneedingonmyligma420 All of these examples are text2img not a single img2img. I have no idea what you are talking about.
@Zindabad i have no idea why you even need an img2img to be honest lol but yes they're txt2img.
For basic img2img just change the latent image input for the KSampler into Load Image + VAE Encode, and lower the KSampler denoise below 1.0. The rest of the workflow is the same unless you're trying to manually inpaint or such.
Are you guys upscaling your images?
yes
@sneedingonmyligma420 which resolution and configs?
@zakotsuko my latest image has my entire workflow embedded, i don't use anything special, just that 2048pix lora that fixes the glitches that come with any kind of upscaling.
https://civitai.com/images/126765131
*you can remove the face detailer part of it though, anima doesn't really need detailers. i'm not sure why i was running with that recently..
@sneedingonmyligma420 Why the R rating :(
I feel like I get better results training LoRA at a 1024x1024 resolution rather than 512x512. What do you all think?
I hate ur corny styles
There's a thing I would like to see in the next version of this model: The ability of using e621 tags with this model.
Same, there are a lot of concepts on E621 that I'd like to see in Anima. But I can also see why people on the Huggingface Anima discussion don't want it in, there's a lot of low quality artists on E621. We'll probably have to wait for someone to make a finetune of Anima with E621 in it.
I think its better if someone made a furry fine-tune (I know, easier said than done).
This model does not have a large number of parameters, so the amount of information it can store is quite limited. If it is trained using low-quality datasets, a significant portion of its parameters will inevitably be used to store that information. As a result, its ability to retain high quality knowledge will be compromised, AKA catastrophic forgetting.
@Big_Soda, I know two people I think would accept the challenge of Retraining Anima with tags used in Danbooru and e621, and complete it...
1. duongve13112002, author of NetaYume Lumina (Neta Lumina/Lumina Image 2.0).
2. Chenkin, author of Chenkin Noob XL (CKXL), a model which is a continuation of NoobAI-XL (NAI-XL).
any bugs or other stuff post it here https://huggingface.co/circlestone-labs/Anima/discussions
I'm speculating loras from v2 will need to be retrained for v3, it seems the huge text generation improvements get bottlenecked heavily by style and character loras. Otherwise, this is a pretty big step up from v2 for text generation alone. Very promising!
*And yes i should at least iterate the obvious; there's nothing wrong with the need for retrainings, this is a WIP project. It is what it is. Just pointing this out because it seems numerous people are speculating a very strong backwards compatibility with v2 loras for v3 but i don't think that's the case.
**with that said, i'm also a tard that was literally prompting "laughing and talking" in my test prompt and didn't notice it was causing certain issues like degraded text and text boxes until it was already late. so that part is my bad. lol.
Bald Cory
@Big_Soda i love eeting yeeros
preview3 works quite decently at 1.5mp (1024x1536 or 1536x1024). Feels even more prompt adherent?
yea i feel it that way too, better prompting understanding
also: ultimate sd upscale works quite good with anima. Not sure if it was like that in previous versions, but jsyn
@Lostcut whats your setting on ultimate sd upscale? I always got bad result with my usual setting
@Leonmitchelly I tried it here for example, lower nodes https://civitai.com/images/126844189
(comfy workflow). Not sure how to make it in a1111 though. I almost didn't change anything, just changes to euler.
HEY BUDDY HELLO. IF YOU ARE GOING TO RELEASE A MODEL YOU INCLUDE MORE THAN JUST TEXT TO IMAGE. USING LANPAINT IS INCREDIBLY SLOW WITH THIS MODEL. PROBABLY SHOULD HAVE THOUGHT ABOUT MORE THAN JUST TEXT TO IMAGE BUDDY COME ON WHAT WERE YOU THINKING
ALSO WHY CAN'T ANIMA GIVE ME A BLOWJOB? AT THE VERY LEAST IT SHOULD BE ABLE TO TUCK ME INTO BED AND TELL ME THAT I'M A GOOD BOY. IS THAT REALLY TOO MUCH TO ASK FOR? CIRCLESTONE, MORE LIKE, SQUAREROCK, AM I RIGHT?
Shut up dumb fuck
@Big_Soda 💀
This is an incredible model even when using base Civitai generator. By far it has better prompt understanding and ability to include multiple characters than Illustrious or Pony! Looking forward to Lora support.
https://civitai.com/images/126924491
Considering this is a preview version the model already looks extremely promising. Best of luck with the project
You might have wanted to train this more on prompts and less on danbooru tags. And, I mean, a 0.6B text encoder? Why didn't you choose the 4B?
There has been a lot of discussion about the text encoder on the anima huggingface page already.
Here is an explanation by tdrussel:
https://huggingface.co/circlestone-labs/Anima/discussions/67#69af073d3ffb4d937d386f4b
Yeah, yeah, none of the CircleStone members know as much as you do—you're the one who truly understands all of this, the one who knows best how to train models.
@theally can you please add Beta scheduler to the inference pipeline. Simple and SGM Uniform are not that accurate unfortunately.
Is the anima training available in this website?
Can this model generate "basic" anime style like WAI-Illustrious with "anime screencap" tag?
This model knows 59,000 styles, not all of which look great, but you can find them. The top styles are the same as in NoobAi – Illustrious.
It should be able to, try "anime screenshot" or "anime coloring".
This model is freaking crazy
Does anyone have a link to a collection of all the characters and styles this model supports? It would be useful for wildcards.
I don't think there is a list like that. For styles you can look at the style tags on danbooru. Any tag with 150+ images on danbooru should work more or less. Also here is a style explorer for anima, but it's based on preview1 or preview2 I guess: https://thetacursed.github.io/Anima-Style-Explorer/index.html
For characters you can try the same as with style tags, but I don't know how many booru entries it needs to be consistent.
https://github.com/BetaDoggo/danbooru-tag-list/releases/tag/Model-Tags
it's organized into a .csv, each row is (tag, category, post count, aliases)
category indicates what type of tag it is; specifically 1 is style, 4 is characters
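Given the row layout described above (tag, category, post count, aliases), a wildcard list can be pulled out of that CSV with the standard library. This is a sketch under assumptions: it guesses there is no header row (rows without a numeric post count are skipped anyway), the 150-post threshold comes from the rule of thumb mentioned earlier in this thread, and the sample rows are invented for illustration.

```python
import csv
import io

STYLE_CATEGORY = "1"      # category values as described in the comment above
CHARACTER_CATEGORY = "4"


def tags_by_category(csv_file, category, min_posts=150):
    """Collect tag names of one category with at least min_posts posts.

    Rows are assumed to be: tag, category, post count, aliases.
    Underscores are replaced with spaces to match the prompt format.
    """
    names = []
    for row in csv.reader(csv_file):
        if len(row) < 3 or not row[2].strip().isdigit():
            continue  # skip a header line or malformed rows
        if row[1].strip() == category and int(row[2]) >= min_posts:
            names.append(row[0].replace("_", " "))
    return names


# Invented sample rows; replace with open("your_file.csv", newline="", encoding="utf-8").
sample = 'some_artist,1,320,"alias_a,alias_b"\nrare_artist,1,40,\nsome_character,4,900,\n'
print(tags_by_category(io.StringIO(sample), STYLE_CATEGORY))
print(tags_by_category(io.StringIO(sample), CHARACTER_CATEGORY))
```

Remember to add the @ prefix to the style names when using them as artist tags in a prompt.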
I feel dumb for asking this but how could it be adapted to run using forge or reforge set ups.
Does not work on vanilla Forge (and reforge I think). You need to install Forge Neo, if you don't want to use comfyui (I think there was another fork of Forge supporting it, but I don't remember...).
Use forge neo.
@RisingV @Madafada1991 I do appreciate the help. Thank you both. I do hate to say I bit the bullet and learning comfyui lol. Images kept coming out off.
Whats the difference between preview 2 and 3?
Quote from anima huggingface:
"Preview3
Highres training is in progress. Trained for much longer at 1024 resolution than preview2.
Expanded dataset to help learn less common artists (roughly 50-100 post count)."
This model seems incredibly promising, but Illustrious still handles artist tags far better. I can't jump ship until that gets sorted out.
Can you provide some artists that doesn't work? Just interested if there are some specific ones.
There are some artists that work only with additional keywords ("sketch", or "jaggy lines") as of now at least. Also some are kinda sensitive to quality tags
@Lostcut I haven't had one outright NOT work, the consistency is just all over the place. With Illustrious, I can slap the artist tag in and know every single image will have the exact style I want, even if the posing or whatever isn't what I am looking for.
@BingusChungus Do you have an example for the claim, that artstyles are not consistent with minor things like changing posings etc? Because I honestly haven't seen this. But I haven't tested dozens of artists either, only the few I like a lot.
I noticed that the effect of score_* tags is too strong and weakens artist tags.
Examples needed. Do you add @ before artist tag as is required?
@Volnovik Yes, I always use @ before the artist. The artists I use the most are Nyantcha, Sam Yang, Hungryclicker, Hintobento and Rizdraws.
@deitychaser The artists I use the most are Nyantcha, Sam Yang, Hungryclicker, Hintobento and Rizdraws. I usually have to do multiple gens to get one with a style even close to what I want. Face detail also feels worse for me, even with adetailer.
@BingusChungus Rizdraws has only 10 images on booru so that won't work, and Hungryclicker is two words ("hungry_clicker") on booru. Other than that I would expect the others to work. What sometimes can help is to let gemini flash 2.5 caption an image of an artist you want to use and specifically task it to describe the style and then use these sentences in your prompt. In the realm of artists with 200-500 images on booru I also think Anima still has a way to go to improve. But Nyantcha with over 2k? Should be solid - you may want to nail down the style with year tags or period tags if he has changed his style over the years so that you get a clean aesthetic and not the average of all his style periods.
@deitychaser I don't type Hungry Clicker as one word when generating, I use a space. Rizdraws is just "Riz" on Danbooru though, he has 750 images on there.
@deitychaser jsyn https://thetacursed.github.io/Anima-Style-Explorer/
There is a style explorer for styles
And yeah, it seems like all of these should work :(
It is weird that there are problems.
I also found some problems when i try to mix styles, loras are much better for that imo. Or prompt-control with [ | ], but it is still not a perfect approach.
@BingusChungus good luck
to be fair, this model feels like ILXL 0.1, with v-pred touch and not 1.1 or subsequent version. its good, but not at the best state, yet. well, its just preview version, once the full version is out, i believe its likely better.
If preview 3 is this crazy, I can not even imagine how wild the official release will be, just need more works to add recent characters, some fix with artist tags and eyes , but so far is so damn crazy
Anyone know if you can use Anima 2 lora with Anima 3 without any issues
is work for furry and dragon?
yeah but you need a furry Lora
Yes but not very well- as the other commenter stated it's better with a lora. Best to just wait for the final weights and resulting finetune :)
SOTA style faithfulness, but it's still undercooked, eagerly waiting for more, and hopefully a furry version
If it doesn't work for furry, I'm sure developers excluded this from dataset. There is no way model wouldn't know it with so much data on internet. Honestly, I think it's for better... If you want furry - you may do finetune or train a lora.
Just wait for a finetune of the final weights :)
I hope they exclude furry garbage from the dataset
Any furry character with 200+ posts on danbooru should be possible to make in anima, some may need added support tags though... And I advise following the tag order template for better results...
Are there any usable ControlNet models at the moment?
How much VRAM I need to run this locally?
8, maybe. or less
I get under 8 GB usage with 1.5x hires.
I'm able to do it with 6GB, but it takes like 8-12min per image
the complete package safetensors (with vae and te) is 5.242GB, 6GB vram and 16gb ram pc should work.
the complete package but with a Q8_0 GGUF model (with vae and safetensor te) is 3.467GB, a 4GB vram and 16gb ram pc should work.
the complete package with Q8_0 GGUF unet and te and a safetensor vae is 2.971GB
@bliedropbes theoretically, if CLIP would be unloaded after prompt would be encoded it may be possible with much less VRAM, so the full model would be loaded into VRAM instead of only part (as CLIP is occupying a lot of space in VRAM even when it's not used in generation, it's used only in condition encoding). The problem is I'm not sure it's possible to easily make such a chain where CLIP would be unloaded into RAM before Comfy would try to load unet into VRAM.
Results are promising! I'm looking forward to the official release :)
Is it possible to run it in forge neo? I get errors even when I added qwen text encoder and vae
im using it fine.
@dehainc sick response lmao
In top left select anima under UI preset, checkpoint to whatever anima you're using, VAE/Textencoder set to qwen_image_vae and qwen_3_06b_base
Also for sampling method I use ER SDE and schedule type I use Beta. No idea if this is best, or recommended, but it's what I use and it works
make sure to select the correct Type at the top left (Flux/SDXL/SD and the other one) - it didn't work for me in forge until i selected the "main" mode on top, "luna or luma" or whatever
Anima's LoRA training is incredibly straightforward: I trained a perfectly fitting model in just 1700 steps, and preview3 has seen significant improvements across the board. Can't wait to see the official release.
how did you do it?
@ttaetherai I used DiffPipeForge (in WSL) as the trainer, with 104 sample images tagged by Gemini Flash, set repeat to 2, and trained for 8 epochs, ultimately achieving very good results. Based on my training experience with Illustrious, this number of training steps is actually quite low, yet the outcome was excellent. The model accurately captured my art style, so I decided not to continue training. The entire process consumed only about 9GB of VRAM; I used an RTX 5080, and the training took a little over an hour.
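As a sanity check on the step count quoted above, the usual arithmetic is images x repeats x epochs, divided by the effective batch size. The batch size is not stated in the comment, so batch size 1 is an assumption here; it is the value that lands close to the "1700 steps" figure. Different trainers may count steps differently.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size=1, grad_accum=1):
    """Estimate optimizer steps for a LoRA run.

    batch_size=1 and grad_accum=1 are assumptions; the comment above
    only gives the image count, repeats, and epochs.
    """
    steps_per_epoch = math.ceil(num_images * repeats / (batch_size * grad_accum))
    return steps_per_epoch * epochs


print(total_steps(104, 2, 8))  # 104 * 2 * 8 = 1664, roughly the 1700 steps mentioned
```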
@NGTYK thats the thing I was trying to do yesterday. I hate CLI. after installation (Provided Installation is successful), I prefer to do things in a GUI because typing takes too much time, and you have to know what to type. I DONT.
There are no guides and I dont like bothering people to ask them if they can show me exactly how, because that will take their time away and who do i even ask?
i spent 10 hours yesterday just trying to install diffpipe on wsl yesterday and dependencies were never installed. because i dont know where to install them, i had problems with permissions so they kept installing in home folder or whatever and I ended up with a 110gb vhdx with no diffpipe and I got fed up.
its not straight forward, and it should be. it has to be. theres no reason for it to be impossible.
Actually, I don't know much about WSL either, and I'm not familiar with using command line to operate these things. Anyway, every time I encounter a problem, Gemini 3.1 Pro can help me solve it.@ttaetherai
@NGTYK thats what i did, sort of, grok failed, kimi got further but failed, and then gemini got me as far as setting up docker, the ubuntu vhdx and install the repo, but the repo never ran. ended up with that bloated vhdx. and gave up.
but all is well now. @CitronLegacy Really came through with their new trainer. thats running without any headache.
if you interested - its on their github. much simpler to install than diffusion pipe. and works natively on windows.
@NGTYK can we talk about it in discord or dm?
when trying to run prompt gives " 'NoneType' object has no attribute 'clone' " error
The model is natively supported in ComfyUI
any gui to train lora that is windows based and local?
i tried doing docker and linux and its a mess and its just not worth the time.
something straightforward installation and gui with all the settings we need.
anything????
You can try @CitronLegacy Anima LoRA Trainer: https://github.com/citronlegacy/citron-anima-lora-trainer-ui
@FinisherStrike Thanks for the shout out! I just updated the app to work for training on anima-preview3. Also my colab anima trainer supports v3 too https://github.com/citronlegacy/citron-colab-anima-lora-trainer/tree/main
@CitronLegacy 🥺 THANKS. also thanks for the best pokemon loras since sd1.5
@FinisherStrike THANKS!
@CitronLegacy the git said initial release 7 hours ago
you have any idea what i did before that 7 hours ago? i pulled my hair and teeth out trying to get wsl/docker to work and it didnt. it didnt work. and I gave up and went to sleep to waste my sunday away.
@ttaetherai My app was designed to work with just python on Windows/Linux. You should be able to simply download my code then run "setup_for_windows.bat" and once its installed run "run_windows.bat"
Initial release was just a release before I upgraded to supporting v3 so I had a nice backup in case I wanted to get the old version. You can ignore that. Just download the zip from the green code button so that you download the main branch.
TLDR
1. Download code
2. run "setup_for_windows.bat"
3. run "run_windows.bat"
I dont have a windows computer so I can't test it but I've heard that it works on windows.
@CitronLegacy Hi, I sent you a DM about the experience. I also already trained my first test Lora, which went along without any issue. Thanks!
I would recommend to try preview1 - it's better if you want more diversity. Preview3 have a very strong "AI look". I don't like it.
Post this in https://huggingface.co/circlestone-labs/Anima/discussions
"the AI look" is any model that isn't suffering from undertraining haphazardness
it especially pops up if you use negatives that aren't strictly conceptual, e.g.
"worst quality, low quality, score_1, score_2, score_3, lowres"
will summon a very particular look
solutions include using an undertrained model for hallucinating a base image, then refining with a more trained model, among Other Things
preview 3 is mostly better, though I think we'll have to wait until grokking is reached to see what Anima is truly capable of.
Any program for anima 3 training ? In lightning.ai
for lora look at the standalone anima trainer. for checkpoint it seems comfy? idk about training checkpoints
I love it! However it still doesn't understand certain concepts like: Licking, anal play (it knows some, but it could use a bit more), and some light domination like foot on back, foot in mouth, etc.
Keep up the great work!
FR. only and only downside of this model is it cannot understand insertion stuff.
Is it normal for the image to have artifacts after upscaling (1.5x) in img2img?
Forge Neo*
yes you can't Always fix stuff with these
Yep, same as PonyXL and Illustrious, waiting for training on larger resolutions
Both online and local generation deliver incredible results. Thank you so much for creating this fantastic model. If only you could agree on a license with Civitai to train LORAS models, Anima would be a resounding success. Best of luck, and I hope future versions are just as good. 💖
can i dm u , im having prob to gen img with it
Can I use it on A1111!?
You can use it with forge NEO
I tried training multiple loras recently and found out they are definitely better than the illustrious ones! Great Work!
me at first: meh whats the difference?
LLM-based text encoder
me: ok you have my full attention
i like illustrious but gawd i hate prompt fighting
hopefully this one will sort it out or at least lessen the frustration
my only nit pick is lack of base realism (kind of general all in one) support but i guess you get something at the cost of something else...
U actually can get good realism, if u finetune Lora or checkpoint I did train 1 Lora for test and the encoder did good job at caption realism model
@reilgun thats promising to hear
I'm really looking forward to the day it's finished. Compared to version 2, version 3 seems to have more accurate overlapping limbs when generating multi-person images. Is there any plan for Anima to support Chinese natural language? I've tried using Chinese but it doesn't generate the correct content.
I don't think so It would be great if all the models Support All the world languages
It's not a Chinese model. If you want Chinese text, use a Chinese model.
Also, at this moment, Anima does not generate text on images.
creator is great, but the problem of fingers and toes needs to be improved
It seems a matter of gitgudding.
I'm generating them just fine.
In my experience that problem is mostly introduced by loras
I've noticed they can be hit or miss, a lot of my good ones have kinda unusually long fingers too.
Also had a concept where... it kinda understood the idea but it just could not execute it with the hands lol gave me extremely deformed hands trying.
Let's try this bitch out. I'll write back if it's RIP Illustrious moment
Honestly, it's good but not that great. It follows natural language prompts 4/10 times. Simpler stuff is good, especially if you want to have some text rendering. Groups of people turn out pretty detailed but the model has a hard time following what you want.
For corn stuff, it's mostly danbooru tags all over again. And I think IL is much more consistent with it lol.
IL is still better, maybe even much better if you want to do corn stuff knowing danbooru tags.
At the very least, IL is fast af, anima not so much
So we can't use this with Illustrious LoRAs? Damn.
Illustrious is SDXL which means years old.
Anima is a completely different beast, even though it is only 2B parameters, it uses them incredibly well. It is in some ways better than Qwen for anime.
And compared to SDXL/Illustrious, Anima is a hundred times more detailed.
It's a different model architecture, so no.
@cooperdk yeah, I like that. The only issue is trying to draw unpopular characters with few or no online images to use as reference; LoRAs are needed to bypass this issue.
It’s getting better and better. I hope the full version will be released soon.
Nah need more
What scheduler would you recommend using with er_sde?
Made a test with preview1 version Anima CAT. Here is the comment I left on that model's page regarding this. In my opinion "Beta" scheduler works best (currently not available in civitai generator), at least in terms of prompt adherence.
There is an image post attached to the comment, where you can see for yourself.
No idea what i am doing wrong, but me and my friend spent HOURS trying anima but every result (except one) were AWFUL. we used a site for artist styles (anima 2B) and tried a number of different styles that I liked but all results given were like SD 1.5 but with zero negative prompts.
My friend knows what he is doing as he was the one tutoring me through with it but i gave up after around 3 hours of trying and yet people are saying it's the best ever and no one can top it. yet for me i went back to ILL and the very first generated image was great and high quality.
No idea what the magic is about with this Anima but it sure doesn't like me using it.
You definitely did something wrong. And you do not prompt this with booru tags, but in natural language.
@cooperdk wrong, you do both for best effect, first you write in tags, then you write in natural language.
In Anima you need @ before the artist name for a strong effect, but to tell the truth, it's not good. Copy and paste other people's prompts (plus seed, scheduler, and sampler) and see if that reproduces their image. If it does, good: it means your prompting was the problem. If it doesn't, you set up your workflow wrong; drag the image from the site and drop it into ComfyUI to copy the workflow (from anima prev).
Anima does some things better and some worse. It's certainly the only model that can do something coherent at a resolution as low as 320x320, and it can do good-enough renders at 8 steps. But it's 4 times slower than Illustrious, and it's not as good with text as Chroma/Flux. But Chroma/Flux are 10 times slower than Illustrious, so fair game?
I have the same problem. This looks like basic SD 1.5 to me too.
@AltairTheArc I really like this about Anima because just natural language alone can be kind of a pain in the ass and Danbooru tags are very simple. So it is really the best of both worlds.
Anima quality is 1024x while Illustrious 2.0 is like 1536x.
There's a whole bunch of other reasons. But basically, as of now Anima is nowhere near as close to Illustrious as most people think.
It's a good model and all, but it fits somewhere between Pony V6 and Illustrious in terms of quality.
@LatteLeopard that's what i see too, illustrious is still the best imo but, with what people are saying anima does well with multiple people in a single image which i try to do sometimes with regional prompter in illustrious and a lot it doesn't go so well.
But other than that, I don't see anything as amazing as what people are saying about it, like the "it will change anime forever" kind of vibe people are giving off. Especially since we're still at the stage where AI cannot identify a lot of characters from anime, games, visual novels, and so forth.
@atmogenic
lmao, yeah your experience is basically what most people feel. They try it, get some mixed results and then go back to ILX or Pony.
Honestly, I don't think it will beat Illustrious, and I'm starting to think that might not be their goal after all.
This model just wants to focus solely on anime illustrations and be the best at doing so.
I imagine its final release will compete with Animagine XL
This should put into perspective how intensely the focus is on anime.
Anima is a 2B parameter model that's mostly anime, though it does use LAION-POP and DeviantArt data. LAION is considered cutting edge, but they pruned anything that isn't 2D illustration.
Also, for some reason they used 2MP (1024x) images while ILL uses 4MP (2048x).
Illustrious is around 2B parameters too, but it's a powerful mix of e621, Danbooru, Pixiv, etc.
It's based on SDXL, which is reliable, predictable, and comes with LAION datasets already trained.
Anima uses a mix of Qwen(for text encoder + VAE) and Cosmos-Predict2,
the base model they trained on was a custom version of Cosmos-Predict2-2B-Text2Image
Actually, anyone can try training additional datasets from huggingface onto it. That would give it more danbooru + e621 data. Also it would increase the quality, because the datasets use 4MP images.
yeah i think but needs months
Thank you so much for your hard work.
The model looks amazing.
Also, would there be an "XL" version of the model? It's only 4GB now, and it seems like there's still a lot of room for growth.
How do I do a 2nd pass with this model? I upscaled the image after the first pass using an upscale model, and then I did a low-denoise 2nd pass on the upscaled image, but it was completely distorted and pixelated. Will this only be possible in the full version? I saw the hires LoRA, but 2048x2048 is still too small. Sorry, but images without upscale + 2nd pass to refine do not look good. A 1st pass image is unusable: too small, too blurry, too pixelated.
Post bugs and bad stuff here https://huggingface.co/circlestone-labs/Anima/discussions
Just want to ask : " the top score is score_9 or score_7 ?" Thank you so much ~
score_9 just like pony^^
just want to say thank you for the new model! It looks amazing and has a lot of improvements. Thank you for all the effort from your team.
May I ask why the model generates a penis in green?
It's a Greenis
Orc dick supremacy, the model is racist
What is the dataset cutoff for v3? This model is smaller and better than the SDXL family because of its LLM text encoder, but that also makes it about twice as slow as SDXL, which is a big letdown.
The DiT architecture is likely the reason.
First, thank you very much for the models. I even stopped using IL as my main model type after seeing how powerful Anima can be. But am I the only one getting better details and way better prompt adherence with Anima Preview 2? I tried finetunes, too, and in all cases I'm getting better results with models based on Preview 2 instead of Preview 3.
Hi, is it possible to use "Anima" on AUTOMATIC1111?
If you are referring to the web UI, then yes, I am able to run it using ForgeNeo.
https://civitai.com/images/127669768
As shown in this image, you need to specify the Preset, VAE, and Text Encoder.
Therefore, you must use Forge Neo, which is a fork of sd-webui-forge-classic.
Really looking forward to the full model. This really reminds me of my experience with old Dalle, and that could just be nostalgia talking, but I'm really enjoying this one!
Are you sure you didn't use any synthetic data? I feel that several of the generated images have fairly obvious Niji features.
It's what happens if you mix 3D and 2D data in a model: the average look is the known slop look.
forge doesn't recognize model.
The main branch of Forge hasn't been updated in like two years. Try using Forge Neo; it's regularly updated and has anima support: https://github.com/Haoming02/sd-webui-forge-classic/tree/neo
Just tried. This model does not support art style fusion as well as IL models do. It's going to rely largely on art style LoRAs in the near future.
Also the author is somehow planning to delete artist tags for further commercial plans, so guys, don't be overconfident
Through experimentation, I've found out that tagging art styles such as "[@artist1, | @artist2, ]" can merge art styles in a semi-consistent manner compared to the more common "@artist1, @artist2, ". It's worth a try.
As for the "deleting artist tags" thing, got a source I can read? That's genuinely upsetting to hear and it'd be a huge shame if true
@Cazex Sorry, I was mistaken. It's character names that may get removed, because he's actively seeking commercial licenses, so considering Midjourney, that might happen. But you're right that it's unlikely to go that way.
https://huggingface.co/circlestone-labs/Anima/discussions/37
@EmonDante LMAO you have brain damage
@Cazex yup, use it too. Also helps [@artist1 | ] just to make some styles weaker (some styles just tend to overwrite pretty much anything)
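For anyone who wants to generate the bracket form from this thread without typing it by hand, here is a minimal sketch. It only builds the prompt string described above ("[@artist1, | @artist2, ]", or "[@artist1 | ]" to weaken a single style). The helper name `alternate_artists` is made up for illustration, and whether your frontend actually interprets the brackets this way is an assumption; the thread does not say which UI was used.

```python
def alternate_artists(*artists: str) -> str:
    """Build the bracketed artist string described in the comments above.

    Hypothetical helper: it formats text only and does not talk to any UI.
    """
    if not artists:
        raise ValueError("give at least one artist name")
    # Anima expects an @ in front of artist names; avoid doubling it.
    tags = ["@" + name.strip().lstrip("@") for name in artists]
    if len(tags) == 1:
        # Single artist paired with an empty alternative, to weaken the style.
        return f"[{tags[0]} | ]"
    # Several artists: each tag keeps its trailing comma, as in the thread.
    return "[" + " | ".join(f"{tag}," for tag in tags) + " ]"


print(alternate_artists("artist1", "artist2"))
print(alternate_artists("@artist1"))
```

Paste the printed string into the positive prompt in place of the plain "@artist1, @artist2, " list and compare results on a fixed seed.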
You use a1111 or comfy?
Guys, TDRussell just flew over my house and told me he's going to stop making Anima in 2 more weeks because he's bored. Then he spit in my mouth and told me that no one would believe me.
@EmonDante Ctrl+F there's literally 0 mentions of "name" or "character" in that entire discussion. What are you smoking?
It makes nearly no sense to first train a model on tags such as artist or character names and then remove them... you would be nearly back at square one in the training process.
Trying to use the model and it keeps telling me that it ran out of memory or another weird bug with a lot of words. This is on the example image too. Don't know what I'm doing wrong
Why does image-to-image turn gray or add noise?
good when it's good, but very unstable, hands can be decent or SD1.5 tier depending on action and seem to get worse when adding artist tags?, sometimes randomly piss filters all over an image and prompting cool hues turns characters into na'vi, but i'm really excited for future versions
Terrible default art style. 0 art style support. This model is astroturfed.
there is art style support, you just have to add an @ at the beginning of the artist tag.
I DO NOT AGREE!
tfw civit users are too retarded to read documentation
you can train your own art styles bub
Even if you ignore the fact you can just prompt for an artist style, why would you want a model that actively has a strong style bias with no artists? That's how you end up with WAI or Pony...
In my opinion, the effectiveness of artist tag is still quite limited compared to other models. However, its natural language processing capabilities are truly outstanding, and I look forward to the official release.
I’m not sure if it’s just me, but when generating Preview 3 on Forge Neo, it seems best to use LoRAs made after the release of version 3; LoRAs trained on Preview or Preview 2 sometimes cause the output to break on Forge Neo. LoRAs really are crucial.
you dont have to make your message in bold letters, you are not special.
What are the advantages and disadvantages of this model compared to Illustrious?
+ great prompt adherence
+ prompt not limited to booru tags because of natural language
- still on beta phase
- stuck on 1024x1024 pixel
details up x 1000
The downside is that there aren't many models and LoRAs, so I'll keep using others.
The bad thing is not in Illustrious
After extensive testing, 1920x1080 at 12 steps with your own well-trained LoRA is all you need for Anima. It works the same way as the official Turbo LoRA. It's absolutely a waste to run 30 steps, and you don't need the official highres LoRA to boost to 2160x2160 square, because the native 1920x1080 limit is much more practical at the production stage.
I think this model has been overtrained on some characters.
Running a prompt like "masterpiece, best quality, score_7, murakumo_(kancolle) , sailor collar, skin tights, sitting on boy, straddling, 1boy, boy fully visible, " with negative "muscle, worst quality, bad quality", CFG 5, 30 steps, ER SDE with beta on Forge Neo gives you a POV, from-behind image of that girl with no variety, just almost the same pose.
Dark magician girl is the opposite case: she gets a wide variety of images from the same prompt when I tried.
its giving you exactly what you're describing, which is nothing at all. you prompt 1boy and nothing else - its giving you a trap murakumo straddling the only thing it could with a 1boy prompt - pov
@plan_truster doesn't matter if you add 1girl, 1boy, hetero, same result. I have more prompt if you want to see them. The same problem rarely happens if you try on character like dark magician girl that I mention.
Add pov to the negative and "full body" instead of "boy fully visible" to the positive. And then add an angle. Like your issue is that its biased towards from behind - why dont you prompt something different like from side or straight-on. Or add from behind to the negs? Its not that complicated, yet you - without even trying - race to conclusions about model being overtrained?
@deitychaser my man, read my comment again. I didn't say anything about it being overtrained in general. I even used another character, dark magician girl, that works perfectly fine with the same model. Other base models like NoobAI don't have the same problem with murakumo: the same prompt (not the one I mentioned) gives different variants of the result. That is something I noticed and wanted to point out for improvement.
My guess is that prompt is too vague, add some description to the boy or the scene like:
pos: "score_9, score_8, score_7, masterpiece, best quality, amazing quality, very aesthetic, high resolution, ultra-detailed, absurdres, newest, safe, 1girl, 1boy, murakumo \(kancolle\), kantai collection, sailor collar, skin tights, sitting on person, sitting, on chair, straddling, full body, the boy is sitting on the chair"
neg: "bad quality, worst quality, low quality, score_1, score_2, score_3, lowres, distorted, signature, watermark, patreon logo, artist name, poor lighting, worst detail, multiple views, logo, patreon username, web address, dated, bad hands, extra arms, extra legs, amputee, mutation, extra digits"
@FinisherStrike Looks like removing masterpiece and best quality and adding score_9, score_8, score_7 gave a better result. I tried that on the Fate/Grand Order character reines el-melloi archisorte and it gave a better result.
Any recommendations for imitating poses, something similar to ControlNet (Forge Neo)?
I ran a few experiments:
Using the same parameters from TensorArt on my local pc results in two wildly different images. (I'm using Forge Neo).
I assumed the possible culprit was the VAE, so I ran some tests, but besides 'qwen_image_vae' on my local computer, all other VAEs result in black or grey images.
Using 'qwen_image_vae' produces different images compared to what I was getting on TensorArt, so I tested again, and I ran using the same seed and parameters, but different VAEs on TensorArt, and I got identical images each time.
Therefore, TensorArt misreports which parameters it is using with this model and does not tell you the real ones. Kinda bummed, because it produces better results than what I'm getting on my PC.
very GOOD model!!!!!
it can understand natural language.
thanks a lot!!!!
BTW,
is it possible to train rare prompt, like bodypaint/scat?
(AI translation)
Compared to my go-to model WAI-illustrious-SDXL, Anima has clear advantages in natural language understanding and artist style replication.
However, there's one major issue with Anima that I find quite regrettable, though I'm still looking forward to future versions.
The issue is a bit strange to describe.
WAI seems to have much stronger style mixing capabilities. I can easily blend the styles of different artists, fine-tune the current style, or create something between two styles.
Anima can mix styles too, but the result feels strange. It seems like it's not just mixing visual effects, but also associating semantics.
Here's an example using akai sashimi and modare – both artists produce work that is simultaneously realistic and not realistic.
Akai sashimi's work is very realistic in subject matter choice.
Modare's work is very realistic in texture rendering.
But both share a common trait: they generally don't draw nostrils.
When I mix akai sashimi and modare's styles in WAI, the result is very intuitive. I don't feel any issues at all.
When I do the same mix in Anima, the result is… odd. It's like it kind of looks like them, but also kind of doesn't. It feels like someone who has never actually seen these artists' work is trying to guess and draw based on someone else's verbal description. And the probability of characters having nostrils is strangely high – something that simply doesn't happen in WAI at all.
(AI translation) Yes, I am experiencing this issue as well. And, if you pay attention to the feedback from the community, you'll see that you're not alone in this matter. The problem I encountered was not only that the artist's style could not be blended together, but also that some artists (such as @michiking) still had subtle differences in the AI-generated image style even when used alone.
Artist mixing is a consequence of CLIP-based models like Stable Diffusion XL (Pony, Illustrious, and all of its mixes). Every model after Stable Diffusion abandoned CLIP in favor of LLMs for their understanding of context and precision. Regardless of which newer model you use, you won't get artist mixing like SDXL again. Your best bet is to use something like Prompt Editing; it's not the same, but it's better than what you get with just listing artists. I wrote an article on the subject here.
Yes, artist mixing is the legendary "bug becoming feature."
Deviantart dataset seems to have ruined the model. If you preview steps 2-4 there is a clear deviantart logo right in the middle of the image. Artist and style tags are also significantly less impactful compared to p2.
how to emphasize the prompt to make it weak or strong? like I want to make some artist tag only 0.5 strength
i use (@artist:0.5) and it seems to work
For some reason, v3 it really doesn't play nice on classic. It outputs things, sure, but does not output things as accurately as in Comfy.
Any recommended Style Lora guides? I see this account posted a "Greg Rutkowski Style" guide. Is that the best settings?
i think that was posted for an example lora with training dataset and settings. as op is owner of https://github.com/tdrussell/diffusion-pipe training toolchain.
Very good model, I have high hopes for how it will shape out to be, but preview 3 is already very competent. Good work guys and gals!
I've strangely found this model generating significantly better and more consistent results at specifically resolution 1248x1248.
Can anyone else check this out and let me know how it goes?
The new model generally handles higher resolutions better. I downloaded the hi-res fix to do 1080p and 1440p wallpapers. I did some 1080p wallpapers first and they were perfect, only to find I had a 3DCGI LoRA for Illustrious enabled and not the Anima hi-res fix, lol. It can do around 2.5MP decently in my tests.
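Since this thread mixes pixel dimensions (1248x1248, 1080p, 1440p) with megapixel figures (around 2.5MP), a quick way to compare them is to convert everything to megapixels. This is plain arithmetic, not anything specific to Anima; the 2560x1440 size for "1440p" is an assumption about what was meant.

```python
def megapixels(width: int, height: int) -> float:
    """Return the pixel count of an image in millions of pixels."""
    return width * height / 1_000_000


# Resolutions mentioned in the comments; 2560x1440 is assumed for "1440p".
for width, height in [(1024, 1024), (1248, 1248), (1920, 1080), (2560, 1440)]:
    print(f"{width}x{height}: {megapixels(width, height):.2f} MP")
```

By this measure 1248x1248 and 1920x1080 both sit below the roughly 2.5MP mark reported above, while 2560x1440 is above it.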
This model surpasses existing SDXL-based models such as Illustrious and NoobAI in several areas, including natural language prompt understanding, English text generation, separation of multiple character features, and fidelity in character and style reproduction.
Thank you for sharing such a great model!
I’m looking forward to the release of the full official version.
Hoping for a Anima-preview lora online training, it would be really cool 👍 this model is too peak
The creator wants to sell licenses, so it will take a while for online LoRA training. Honestly, this model made me train a LoRA on my 5-year-old PC.
I haven't been able to get good results yet with lora but I've been using someone else's preset since I'm unfamiliar with this model
In the next update, I'd like to see more characters (specifically from One Piece) and improvements to the art style. Additionally, the '@oda eiichirou' tag requires adjustment: the style strength is currently too low, and there is a persistent white border that cannot be removed even with negative prompts.
If you look at the images on danbooru tagged with "oda eiichirou" there are a lot with white borders, so there is little you can do about it I guess. You can try using @oda eiichirou \(style\) to get a similar style .
I like that this model is not only anime style but also has a decent cartoon art style in it, without specifying any artist.
it can do more of a western illustration style too which I enjoy, 2d models often have that ugly AI slop look to them, anime models make the eyes too big. imo it can be difficult to get that kind of style repeatedly.
@minthe the creator said he used 800k images from other sources. I assume he was talking about cartoons/western style
In Forge-Neo, I'm running into a black image and a message that says, "Encountered NaN in Latent".
I'm using the expected text encoder and VAE.
I couldn't find anything, looking around, does anyone have any leads?
I'm not entirely sure but I think it might relate to the sampler and scheduler combo. If I do Euler A and Automatic it gives me a black image with the same NaN error, but if I do Euler A and Normal or Simple then it works fine, same prompt and everything else.
@bloodmetaleyes6727
That seems to be exactly it. Huh. Yeah only Normal and Simple work and all other produce a black image.
Thanks for the response dude, you're a saint!
I hope you guys continue to support natural language prompts in your models!! it does great job!
The model is very promising, but it's clearly a bit rough around the edges. We're all delighted with the consistency of the scenes and characters out of the box, but there are a couple of nuances worth noting:
1. The dataset is obviously poor, making it very difficult for the model to grasp many concepts without detailed descriptions, but I understand that this is a downside of the preview version.
2. The very limited context means the model's attention starts to blur even with four characters with different eye colors, hair colors, and facial expressions. This is especially true with just portrait-style focus on the faces, without complex poses. Experiment and you'll notice that character 1 is more or less fine, but the emotions can blend with those of character 2. Characters 3 and 4 simply average out, and sometimes the 3rd and 4th characters swap positions, ignoring the prompt. These are all clear signs of a lack of attention on the model's part, even in such a relatively simple scene for a transform base.
3. I don't know why, but the model reacts very strongly to specific triggers. Just one short prompt can radically change the model's understanding of the scene, while the model might simply ignore a detailed prompt.
But despite all the criticism, the guys did a fantastic job. The basic goal of bringing consistency capabilities into the hands of the public is invaluable. And as I myself have noted, most of the model's drawbacks are more related to optimization decisions or the experimental nature of the model previews. I wouldn't call Anima a full-fledged "Illustrious Killer" yet, but the model already offers capabilities that no anime SD model has offered before.
On point 2, I think it has more to do with the text encoder. While it is way better than the SDXL text encoder, it also struggles with short comic pages, so it is not surprising that it struggles with multiple characters.
@Dewal76 Yes, that's right, Qwen 0.6B clearly has too little context, which is why the model starts to generalize and forget details when describing in detail. It would seem that we've finally gained the ability to create multi-character scenes out of the box, but the limited context spoils a lot. However, if we move to a larger Qwen model, Anima risks becoming too heavy and less accessible to the masses.
Not to dampen your expectations for the full release, but what you are asking for is probably not feasible for such a small model. It's a locally deployable model with a tolerable generation time on average hardware. But that doesn't mean that regional prompting and other nifty techniques for multi-character scenes have become obsolete. Use them.
I would like to point out that as per the HuggingFace discussion on "prompt weight adjustment", this model does in fact support messing with the weight of input text.
It is just that instead of "1girl, (gigantic breasts:1.5)," to get a ridiculous result in SDXL, you instead have to specify "1girl, (gigantic breasts:4.5),"
different things are more sensitive than others (e.g. '@artist' tags), you can even play with entire phrases like "(cute goblin girl surfing on a giant leaf in a cloudy landscape:1.15)" to see what funky results it summons
and yes, this also goes for NegPiP in comfyui-ppm, where you can do hilarious things like ", (worst quality:-7.5)"
turns out this is just as much FUN as SD1.5, except now we have several really strong DMD LoRAs that can be adjusted to make the composition coherent, as well as a CFG distillation, surprisingly good 'concept isolation', and a model that is not prone to collapsing when you push it
so, really, it's even more fun, ugh.. so many sillies to try
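To make the "(text:weight)" notation above concrete, here is a minimal sketch that splits a comma-separated prompt into (text, weight) pairs. It only illustrates the text syntax quoted in this thread; it is not how ComfyUI, Forge Neo, or comfyui-ppm parse or apply weights internally, and the function name `parse_weighted_prompt` is made up for this example. Nested parentheses and escaped ones like \( \) are deliberately not handled and fall through as plain tags with weight 1.0.

```python
import re

# "(some text:1.5)" with an optional minus sign, as in the NegPiP example.
WEIGHT_RE = re.compile(r"\(([^():]+):(-?\d+(?:\.\d+)?)\)")


def parse_weighted_prompt(prompt: str) -> list[tuple[str, float]]:
    """Split a prompt on commas and read an explicit weight where present.

    Illustrative only: unweighted tags get 1.0, nothing is sent to a model.
    """
    pairs = []
    for chunk in prompt.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece left by a trailing comma
        match = WEIGHT_RE.fullmatch(chunk)
        if match:
            pairs.append((match.group(1).strip(), float(match.group(2))))
        else:
            pairs.append((chunk, 1.0))
    return pairs


print(parse_weighted_prompt("1girl, (@artist:0.5), (worst quality:-7.5),"))
```

A list like this makes it easy to sweep one weight (say, from 1.0 up to the 4.5 mentioned above) and rebuild the prompt for each run.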
Well done for duplicating this lifehack here. I recently read about it myself. The ability to adjust the prompt with weights really does greatly increase the usability of the model, especially considering that the model is extremely uneven in its perception of token weight.
So is this model already that good? Stable? I tried it and it wasn't good because I didn't apply what you mentioned. Unstable models like NoobAI, Chenkin, etc., or the Newbie and Lumina models were not useful for me, but it seems that with what you said Anima can be controlled better, which would be really helpful.
@Suomsoh This model is shockingly good, as we get to toy with schizo SD1.5 style prompts, except now with a model that doesn't collapse immediately, has surprisingly good composition and concept isolation, and doesn't have a 75 token limit before quality of outputs goes down tremendously.
https://civitai.red/images/128869511
Here's an example image with a very distorted prompt, yet it could accomplish the challenge I set for it
furthermore;
https://www.dropbox.com/scl/fi/ewxt950rxoh96xiezlnun/ComfyUI_anima3-stuffs-minimal.7z?rlkey=5d3jm165295pwnzirp71vl4yf&st=l2lqtkxf&dl=1
Here's the necessary comfyui-ppm, LoRAs, and example images of a working wacky-prompt workflow in one package
@Suomsoh and yes, on its lonesome without Wacky Setups(tm), it's rather hard to get a convincing output from the model when you stray from "Your Average Waifu Printer" territory
This is a version, but in illusion.
Is this somehow working in Automatic1111?
Imagine what version 4 can do
I hope this model can achieve similar results to NovelAI.
where do i put this in comfyui? in checkpoints it gives me error
Put it in diffusion_models and use the Load Diffusion Model node.
Is there a list of characters and concept tags this recognizes? Also, please refine poses for future versions. It's obsessed with ass shots.
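As a rough sketch of the folder layout from the install notes (diffusion model in models/diffusion_models, text encoder in models/text_encoders, VAE in models/vae), the snippet below works out where a downloaded file should go. The prefix matching and the `destination` helper are assumptions made for illustration; it only computes a path and does not move anything.

```python
from pathlib import Path

# Folder names come from the install notes; the filename prefixes used as
# lookup keys are an assumption for this sketch.
MODEL_FOLDERS = {
    "anima": "diffusion_models",
    "qwen_3_06b": "text_encoders",
    "qwen_image_vae": "vae",
}


def destination(comfy_root: str, filename: str) -> Path:
    """Return the path inside the ComfyUI model directory for a given file."""
    name = filename.lower()
    for prefix, folder in MODEL_FOLDERS.items():
        if name.startswith(prefix):
            return Path(comfy_root) / "models" / folder / filename
    raise ValueError(f"don't know where {filename!r} belongs")


print(destination("ComfyUI", "anima-preview3-base.safetensors").as_posix())
```

If the file ends up in models/checkpoints instead, the checkpoint loader will error out, which matches the question above.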
Maybe put "safe" in the prompt to avoid provocative poses?
What does the dataset look like for this? I'm uncertain how to tag/caption when training LoRA for this since it uses both and tags don't seem to like the periods used in captions. At least not in edit software.
Their official Hugging Face page has some details about the tags it was trained on: https://huggingface.co/circlestone-labs/Anima
Mainly trained on anime images with additional datasets of non-anime artistic images (ye-pop + deviantart). For captioning especially, refer to this comment by tdrussell (the person behind Anima) on the Anima Hugging Face page: https://huggingface.co/circlestone-labs/Anima/discussions/9#69812bd9511f2d67952084ae
Using tags only in lora training seems to work fine in my experience.
@RisingV Thanks! How many steps have you been doing in training for Anima by the way? For awhile, rule of thumb was 2000 steps but I'm getting really mixed info with optimizers and learning rates. For instance, a few people recommended I use 0.0002 or 0.0001 with Adam but that hasn't worked out well at all with that total of steps.
Details
Files
anima_preview3Base.safetensors