Photanima is an experimental finetune of Anima Base v1.0 to see whether it is a viable architecture for photography. Spoiler alert: it totally is.
Turbo LoRA baked in. If you're on a 30-series GPU, I recommend using this with my INT8 Toolkit + INT8 Lazy Torch Compile node for wicked fast gen times. All demo images generated with that combo.
❤️ If you enjoy Photanima, you can help offset the cost of training:
🤓 Technical details
Trained on ~1500 images for 27,500 steps. This is my Snakebite 2.3 dataset with around 100 new images and some caption cleanup. Training took approximately 24 hours on a GeForce RTX 3090.
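For a rough sense of scale, here is the back-of-the-envelope arithmetic behind those figures. The batch size is not stated, so the passes-per-image number assumes a batch size of 1. Both inputs are the approximate values quoted above, not measured ones.

```python
# Approximate figures quoted above.
images = 1500
steps = 27_500
hours = 24

# ASSUMPTION: the batch size is not stated anywhere; 1 is a placeholder.
assumed_batch_size = 1

seconds_per_step = hours * 3600 / steps
passes_per_image = steps * assumed_batch_size / images

print(f"{seconds_per_step:.1f} s/step")                    # 3.1 s/step
print(f"{passes_per_image:.1f} passes over the dataset")   # 18.3 passes
```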
Pros:
Extremely fast.
Extremely good prompt adherence.
Anatomy is pretty stable. If it screws something up, changing your steps by +1/-1 usually fixes it.
Supports up to nearly 2MP with little-to-no distortions.
At first, I noticed that Photanima's style was inconsistent: it had a tendency to regress toward a cartoony/CGI look as my prompts became more complex. I was able to mostly overcome this by splitting Photanima into constituent content and style blocks, then boosting the style strength to around 4.2 in ComfyUI.
Style is pretty consistent now, but there are some notable drawbacks.
Cons:
There are significant biases from my limited dataset. For example, you have to push your prompts pretty hard to steer the model away from its default facial features/racial biases. Yes, I have a type. I suspect this won't be a big issue for LoRA training.
It struggles with certain artistic terms like "silhouette".
Microdetail quality is somewhere between SDXL and ZIT. Honestly, it's really good for a 2B model. Two-step upscaling with Anima doesn't help much, but I'm sure the results would be amazing if you sent a Photanima image to a different model for refinement. Or if that's too much work: just add a little film grain. It does wonders and requires no extra VRAM.
Model is a little too horny for its own good.
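The film grain tip above can be sketched in a few lines. This is a minimal illustration, not the exact effect used for the demo images: `add_film_grain`, its `strength` default, and the choice of monochrome Gaussian noise are all assumptions made for the example. It works on a plain nested list of RGB tuples so it runs with the standard library alone.

```python
import random

def add_film_grain(pixels, strength=6.0, seed=None):
    """Return a copy of `pixels` with monochrome Gaussian grain added.

    pixels:   list of rows, each row a list of (r, g, b) tuples in 0..255
    strength: noise standard deviation in 8-bit levels (assumed default)
    seed:     optional seed so the grain pattern is reproducible
    """
    rng = random.Random(seed)
    out = []
    for row in pixels:
        new_row = []
        for r, g, b in row:
            # One noise value per pixel, shared by all three channels, so the
            # grain shifts brightness without adding colour speckle.
            n = rng.gauss(0.0, strength)
            new_row.append(tuple(max(0, min(255, round(c + n))) for c in (r, g, b)))
        out.append(new_row)
    return out

# Tiny flat-gray test image: 3 rows of 4 pixels.
flat_gray = [[(128, 128, 128)] * 4 for _ in range(3)]
grainy = add_film_grain(flat_gray, strength=6.0, seed=42)
```

On real images you would apply the same idea with an image library or a grain node in your workflow. The point is only that grain is a cheap per-pixel operation done after generation.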
🛠️ Recommended Settings
8-10 steps with v1.1 Turbo, or ~12 steps with v1.0 Turbo.
Euler or er_sde sampler. Euler is a safe pick, but er_sde might produce better details.
Simple or Beta scheduler.
CFG 1.
Preferred resolution: 832x1216 or 1040x1520.
For maximum realism, begin your prompt with "real life photo of..."
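As a quick sanity check on the preferred resolutions against the "nearly 2MP" ceiling mentioned under Pros, the arithmetic looks like this. The helper name `megapixels` is just for illustration.

```python
def megapixels(width: int, height: int) -> float:
    """Pixel count in millions."""
    return width * height / 1_000_000

for w, h in [(832, 1216), (1040, 1520)]:
    print(f"{w}x{h}: {megapixels(w, h):.2f} MP, aspect {w / h:.3f}")

# 832x1216:  1.01 MP, aspect 0.684
# 1040x1520: 1.58 MP, aspect 0.684
```

Both sizes share the same 13:19 portrait aspect ratio, and the larger one still leaves some headroom below 2MP.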
Base model settings:
30-50 steps.
Euler sampler.
Simple scheduler.
CFG 4-6.
Use a bunch of fluff tags like "masterpiece, score_9, absurdres, best quality, highres, photo \(medium\), real life". Note: do not do this with Turbo.
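If you script your generations, the two sets of recommendations above can be kept side by side as presets. This is only a sketch: `Preset`, `build_prompt`, and the preset names are hypothetical. The sampler and scheduler strings simply mirror the names used above. Putting the fluff tags in front of the subject for the base model is an assumption, since the text does not say where they go.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Preset:
    steps: tuple       # (min, max) recommended step count
    samplers: tuple    # sampler names as written in the settings above
    schedulers: tuple  # scheduler names as written in the settings above
    cfg: tuple         # (min, max) recommended CFG

TURBO_V1_1 = Preset((8, 10), ("euler", "er_sde"), ("simple", "beta"), (1.0, 1.0))
TURBO_V1_0 = Preset((12, 12), ("euler", "er_sde"), ("simple", "beta"), (1.0, 1.0))
BASE = Preset((30, 50), ("euler",), ("simple",), (4.0, 6.0))

# Raw string keeps the escaped parentheses exactly as listed above.
FLUFF_TAGS = r"masterpiece, score_9, absurdres, best quality, highres, photo \(medium\), real life"

def build_prompt(subject: str, turbo: bool) -> str:
    """Apply the realism prefix for Turbo, or the fluff tags for the base model."""
    if turbo:
        return f"real life photo of {subject}"
    # ASSUMPTION: tag placement is not specified; they are prepended here.
    return f"{FLUFF_TAGS}, {subject}"

print(build_prompt("a lighthouse at dusk", turbo=True))
print(build_prompt("a lighthouse at dusk", turbo=False))
```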
🗺️ Roadmap
I'm pretty excited about the potential of Anima, but let's be clear: I'm not claiming that this checkpoint is a "ZIT killer." The correct model to compare this against is SDXL/IL - and I'm confident that Anima can dethrone it with enough community attention.
Directions I'd like to explore next:
There are a handful of Anima "detailer" LoRAs on Civitai. These are not intended for photography, but with enough block pruning, you never know. The right mix could go a long way.
I suspect doubling my dataset to ~3k images would make a big difference, especially if I can collect a wider range of faces, body types, and textures.
I'm eagerly awaiting the release of Anima Turbo 1.0. The current Turbo solution is based on Preview3 and I think it's holding back this model's potential a little.
I'm also looking forward to Anima support in OneTrainer. It will make trying experimental configs a lot less of a hassle compared to kohya-ss. For this v1 run, I stuck with safe values (Prodigy, 1.0 LR, no fancy flags).
Thank you. As always, I look forward to your feedback. Please share the model and upload some images to help it gain traction.
Description
Initial release
Comments (9)
Example with/without film grain + 1xSkinContrast-SuperUltraCompact upscale model to help fight against the "airbrushed" look:
https://imgdiff.net/s/fcf51a5891c6ef72369e568ef5ca00a5
These are still early days for Anima, and hopefully such workarounds won't be necessary in the future.
prolly better as a lora, right? and no turbo
Looking for feedback: I'm testing a base (non-Turbo) version of Photanima and it seems that I need to pull way back on the style blocks to avoid the "hyperreal" look with that mix.
Thoughts on the following images? Thanks!
I like the look on the right. But so far I've found that the turbo version of this model has a drastically different outcome from the non-turbo version. Like the turbo version has a more natural look than the non-turbo version.
I don't have a lora to test character consistency, however. Turbo version seems to like red color accent.
The one on the left is hotter.
One on the left looks more correctly lit and looks less plastic.
Handles high-fantasy realism very well! It does favor the itty-bitty-titty-committee though. I had to use "huge" instead of "large" or "big" breasts in my prompts to get anything close to what I would normally consider to be "medium" breasts.
This is so true 😂
There are definitely other types of women in the training data, so I was surprised to see how strong the bias is toward A or B cups. I'll need to see if it's a captioning problem.
But yeah, it's nothing "huge massive boobs" can't fix. Repeat the phrase a few times for taste.
Can Anima work on an old 1080TI, and if so, can someone share a Workflow for that?







