CivArchive

    What is this?

    I would describe it as an unusually versatile SD 1.5 model with extensive custom training done exclusively at 1024px and higher (thanks to aspect-ratio "bucketing"). It has been built up in a clean, additive, iterative fashion on an ongoing basis using CivitAI's handy online Lora trainer. It can do everything from pretty landscapes to hardcore booru-tag-based NSFW, in pretty much any style. It is not specifically an anime, realistic, or semirealistic checkpoint; rather, it is whichever of those you want it to be at any given time. All showcase images are direct generations made without any use of detailing or upscaling whatsoever (i.e. you should basically treat this like an XL model when using it), and include full metadata.

    How do I use it?

    You can use either natural language or booru tags (with spaces, not underscores). I tend to use both simultaneously: mostly coherent sentences, but with many of the words and phrases being specific tags that actually exist. See the showcase gallery for a variety of examples. In terms of resolution, in my opinion it is completely pointless to ever go lower than 768x768 with this model (as 100% of my training is done at 1024px, without downscaling or cropping anything).

    Personally, I do not ever generate lower than 1024x768 or 768x1024 with this, and more often actually do 1216x832 and 832x1216 when it comes to non-square-format images. For square format I personally stick to 1024x1024. Again, you can download my showcase images at their original resolution with full metadata to get a better idea of what this thing can do, as it is also trained on some less common "exotic" aspect ratios / resolutions too.
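    The resolution guidance above can be sketched as a tiny helper that snaps a requested aspect ratio to the nearest of the recommended generation sizes. This is purely my own illustration (the bucket list comes from the sizes named in the description; the function name is not from the model card):

    ```python
    # Hypothetical helper: pick the closest recommended resolution for this
    # model given a desired aspect ratio. Buckets are the sizes mentioned
    # above: 1024x1024, 1024x768 / 768x1024, and 1216x832 / 832x1216.

    BUCKETS = [
        (1024, 1024),              # square
        (1024, 768), (768, 1024),  # 4:3 / 3:4
        (1216, 832), (832, 1216),  # "XL-style" wide / tall
    ]

    def pick_bucket(width: int, height: int) -> tuple[int, int]:
        """Return the recommended (w, h) whose aspect ratio is closest
        to the requested one."""
        target = width / height
        return min(BUCKETS, key=lambda wh: abs(wh[0] / wh[1] - target))

    print(pick_bucket(1920, 1080))  # wide request -> (1216, 832)
    print(pick_bucket(512, 512))    # square request -> (1024, 1024)
    ```

    The same snapping idea is roughly what aspect-ratio bucketing does at training time: each image is assigned to the nearest fixed-resolution bucket rather than being cropped to a single square size.
    
    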

    Also note that if you're prompting for 2D-style images, this model DOES recognize a large selection of "by whoever" artist tags (some stronger than others), so if there's one you have in mind just try it.

    Tip: generally speaking, SDE samplers give better results with this model if you're going for realism. I'm personally a big fan of DPM++ 3M SDE GPU Exponential at around 4.0-4.5 CFG. For anything less realistic, however, you may instead want to try Euler Ancestral (or, very occasionally, DPM++ 2M Karras) at around CFG 7.0.
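    The sampler/CFG recommendations in the tip above can be captured as plain data. This mapping is just my own restatement of the stated values (the keys and structure are illustrative; sampler names vary slightly between UIs):

    ```python
    # Recommended generation settings from the tip above, as plain data.
    RECOMMENDED_SETTINGS = {
        "realism": {
            "sampler": "DPM++ 3M SDE GPU",
            "schedule": "Exponential",
            "cfg_scale": (4.0, 4.5),  # suggested CFG range
        },
        "stylized": {
            "sampler": "Euler a",     # Euler Ancestral
            "schedule": "default",
            "cfg_scale": (7.0, 7.0),
        },
    }

    def cfg_for(style: str) -> float:
        """Return the midpoint of the suggested CFG range for a style."""
        lo, hi = RECOMMENDED_SETTINGS[style]["cfg_scale"]
        return (lo + hi) / 2

    print(cfg_for("realism"))   # 4.25
    print(cfg_for("stylized"))  # 7.0
    ```
    
    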

    Do masterpiece, best quality, high quality, worst quality, and so on exist in this model?

    Yes, but their impact on the image is much smaller if your overall prompt is aimed at realism or semirealism; they have the most noticeable impact on 2D-style images. detailed background and simple background, however, DO both have the impact you'd expect on all types of images, generally speaking.

    V7.0 Eta Details:

    Better realism, and I think prompt adherence is the best it's ever been. Really happy with this version. VAE baked in as always.

    V6.5 Zeta Plus Details:

    It's not quite what ZootVision V7 Eta is intended to be, yet, but it makes some nice, if perhaps subtle, improvements. This time I tried to stress the actual depth of the model a bit more in the showcase gallery images. VAE is baked in as always.

    V6.0 Zeta Details:

    Improved basically everything, TBH. Did all the stuff I talked about in the comments, and a bunch more. Made some pretty weird showcase gens just to show off what this thing can actually do a bit more, lol. VAE is baked in as always. Also don't forget that this model does in fact know a very large number of "by whoever" booru-format artist tags, not only the specific ones you've seen me mention before!

    V5.0 Epsilon Details:

    Trained for an additional 10,000 steps on a variety of subjects (all of photorealism, NSFW, and anime have been at least somewhat refined) against v4.0 Delta. This version also introduces an Ideogram style dataset, which can be triggered by using 'by ideogram' in any prompt. See the showcase gallery for some examples. I think this is a pretty solid improvement over Delta, hope you enjoy it! VAE is baked in as always.

    V4.0 Delta Details:

    Two additional datasets merged in (one for further enhancement of photographic images of people and places, one for some experimental "tricky prompt" rich captioning stuff), both trained on V3.0 Gamma for a combined total of 9040 steps. VAE is baked in as always. All data in the new photographic dataset was tagged with photo \(medium\) in order to build on top of the model's existing understanding of that tag. This is definitely the best version yet, hope you enjoy it!

    V3.0 Gamma Details:

    1000-image "aesthetic" dataset (trained for 10,000 steps on V2.0 Beta) merged in. This dataset can be optionally strengthened by using the phrase very aesthetic anywhere in your prompt. This version has a VAE already baked in, as always.

    V2.0 Beta Details:

    Merged with 1000-image "NSFW Enhancer" dataset (trained for 10,000 steps on V1.0 Alpha). All images were at least 1024px on at least one side, up to a maximum of 1216 (for XL-style 832x1216 portrait / 1216x832 landscape images, of which there were a fair number).

    V1.0 Alpha details:

    My (incomplete) attempt at a truly general-purpose high-resolution-focused SD 1.5 model, in the sense of anything from pretty landscapes to hardcore booru-tag based NSFW porn.

    Uploading to CivitAI in its current state basically for the sole purpose of using their Lora trainer for a few more 1000-image datasets I need to train and merge into this thing. Feel free to try it out regardless if you like (it knows many characters; see e.g. Jinx in the showcase), but expect relatively different results from later / the final version.

    General (always relevant) details:

    DO NOT blindly assume that Clip Skip 2 is always "correct" with this model; it is not really traditionally NAI-derived at all. I'd instead recommend trying both Clip Skip 1 and 2 if you've found a particular seed that you mostly like but that isn't quite "there" for a given prompt, as in my testing both give good results under different circumstances.


    Comments (17)

    ZootAllures9111
    Author
    Jul 1, 2024· 1 reaction

    A couple of the showcase images for Zeta are uh, "blocked", we'll see if they get unblocked I guess. Hope people like it in any case!

    EDIT: Yeah, they're unblocked now.

    Simpson3453Jul 4, 2024

    Your training is definitely working! It's interesting to see that even just your text encoder used with other unets improves those models too. Can you keep going? It's amazing to see how far SD1.5 can go!

    ZootAllures9111
    Author
    Jul 4, 2024

    Glad to hear you like it! And that's interesting about the encoder, what other models have you tried it with?

    Simpson3453Jul 16, 2024

    @diffusionfanatic1173 one of the best 1.5 models on civitai for high resolution compositions, Nyan Mix. It's a really well trained model https://civitai.com/models/14373/nyan-mix

    ZenythJul 5, 2024· 1 reaction

    I've only tried Epsilon and Zeta, but these models don't feel like Stable Diffusion models, they feel like their own thing, a breath of fresh air, which is great! I have been matching them with other models that are good at what they do, and ZootVision has been hitting it out of the park! I guess I can call it my favorite model now, when you compare it to other 3D models it puts them to shame as they don't even have backgrounds for some outputs!

    I like Epsilon better than Zeta because of the default facial expressions, when you don't specify and want to see what it comes up with, Epsilon seems more cheerful and Zeta looks more serious, though it's a matter of taste. I run away from the DreamShapers and ReVAnimated lookalikes because they're too serious on their default expressions.

    ZootVision is basically how I imagined SD1.5v2 to be like and whenever I try a prompt I always think "hmmm, I wonder what ZootVision would do with it?"

    Excellent job!

    ZootAllures9111
    Author
    Jul 5, 2024· 1 reaction

    Thanks for the comment, I really appreciate it! BTW, a comparison image post showing exactly what you mean about the faces between versions would be helpful, if you felt like doing one.

    ZootAllures9111
    Author
    Jul 5, 2024· 1 reaction

    One other thing: as I've always said, this isn't supposed to have any "set in stone" style (such as 3D) at all; that's why I always post extremely different-looking generations directly next to each other in the gallery, lol. You can look at any of my prompts to get ideas on how to control the overall look. TL;DR: it's not supposed to be (and isn't) "just" an anime or realistic or semirealistic model.

    People seem to have difficulty understanding that it is in fact possible, and not even hard, to make a model do all of those things at once as long as you tag your images properly. Or at least that's how it seems.

    ZenythJul 16, 2024

    @diffusionfanatic1173 Thanks! Yeah, I can't show you what I mean because civitai flags my images, but it's a simple prompt like "Cartoon pretty cute young girl 'doing something', 'with some background'", where ZootVisionEpsilon gives me cheerful-looking outputs that I love a lot, while ZootVisionZeta produces more serious-looking expressions or open mouths. I guess it makes sense to have neutral expressions as the default, and of course I can always modify the prompts to make them cheerful, but Epsilon gives me what I want already. It reminds me of the default cartoon style of Dall-E 3, where everyone looks high on sugar, lol! It's definitely a much better model than BetterPony, though I haven't explored the score_9 tag much.

    Do you have plans for an Eta version? There have been recent models released and I would love to see their styles implemented in ZootVision, like "by AuroraFlow", "by Kolors", or "by PixartSigma" or even "by Playground", you have proven a single model can do it all and the way SD1.5 based models do styles is still my favorite, so I'm at the edge of my seat about your work!

    ZootAllures9111
    Author
    Jul 7, 2024· 1 reaction

    I've become somewhat frustrated at the fact that people seemingly don't believe that ALL of the images I post are from this model with no Loras. Like, yes, you can make a model that does all of those styles; it's not hard. I dunno how many photorealistic gens posted directly beside anime gens it's gonna take to get this through people's heads, lol. I also kind of regret doing anything with Pony 1.5, as I feel it's (for no good reason) detracting attention from this. Like, Pony DOES NOT know any "by artistname" booru tags, while this model knows thousands of them, and so on and so on. Anyways, rant over.

    contrarianAug 18, 2024· 1 reaction

    SD can do anything if you tag the training images right. Pony tagged them poorly and ended up needing a bunch of stupid score_* tags in every prompt to even make decent pictures. You tagged them well, and your model ended up with the best prompt adherence I've seen from any SD 1.5 model, not to mention a unique ability to master many styles at once. Truly outstanding work, and in a fair world zoot would be a much bigger name than pony. But here we are... shit floats to the top, while gold nuggets get flushed down the drain. It never ceases to amaze me how consistently popularity contests fail to reward excellence! If humanity is going to move forward we need to reinvent some kind of meritocratic system where people who can tell the difference between a soggy McDonalds burger and an expertly prepared filet mignon get to write the restaurant reviews!

    Anyway, rant over, just wanted to give a thumbs up, you're doing godly work man!

    ZootAllures9111
    Author
    Aug 19, 2024

    @contrarian Thanks. I do feel I might have been a bit harsh when I made this comment just out of frustration, but I appreciate your sentiment nonetheless. Will be updating BP 1.5 pretty soon also.

    ZootAllures9111
    Author
    Jul 15, 2024· 1 reaction

    I've just uploaded a zip file (as "training data") containing a CSV with every artist tag known to the model that had at least 100 appearances in the baseline dataset the original V1 Alpha was built on. It's ordered from least to most appearances, with the number beside each tag. Note this isn't an exhaustive list by any means; there are actually thousands more weaker tags with under 100 appearances.

    You'll want to write the tags exactly as they appear in the file, that is, with spaces not underscores, and correctly escaped round brackets.

    Also, you might want to NOT use "masterpiece" in your positive when using artist styles - it is likely to overpower them.
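    For what it's worth, the tag-formatting rules described in that comment (spaces instead of underscores, escaped round brackets) can be sketched as a tiny helper. The function name and the sample tag are my own illustration, not part of the actual CSV:

    ```python
    import re

    def format_artist_tag(raw: str) -> str:
        """Convert a raw booru-style tag into the prompt format described
        above: underscores become spaces, and round brackets are escaped
        so the UI doesn't treat them as attention syntax."""
        tag = raw.replace("_", " ")
        # Escape any ( or ) that isn't already preceded by a backslash.
        return re.sub(r"(?<!\\)([()])", r"\\\1", tag)

    print(format_artist_tag("photo_(medium)"))  # photo \(medium\)
    ```

    Note that the description's V4.0 Delta notes use exactly this form, photo \(medium\), so already-escaped tags should pass through unchanged.
    
    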

    ZootAllures9111
    Author
    Jul 21, 2024· 1 reaction

    Poll: do people actually care about "futa" enough for me to delay V7 for a little bit to enhance it? Full disclosure, it's not something I've ever given much thought to. Like, I dunno, I'm 32 and I don't remember it being that much of a thing even ~10 to 15 years ago, lol.

    boodilybooJul 29, 2024

    My 2c... yuri related tags, poses, especially all configurations of laying down and upside down have wider appeal and make the model stronger in general? Plus training is less likely to make it draw sausages in all the wrong places!

    ZootAllures9111
    Author
    Jul 29, 2024

    @boodilyboo is there like anything in particular you feel is lacking ATM? And are you talking about 2d or realistic gens, cause the NSFW stuff works for both (at least mostly)

    Cum_MiserAug 10, 2024

    It's kinda popular, but most futa checkpoints are usually too specialized and very rigid (a one-trick-pony situation).

    ZootAllures9111
    Author
    Aug 11, 2024

    @Cum_Miser I've trained v7 already without doing any more Futa, but I did kinda experiment with putting together a small dataset for a future version that should help it a bit without breaking prompt adherence for other stuff

    Checkpoint
    SD 1.5

    Details

    Downloads
    273
    Platform
    CivitAI
    Platform Status
    Available
    Created
    7/1/2024
    Updated
    5/12/2026
    Deleted
    -

    Available On (1 platform)

    Same model published on other platforms. May have additional downloads or version variants.