CivArchive
    Preview 55254860
    Preview 55254874
    Preview 55254866
    Preview 55254870
    Preview 55254872
    Preview 55254873
    Preview 55254869
    Preview 55254871
    Preview 55254956
    Preview 55254952
    Preview 55254958
    Preview 55254953
    Preview 55254960
    Preview 55254955
    Preview 55254954
    Preview 55254959
    Preview 55255010
    Preview 55255012
    Preview 55255007
    Preview 55255009

    In depth retraining of Illustrious to achieve best prompt adherence, knowledge and state of the art performance.

    Big dreams come true

    The version number is just an index of current final release, not a fraction of the planned training.

    HF repo

    Large scale finetune using gpu cluster with a dataset of ~13M pictures (~4M with natural text captions)

    • Fresh and wast knowledge about characters, concepts, styles, cultural and related things

    • The best prompt adherence among SDXL anime models at the moment of release

    • Solved main problems with tags bleeding and biases, common for Illustrious, NoobAi and other checkpoints

    • Excellent aesthetics and knowledge across a wide range of styles (over 50,000 artists (examples), including hundreds of unique cherry-picked datasets from private galleries, including those received from the artists themselves)

    • High flexibility and variety without stability tradeoff

    • No more annoying watermarks for popular styles thanks to clean dataset

    • Vibrant colors and smooth gradients without trace of burning, full range even with epsilon

    • Pure training from Illustrious v0.1 without involving third-party checkpoints, Loras, tweakers, etc.

    There are also some issues and changes compared to the previous version, please RTFM.

    Dataset cut-off - end of April 2025.

    Features and prompting:

    Important change:

    When you are prompting artist styles, especially mixing several, their tags MUST BE in a separate CLIP chunk. Just add BREAK after it (for A1111 and derivatives), use conditioning concat node (for Comfy) or at least put them in the very end. Otherwise, significant degradation of results is likely.

    Basic:

    The checkpoint works both with short-simple and long-complex prompts. However, if there are contradictory or weird things - unlike with others they won't be ignored affecting the output. No guide-rails, no safeguards, no lobotomy.

    Just prompt what you want to see and don't prompt what shouldn't be on the picture. If you want to have a view from above - don't put ceiling into positive, if you want to have crop view with head out of frame - don't make detailed description of character facial features, and so on. Pretty simple but sometimes people are missing it.

    Version 0.8 comes with advanced understanding of natural text prompts. It doesn't mean that you are obligated to use it, tags only - completely fine, especially because understanding of tags combinations is also improved.

    Do not expect it to perform like Flux or other models based on T5 or LLM text encoders. The whole size ot SDXL checkpoint is less then only that text encoder, in addition illustrious-v0.1 which is used as the base completely forgot a lot of general things from vanilla sdxl-base.

    However, even in current state it works much better, allows to do new things usually impossible without external guidance, as well making manual editing, inpainting, etc more convenient.

    To achieve best performance you should keep track of CLIP chunks. In SDXL the prompt is separated into a chunks of 75 (77 including BOS and EOS) tokens, that are processing by CLIP separately, and only then are concatinating and comes as conditions to unet.

    If you want to specify some features for character/object and separate them from other prompt parts - make sure they are in the same chunk and optionally separate it with BREAK. It will not solve problem of traits mixing completely, but can reduce it improving overall understanding, since text encoders on RouWei are able to process the whole sequence, not individual concepts better then others.

    Dataset contains only booru-style tags and natural text expressions. Despite having a share of furries, real life photos, western media, etc. all captions have been converted to classic booru style to avoid a number of problems from mixing of different systems. So e621 tags won't be understanded properly.

    Sampling parameters:

    • ~1 megapixel for txt2img, any AR with resolution multiple of 32 (1024x1024, 1056x, 1152x, 1216x832,...). Euler_a, 20..28steps.

    • CFG: for epsilon version 4..9 (7 is best), for vpred version, 3..5

    • Sigmas multiply may improve results a bit, CFG++ samplers work fine. LCM/PCM/DMD/... and exotic samplers untested.

    • Some schedulers doesn't work well.

    • Highresfix - x1.5 latent + denoise 0.6 or any gan + denoise 0.3..0.55.

    • For vpred version lower CFG 3..5 is needed!

    For vpred version lower CFG 3..5 is needed!

    Quality classification:

    Only 4 quality tags:

    masterpiece, best quality

    for positive and

    low quality, worst quality

    for negative.

    Nothing else. Actually you can even omit positive and reduce negative to low quality only, since they can affect basic style and composition.

    Meta tags like lowres have been removed and don't work, better not to use them. Low resolution images have been either removed or upscaled and cleaned with DAT depending on their importance.

    Negative prompt:

    worst quality, low quality, watermark

    That's all, no need of "rusty trombone", "farting on prey" and others. Do not put tags like greyscale, monochrome in negative unless you understand what are you doing. Extra tags for brightness/colors/contrast section below can be used

    Artist styles:

    Grids with examples, list/wildcard (also can be found in "training data").

    Used with "by " it's mandatory. It will not work properly without it.

    "by " is a meta-token for styles to avoid mixing/misinterpret with tags/characters of similar or close name. This allows to have a better results for styles and at the same time avoid random style fluctuation that you may observe in other checkpoints.

    Multiple give very interesting results, can be controlled with prompt weights and spells.

    YOU MUST ADD BREAK after artists/style tags (for A1111) or concat conditioning (for Comfy) or put them in the very end of your prompt.

    For example:

    by kantoku, by wlop, best quality, masterpiece BREAK 1girl, ...

    General styles:

    2.5d, anime screencap, bold line, sketch, cgi, digital painting, flat colors, smooth shading, minimalistic, ink style, oil style, pastel style

    Booru tags styles:

    1950s (style), 1960s (style), 1970s (style), 1980s (style), 1990s (style), 2000s (style), animification, art nouveau, pinup (style), toon (style), western comics (style), nihonga, shikishi, minimalism, fine art parody

    and everything from this group.

    Can be used in combinations (with artists too), with weights, both in positive and negative prompts.

    Characters:

    Use full name booru tag and proper formatting, like karin_(blue_archive) -> karin \(blue archive\), use skin tags for better reproducing, like karin \(bunny\) \(blue archive\). Autocomplete extension might be very useful.

    Most characters are recognized just by their booru tag, but it will be more accurate if you describe their basic traits. Here you can easily redress your waifu/husbendo just by the prompt without suffering from the typical leaks of basic features.

    Natural text:

    Use it in combination with booru tags, works great. Use only natural text after typing styles and quality tags. Use just booru tags and forget about it, it's all up to you. To get best performance keep track if CLIP 75 tokens chunks.

    About 4M of images in dataset had hybrid natural-text captions, made by Claude, GPT, Gemini, ToriiGate, then refactored, cleaned and combined with tags in different variations for augmentation.

    Unlike typical captions, these contains character names which is very useful. Better to keep it clean, short and convenient description works best. Better not use long and sloppy BS like

    A mysteriously enchanting feminine entity of indeterminate yet youthful essence, whose celestial visage radiates with the ethereal luminescence of a thousand dying stars, blessed with locks cascading like the golden rivers of ancient mythology, perhaps styled in a manner reminiscent of contemporary fashion trends though not necessarily adhering to any specific aesthetic paradigm. Her eyes, pools of unfathomable depth and hue, sparkle with the wisdom of millennia yet maintain an innocent quality that defies temporal constraints...

    For captioning you can use ToriiGate in short mode.

    And don't expect it to be as good as flux and others, it tries very hard and after several rolls usually you can get what you want, but it is not that stable and detailed.

    Oh yeah

    tail censor, holding own tail, hugging own tail, holding another's tail, tail grab, tail raised, tail down, ears down, hand on own ear, tail around own leg, tail around penis, tailjob, tail through clothes, tail under clothes, lifted by tail, tail biting, tail penetration (including a specific indication of vaginal/anal), tail masturbation, holding with tail, panties on tail, bra on tail, tail focus, presenting own tail...

    (booru meaning, not e621) and many others with natural text. The majority works perfectly, some requires a lot of rolling.

    Brightness/colors/contrast:

    You can use extra meta tags to control it:

    low brightness, high brightness, low saturation, high saturation, low gamma, high gamma, sharp colors, soft colors, hdr, sdr

    Example

    They work both in epsilon and vpred version and works really good.

    Epsilon version relies on them too much. Without low brightness or low gamma or limited range (in negative) it might be difficult to achieve true 0,0,0 black, the same often true for white.

    Both epsilon and vpred versions have like true zsnr, full range of colors and brightness without common flaws observed. But they behaves differently, just try it.

    Vpred version

    Main thing you need to know - lower your CFG from 7 down to 5 (or less). Otherwise, the use is similar with advantages.

    It seems that starting from v0.7 vpred works flawlessly now. It shouldn't suffer from ignorance of tags close to the 75tokens chunk borders like nai. It is more difficult to get burned images - even on cfg7 usually it just over-saturated but with smooth gradients, which can be useful for some styles. Yes it can make anything from (0,0,0) to (255,255,255). You will find brightness meta tags described above quite useful for easier/lazy prompting, natural text expressions also work. To get the most dark image - put high brightness into negative and/or use low brightness, low gamma tags. If you don't like very bright skin on dark background and want to reduce contrast (or on the contrary, enhance the effect) - use hdr/sdr in negative/positive.

    It was reported that in rare cases on some prompts there is a drop in contrast. Looks like other vpred models have same behaviour with such prompts, adding a "separator" closer to the border of the 75-token chunk fixes this. However, with 0.7 I haven't encountered this myself.

    To launch vpred version you will need dev build of A1111, Comfy (with special loader node), Forge or Reforge. Just use same parameters (Euler a, cfg 3..5, 20..28 steps) like epsilon. No need to use Cfg rescale, but you can try it, cfg++ works great.

    Base model:

    The model here has a small unet polishint after main training to improve small details, bump up resolution and others. Hovewer, you may be also interested into a RouWei-Base, which sometimes can perform better at complex prompts despite having minor mistakes in small details. It also comes in FP32, for example if you want to use fp32 text encoder nodes in Comfy, merge it or finetune.

    It can be found in Huggingface repo

    Known issues:

    Off course there are:

    • Artists and style tags must be seperated into a different chunk from main prompt or come very last

    • There may be some positional or combinational bias in rare cases, but it's not yet clear.

    • There are some complaints about few of the general styles.

    • Epsilon version relies too much on brightness meta tags, sometimes you will need to use them to get desired brightness shift

    • Some newly added styles/characters might be not as good and disctinct as they deserve to

    • To be discovered

    Requests for artists/characters in future models are open. If you find artist/character/concept that perform weak, inaccurate or has strong watermark - please report, will add them explicitly. Follow for a new versions.

    JOIN THE DISCORD SERVER

    License:

    Same as illustrious. Fell free to use in your merges, finetunes, ets. but please leave a link or mention, it is mandatory

    How it's made

    I'll consider to make a report or something like it later. For sure.

    In short, 98% of work is related to dataset preparations. Instead of blindly relying on loss-weighting based on tag frequency from nai paper, a custom guided loss-weighting implementation along with asynchronous collator for balancing have been used. Ztsnr (or close to it) with Epsilon prediction was achieved using noise scheduler augmentation.

    Spent compute - over 8k hours of H100 (apart from research and fail attempts)

    Thanks:

    First of all I'd like to acknowledge everyone who supports open source, develops in improves code. Thanks to the authors of illustrious for releasing model, thank to NoobAI team for being pioneers in open finetuning of such a scale, sharing experience, raising and solving issues that previously went unnoticed.

    Personal:

    Artists wish to remain anonymous for sharing private works; Few anonymous persons - donations, code, captions, etc., Soviet Cat - GPU sponsoring; Sv1. - llm access, captioning, code; K. - training code; Bakariso - datasets, testing, advices, insides; NeuroSenko - donations, testing, code; LOL2024 - a lot of unique datasets; T.,[] - datasets, testing, advises; rred, dga, Fi., ello - donations; TekeshiX - datasets. And other fellow brothers that helped. Love you so much ❤️.

    And off course everyone who made feedback and requests, it's really valuable.

    If I forgot to mention anyone, please notify.

    Donations

    If you want to support - share my models, leave feedback, make a cute picture with kemonomimi-girl. And of course, support original artists.

    AI is my hobby, I'm spending money on it and not begging for donations. However, it has turned into a large-scale and expensive undertaking. Consider to support to accelerate new training and researches.

    (Just keep in mind that I can waste it on alcohol or cosplay girls)

    BTC: bc1qwv83ggq8rvv07uk6dv4njs0j3yygj3aax4wg6c

    ETH/USDT(e): 0x04C8a749F49aE8a56CB84cF0C99CD9E92eDB17db

    XMR: 47F7JAyKP8tMBtzwxpoZsUVB8wzg2VrbtDKBice9FAS1FikbHEXXPof4PAb42CQ5ch8p8Hs4RvJuzPHDtaVSdQzD6ZbA5TZ

    if you can offer gpu-time (a100+) - PM.

    Description

    Vpred version

    FAQ

    Comments (83)

    IJDEIHFeb 2, 2025· 4 reactions
    CivitAI

    I think this model is underrated. Easier to use than noob, better prompt recognition in complex scenarios as well. Love this, thank you!

    Edit: This is by far the best model that supports the distinctive style of danbooru artists. This is amazingly good; I tested 2000 danbooru artists and compared them with their style.

    Sailor_LunaFeb 6, 2025

    Have you really tested 2K styles or is it a figure of speech?) It requires a lot of time to make such a comparison

    IJDEIHFeb 7, 2025· 2 reactions

    @Sailor_Luna Yes, I literally generated the exact 1999 artist style of Danbooru artists (artists tag page 1 to 100, excluding banned_artist) to test it. And yes, it took a long time.

    Dewal76Feb 2, 2025· 1 reaction
    CivitAI

    the model have forgotten a lot of details specifically the clothes of some characters. This model is going far away from the base Illustrious model. some of my lora are not even working on it.

    Minthybasis
    Author
    Feb 2, 2025

    Could you please list that characters and share some gens? As for loras compatibility - depends, with such a big training this is inevitable.

    Dewal76Feb 3, 2025

    @Minthybasis v0.61 and 0.7 model and black heart from neptunia series and vajra from granblue fantasy. I use WAI Illustrious and you don't normally need to prompt a lot to get the clothes of a character. writing the name and series name is enough. I tried to use a lora to get the clothes more accurate, but the model is way different.

    Minthybasis
    Author
    Feb 3, 2025· 1 reaction

    @Dewal76 Well, prompting at least like 2-4 basic tags for clothing is usually necessary if character is not very popular and model is not overfitted. Overreacting to a single tags (like it is in base illustrious and the majority of aesthetic finetunes) is beneficial and convenient for lazy prompting, but causes a lot of issues and problems when you want to make something different from "default view"/"completely nude", or other unwanted leaks.

    It's a pity if there are problems with these characters, I'll try to check a little later. Thank you for pointing.

    GiangiottoFeb 3, 2025· 5 reactions
    CivitAI

    Feels unstable (0.7 EPS). Using it as a personal merge component.

    When it works I'd say it's superior to Noob. It sticks to the prompt more and the Natural Language helps.

    However it fails to produce coherent outputs more often than Noob. The anatomy either works or it REALLY doesn't.

    Even with simple prompts with nothing conflicting. It can take a lot of restructuring to get just a fraction of usable outputs while Noob will nail the composition 9 times out of 10.

    Minthybasis
    Author
    Feb 3, 2025

    Thank you!

    Could you please somehow share some gens/prompts where anatomy is consistently bad for investigation?

    GiangiottoFeb 3, 2025· 1 reaction

    Here's an example with a simple prompt, using variations of the T-ntrnai3 model by Tonade. These are just raw outputs, so small details don't matter. These are grids but they contain the metadata.
    1- The original model
    https://files.catbox.moe/78ddpe.jpg
    2- Using Noob instead of Illustrious as the base - A couple bad outputs but the rest are stable
    https://files.catbox.moe/u8c4hm.jpg
    3- Using Rouwei 0.7 EPS as the base - It doesn't use the full space of the image for some reason. Anatomy didn't go crazy on these, that usually happened more on 2 character gens.

    https://files.catbox.moe/6rtjzh.jpg
    4- Just base RouWei - Bad Anatomy and 2/8 are completely failed gens
    https://files.catbox.moe/gta3yi.jpg
    The tntr / RouWei mix was obtained by subtracting Illustrious from tntr and adding RouWei in its place with SD-Mecha, using the untouched Text Encoder from RouWei on the result.
    I've made more complex merges that work better, but they're still less stable than the Illustrious or Noob ones.
    Maybe I'm still prompting wrong for RouWei? Since the gens on the base model were the worst in my examples.
    --- Edit
    I tried a gen with no extensions and just basic Euler/Karras and it's noticeably better so maybe this model is just really sensitive to combinations that work well for Illust/Noob?
    https://files.catbox.moe/mmptqz.jpg
    And with a mixed model, still noticeably better
    https://files.catbox.moe/iczvsh.jpg
    Sooo.... Might have been on me, I'll play around with extensions and the samplers more.

    GiangiottoFeb 3, 2025

    Doing some more testing with Euler a / Karras. Yeah I think that was the main issue. It really doesn't like the Sampler I was using.
    Here's a 2 girl test with a merged model using RouWei as the base, much much better.

    https://files.catbox.moe/ina645.jpg

    Minthybasis
    Author
    Feb 3, 2025· 2 reactions

    @Giangiotto Oh sorry, I should have write in description that karras scheduler and some samplers doesn't work here due to changes, like for previous version. Euler a works fine, cfg++ also giving good results.

    GiangiottoFeb 4, 2025· 1 reaction

    @Minthybasis Maybe I wasn't clear, Karras works (for me at least). It was the "Euler SMEA dy" sampler that caused the bad anatomy. Switching that to "Euler" or "Euler a" solved pretty much all the issues I had, anatomy is at least as good as Noob with them.

    DevilSShadoWFeb 4, 2025· 2 reactions

    @Giangiotto was using the same sampler (and variations of) at the start and I can confirm I was getting garbled messes (and had problems when upscaling to 2x because image contrast would change drastically). I also have issues with the Karras scheduler. Have switched to Euler a/SGM Uniform (as is generally recommended for illustrious-adjacent models) and have had no problems with composition integrity and in the rare cases where my prompts are a bit overboard and I start getting deformed monstrosities, I switch PAG on and that fixes everything.

    Sensei6666Feb 15, 2025
    CivitAI

    ?为什么在WEDui上出的全是噪点图,是我漏了什么重要信息吗

    Why are all the noise graphs on WEDui? Am I missing something important?

    this is a data from my sample

    Positive Prompt

    by amakusa \(hidorozoa\), masterpiece, best quality, 2girls, kurokami fubuki, fox girl with black hair and red eyes, sitting in armchair and reading a book, she is happy and calm, her leg are spread revealing striped panties, shirakami fubuki, fox girl, white hair, aqua eyes, wearing black underwear, she looks into the book and is very surprised, indoors, living room

    反向提示词

    worst quality, low quality, watermark

    迭代步数step: 23

    采样方法: Euler

    调度类型 (Schedule type):Automatic

    提示词引导系数 (CFG Scale):4.5

    随机数种子 (Seed):3296871862

    面部修复:CodeFormer

    尺寸:1016x1024

    model hash:66076a003a

    模型: rouwei finetune of Illustrious_v07Vpred

    VAE hash: 63aeecb90f

    VAE: sdxlVAE_sdxlVAE.safetensors

    CLIP 终止层数: 2

    版本: v1.10.1

    Minthybasis
    Author
    Feb 15, 2025· 1 reaction

    If you're getting noise blobs with vpred model instead of normal images - you need to update the software. A1111 webui requires dev branch, others like forge, reforge, comfy, etc. - just update to latest version.

    Also clip skip 2 is not needed, 1 is fine, and it is better to have a resolution multiple of 32 (like 1024x1024 instead of 1016x1024).

    Sensei6666Feb 16, 2025· 1 reaction

    @MinthybasisI get it, thankyou! 谢谢你!

    UberlordFeb 21, 2025· 1 reaction
    CivitAI

    I think this model can be considered SOTA for prompt following for the SDXL architecture as of 2025-02-21.

    The only issue I have found so far is that it seems to have problems with `dark-skinned female` for some reason. Maybe bad tagging quality in danbooru or the dataset?

    Minthybasis
    Author
    Feb 22, 2025

    Thank you.

    Yes, for dark-skinned female there are a lot of pictures with actually lighter skin. Extra tags and weights usually help.

    Also used style affects it a lot, with some it can be difficult with some while with other - pretty good results achievable with no effort. Marked for improvement in future versions.

    UberlordFeb 24, 2025· 1 reaction

    @Minthybasis I appreciate your response, that helps to better understand and work with your model.

    Thank you and looking forward to the next version!

    alternative_UniverseFeb 22, 2025
    CivitAI

    Is there a way to avoid the same problems of noobai? I'm getting a lot of saturation or dark images

    Minthybasis
    Author
    Feb 22, 2025

    Basically - lower cfg, 3..5 works fine. Aso use tags low saturation on positive and high saturation in negative can solve almost anything.

    Usually it's quite difficult to burn it like noob, could you please upload some pictures with metadata?

    @Minthybasis sure ,I will be uploaded some of them,I just copy the info in your images,but I will be uploading some

    @Minthybasis done,all at 4.8 cgd Wich was the one you used

    Minthybasis
    Author
    Feb 25, 2025· 1 reaction

    @P_Universe Hm, for your images using tags low saturation/high saturation can help https://files.catbox.moe/mtdojl.png .

    But it looks like the main source of that behavior is conflicting prompts. You usinghigh brightness in negative to make image darker, but at the same time dark, darker,dark theme, darkness,black theme, dark background are also in negative. So, the model makes overall brightness low, but tries to avoid black scenes making it dimmed and highly saturated.

    @Minthybasis first thank you for taking the time to do the analysis

    Second,damn that's embarrassing lol, yeah the images of the left are all good

    Could you take a look at the rest?I post some more with the same issue

    Again thank you, can't wait for more uptades

    Minthybasis
    Author
    Feb 26, 2025

    @P_Universe As I can see, other pictures share similar prompt parts. Actually it can be used on purpose to get saturated and dimmed pictures, some people like it, but to avoid it it's better to change prompt a bit or add extra tags.

    Or you're talking about some exact images I've missed?

    @Minthybasis well I uploaded multiple images, maybe the same problem with all of them, I will take out every saturation stuff in the prompt and upload again new images

    By the way beside Euler a , what scheduler you recommend? ,sgm,beta?

    Minthybasis
    Author
    Feb 27, 2025· 1 reaction

    @P_Universe At first look - yes, same, but I don't know what exactly you wanted to get, may be just adding meta tags for saturation will be fine, or refactoring of whole prompt needed. Actually the model should be capable of generating very dark and atmospheric stuff.

    For scheduler - default auto (normal probably?) is fine, but you can try different (except karras and beta) and choose any you like.

    @Minthybasis much appreciated,will test now

    moon_rabbitFeb 22, 2025
    CivitAI

    Have a fresh install of webui. I can't manage to get anything but random blocks of colors with noise when I hit generate; All parameters are default. I'm using the dev branch as you suggested to another use but it didn't fixed it. I'm using the prompt from your image: https://civitai.com/images/55254866
    Is there anything else I need to install for it to run properly? I still didn't install ControlNet. Tried running the model on KritaAI but it crashed. Thanks in advance!

    Minthybasis
    Author
    Feb 22, 2025· 2 reactions

    To use vpred models you need a dev branch of a1111. Type git switch dev in terminal from weubui folder. With other ui like forge/reforge/comfy it runs out of box.

    moon_rabbitFeb 23, 2025

    Thank you, it's working now!

    Shio_NFeb 25, 2025· 1 reaction
    CivitAI

    It's literally the best IL model in terms of creativity... Not sure how you did it. Good work.

    CuauhtemocI5MALFeb 26, 2025· 1 reaction
    CivitAI

    One thing I like of NoobAI is the fact I can use e621 and danbooru tags.

    Including that feature in a future version of this model is going to be mandatory if someday you have intentions of made this model a worthy competitor to Noob and Illustrious.

    Minthybasis
    Author
    Feb 27, 2025

    That's interesting, what 621 tags exactly do you use?

    Strait mixing of two different tagging system in dataset without special approach can cause several issues. People claiming that adding of e621 in dataset of NoobAI gives a negative impact for overall aesthetic, general behaviour and other things. Based on my experience, I think that this statement is close to the truth, some experiments with mixing are clearly showing numerous issues. A careful approach is needed here.

    I'm investigating the best way to introduce e621 data and related concepts (because it obviously will be beneficial) without side effects. Likely there will be some in future versions, but (at least for now) primary purpose of RouWei - anime, not furry.

    As for worthy competitor - sounds quite strange, anyway it's not like we have a lot of real large finetunes for now.

    CuauhtemocI5MALFeb 28, 2025· 1 reaction

    @Minthybasis "Featureless_Crotch" because I mainly use AI to generate characters (Mainly Males) without genitals, and "Featureless_Chest" because sometimes I fell like generate men without nipples.

    If you don't want to add those tags to the dataset, in Danbooru and Gelbooru exists the tags "No_Penis" (Male crotch without genitals), "No_Pussy" (Female crotch without genitals) and "No_Nipples".

    Minthybasis
    Author
    Feb 28, 2025

    @CuauhtemocI5MAL Thank you, these are not popular but important tags.

    swanMar 3, 2025· 1 reaction
    CivitAI

    Oh, I just found out that this is the model of the creator of 4th tail after coming here. wow :) btw this model has such great body proportions, so I’m merging it with other models and it's very nice with it. Looking forward to more updates in later.

    RJeezz_Mar 9, 2025· 1 reaction
    CivitAI

    first vpred model i've really enjoyed using, still the occasional baked looking image issue ive had with others (usually a weights or cfg thing? not really sure), but its like 1 in 10, which i don't mind at all

    WucanMar 20, 2025· 1 reaction
    CivitAI

    Hey there, I've been using this checkpoint and really liking it,

    One weird problem I've run into, though, is that my gens often have the character's forehead completely white, as if being flashed on by a very powerful light source. Not sure if that's part of the lightning problem with the epsilon model. How do I troubleshoot it?

    Minthybasis
    Author
    Mar 20, 2025

    Thank you. It may be because of general style (going to fix it in next version), bias from some parts of the prompt or something else. Could you please upload some examples with gen metadata on catbox or anywhere else?

    WucanMar 20, 2025· 1 reaction

    @Minthybasis Thanks for looking into this! Seems like it's an issue with IL models, I noticed it even in showcase images for WAI ILXL. Examples: https://civitai.com/images/61479828, https://civitai.com/images/61485961, https://civitai.com/images/61479846

    IJDEIHMar 24, 2025· 1 reaction
    CivitAI

    This is an appreciation comment for this model, since I really like how much Rouwei trained in various, balanced ways.

    I'm pretty sure in the base model (that hasn't merged), this is the best natural language recognition ILXL model, including writing texts in the image.

    So when I merge models, I really love to use Rouwei since it boosts natural language prompts a lot compared to noob ai or illustrious alone.

    One downside of the vpred 0.7 version is its weakness in lighting, especially with a pitch-black theme (even if the sample image shows the black), compared to noob ai. Besides that, it's very good, solid, and my favorite.

    VecthralMar 25, 2025· 2 reactions
    CivitAI

    Great model, currently my most used. Clear meta tag planning, and what impresses me most is: everyone should learn how you write page descriptions.

    frozensquareMar 26, 2025· 3 reactions
    CivitAI

    model is very flexible and knows a lot of things. However, I find it frequently struggling with hands, even more so than other models. Might be as a result of being trained on too many different kinds of hand? Will try a bunch of hand LoRAs but it's looking pretty bad

    frozensquareMar 26, 2025

    After some brief testing, the illustrous stabilizer seems to work decently. I still need to figure out which version works better, the 1.23 version seems to have a bit less bleed than the 1.72 one, which is confusing. I'm also trying out a hand lora in conjunction with that but it's a bit too strong for the whole image and too weak for the hand itself.

    frozensquareMar 26, 2025

    Looking at some of the other images posted here, seems like it's just very influenced by style. Maybe I grabbed a few artists who did hands really weirdly. I'll eventually try on more artists.

    Minthybasis
    Author
    Mar 27, 2025

    This is a base model with a very broad dataset coverage, but no extra alignment except very small polishing. It as supposed to be influenced by the artist/general styles, just like you mentioned.

    Loras, tweakers and other things will also help, general purpose from illustrious mostly works great. Also there are merges based on rouwei that improve some aspects.

    QH96Mar 28, 2025· 9 reactions
    CivitAI

    This is really cool. I'm surprised I hadn't heard of it before. I newly discovered it because of the new WAI merge. I thank you for your hard work.

    kiki123Apr 1, 2025· 2 reactions
    CivitAI

    Hello, by ringeko-chan there is a watermark

    Minthybasis
    Author
    Apr 2, 2025

    One of the most difficult and annoying, to be fixed in next.

    kiki123Apr 2, 2025

    @Minthybasis respect!

    Tozi_WhiteApr 2, 2025· 1 reaction
    CivitAI

    Is there a list of all supported characters with all new somewhere?

    Minthybasis
    Author
    Apr 2, 2025· 2 reactions

    Here is a list of characters in dataset for v0.7 with a sufficient for learning count. But the result depends not only on quantity (even with loss-scaling), so no way absolutely all of them will work well.

    VolnovikApr 6, 2025· 2 reactions
    CivitAI

    Well, waiting for 1.0 v-pred release to build a detailer specifically finetuned for it.

    Also, please update description, ForgeUI arguably works better with v-pred, I have guides specifically for it.

    Minthybasis
    Author
    Apr 7, 2025

    Thank you. Next likely will be 0.8 (1.0 only in case of significant changes). Version here is not a fraction of the total progress or somth., just sequential numbers associated with datasets and approaches.

    VolnovikApr 7, 2025

    @Minthybasis you should explain it properly, otherwise people will just sleep on it. People were tired with Noob releases, then with illustrious stream of underbaked finetunes and tend to avoid stuff like that.

    Minthybasis
    Author
    Apr 7, 2025· 1 reaction

    @Volnovik Yes, indeed. I have been training different sdxl models for over a year and the first releases started with v0.1, gradually updating. Apparently, additional description should be added.

    bl4ckfuture107May 6, 2025

    Forge doesn't "arguably" work better, there is no proof of that, and your guides are heavily biased by loras, so they are not reliable.

    VolnovikMay 6, 2025

    @bl4ckfuture107 huh? what loras? I used one at 0.6 maximum. And dropped it now. I have comfy workflows guide with full explanation of what and why. I fixed f-ing coloration of noob vpred and will release this lora soon. I have all the proof and you dont even have a profile picture.

    DEgaurdianApr 7, 2025· 3 reactions
    CivitAI

    A lot fox-girls in the examples.

    Based.

    GimmeramApr 7, 2025· 1 reaction
    CivitAI

    I want to report bluethebone style, always makes the watermark

    OneRingApr 11, 2025· 1 reaction

    and also often adds subtitles, it's just that all his images are like that, once for 1.5 I made a model and cut out all the subtitles and watermarks from all the images, it was torture xd

    randomotakuApr 11, 2025· 2 reactions
    CivitAI

    The (less popular) epred version is somehow surprisingly inventive in the way it tries to incorporate the whole prompt. Anyone using a two step process with upscale - I highly recommend this checkpoint for your first pass.

    stablydiffusingApr 12, 2025· 1 reaction
    CivitAI

    Incredibly good model. Thanks for your hard work and detailed explanation.

    Really nailed the V pred imo. Amazing lighting controls. Very flexible model that follows prompts extremely well. Better than others I've tried.

    Was having some issues with hands and eyes though. They're good but just not as clear/sharp/crisp as I'd like. Maybe I didn't nail the settings yet, but switching to the WAI merge of this model helped me to get that crispness I was looking for.

    Nonetheless amazing work and you put down a very solid base for others to build off of. Thanks again!

    Minthybasis
    Author
    Apr 13, 2025· 1 reaction

    Thank you!

    The checkpoint is a base model that is trained for knowledge, features, behaviour, etc. The goal is versatility and flexibility, no alignment and aesthetic tune was applied - it's all up to users and other authors.

    Fingers and details often depend from used style or combination, also all general ways of picture improvement like tweakers and loras work here. If it's difficult to find approach, or you just don't want to bother with it - merges is a good option.

    Anyway, default style, base performance and other things should be improved in next version.

    DolledUpApr 14, 2025· 3 reactions
    CivitAI

    I know it's best to use the artist tags and create your own style but is there a way to lessen the default style so its more compatible with style loras? Maybe through tags? or is that something that you're looking to change in future versions? I feel it influences lora styles too much compared to other checkpoints.

    Other than that, this is hands down my favourite vpred model, thanks!

    OneRingApr 15, 2025· 3 reactions

    The larger the model (in this case, more than 7 million images), the worse LoRAs trained on data outside the original dataset perform. Ideally, you should find LoRAs specifically trained on this model or increase the LoRA weight—this helps. For example:

    LoRAs trained on Illustrious will have a weak effect,

    LoRAs trained on Noobai will work slightly better,

    LoRAs trained on Rouwei will have the strongest effect.

    Minthybasis
    Author
    Apr 15, 2025· 2 reactions

    You can reduce some base style bias by using less quality tags (like only best quality in positive and worst quality in negative), or removing them from positive and leaving only worst/low in negative. Also, you can try to combine some of embedded styles (not only artists but general like flat colors) with lora.

    Actually, as OneRing said, if the lora you using is made for other model (especially some very different) It can work very weakly, which will cause the basic style to show through. The best case scenario here would be to train on top of the rouwei, there are a number of examples with excellent results, unfortunately this may not always be possible for everyone.

    Yes, in future I'm going to make base style more neutral and flat, but the result can only be assessed once it is ready.

    DolledUpApr 15, 2025

    @Minthybasis Ahhh okay, thank you both for the help, it's appreciated. I'll do some experimenting and see what I can figure out. Looking forward to the next release :)

    OneRingApr 15, 2025· 5 reactions
    CivitAI

    Thank you so much for your hard work! I absolutely love training LoRAs on your models—their overall accuracy and fine details are always unmatched (I still don’t understand why there isn’t a dedicated column on CivitAI for your model, like there is for Illustrious or Noobai).

    I’d like to ask a couple of questions:

    Is version 0.7 v-pred of the model a finely-tuned version? For some reason, it performs better than the base model available on HuggingFace (yes, I’m aware the HF base model is the ‘Epsilon’ version with its own specific settings). Do you happen to have the original base v-pred model for comparison? Or is the model posted here the base version?

    Second question is simpler: When can we expect version 0.8? :3 I’m really looking forward to comparing the versions. Also, roughly how often do you release new models? I understand training takes progressively longer with each iteration, but on average, how much time goes into training such a large model?


    Once again, thank you so much for being a part of the community, it means a lot

    Minthybasis
    Author
    Apr 15, 2025· 1 reaction

    V0.8 - end of April - beginning of May, if everything goes according to plan.

    The training time is not too long, since it is distributed across multiple H100/H200, about a week depending on resources available. What consumes the most time and effort is the dataset preparation. And all related things, including training and implementing extra models for assessment, processing, masking, etc., code development, testing and other.

    Minthybasis
    Author
    Apr 15, 2025· 1 reaction

    About 0.7 vpred - it is trained on top of v0.7 base (epsilon) using a different dataset and some tricks for conversion. So, there are actually no "base" vpred model, it is the same epsilon right after main phase of training. For vpred loras the best choice is just to pick v0.7 vpred.

    Hanekawa01May 17, 2025

    @Minthybasis any info for v0.8? i like v.07 result, looking forward on v.08 :)

    Minthybasis
    Author
    May 17, 2025· 1 reaction

    @Hanekawa01 Baking, over a half is already done. The latest updates on the Discord server.

    bl4ckfuture107Apr 15, 2025· 3 reactions
    CivitAI

    Best model, bar none. Which Scheduler should I use? I'm currently using sgm_uniform.

    Minthybasis
    Author
    Apr 15, 2025· 1 reaction

    Sgm uniform and normal works fine. This is a field for experiments, but some combinations of sampler-scheduler might be broken.

    bl4ckfuture107Apr 25, 2025

    Thanks, sgm_uniforms seems by far the best, with Euler A at least. Can't wait for 0.8! I've read you're going to improve base style?

    QHvI7vwtWDApr 19, 2025· 2 reactions
    CivitAI

    Any instructions on how to use the v_pred model on ComfyUI? I tried using ModelSamplingDiscrete and RescaleCFG but the images still turn out weird.

    Minthybasis
    Author
    Apr 20, 2025

    In modern versions of comfy it should run out of box detecting sampling mode from state dict flags. Just like any other sdxl model.

    Cfg rescale is not mandatory (in case if you're using it to avoid burning and glitches), but also usable.

    lizarbMay 24, 2025
    CivitAI

    any chance of a HiDream fine-tune with this dataset? 🤔

    Minthybasis
    Author
    May 25, 2025

    Well yes... Donations can significantly speed up the development.

    Checkpoint
    Illustrious

    Details

    Downloads
    2,377
    Platform
    CivitAI
    Platform Status
    Available
    Created
    2/2/2025
    Updated
    7/5/2026
    Deleted
    -

    Available On (1 platform)

    Same model published on other platforms. May have additional downloads or version variants.