Kirazuri Lazuli (Noobai V-Pred)
This checkpoint is a full finetune of NoobAI-XL (NAI-XL) V-Pred 1.0-Version, a personal project trained locally on a 4090 with a dataset of 15,069 images, totaling 378,550 steps from the base model over ~370 GPU hours at 3.52s/it.
It focuses on adding knowledge from after the data cutoff of the base model (2024/10/24), including styles, concepts, and characters from anime, video games, and virtual youtubers.
Usage - Important
This model is trained from NoobAI-XL (NAI-XL) V-Pred 1.0-Version, which is a v-prediction model (distinct from eps-prediction) and therefore requires specific parameter configurations.
It is recommended to familiarize yourself with the base model and its usage instructions when using this checkpoint.
The training objective is mostly to extend the base model's knowledge without significantly changing its usage or degrading the existing knowledge.
What follows are my personal settings; your preferred settings for the base model should mostly transfer.
For samplers: Euler for generation, Euler Ancestral for upscaling/inpainting.
(⚠️Other samplers may not work, including some CivitAI defaults such as Karras.)
Previews are generated with a ComfyUI workflow using DynamicThresholdingFull, Upscaling, and FaceDetailer.
DynamicThresholding (CFG-Fix) settings used with a CFG of 10:
dynthres_enabled: True
dynthres_mimic_scale: 7
dynthres_threshold_percentile: 1
dynthres_mimic_mode: Half Cosine Down
dynthres_mimic_scale_min: 1
dynthres_cfg_mode: Half Cosine Down
dynthres_cfg_scale_min: 3
dynthres_sched_val: 1
dynthres_separate_feature_channels: enable
dynthres_scaling_startpoint: ZERO
dynthres_variability_measure: STD
dynthres_interpolate_phi: 1
reForge or Forge should also be usable.
*To be automatically detected as a v-pred model in Forge/reForge, the ztsnr and v_pred keys were added to the model's state dict using this script.
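For reference, a minimal sketch of how such marker keys can be added with the safetensors library (the linked script is the authoritative version; file names are placeholders, and the exact key names should be double-checked against your UI version):

import torch
from safetensors.torch import load_file, save_file

# Load the checkpoint, add empty marker tensors, and save a patched copy.
# Detection is typically based on the keys simply being present in the state dict.
state_dict = load_file("kirazuri_lazuli.safetensors")
for key in ("v_pred", "ztsnr"):
    state_dict.setdefault(key, torch.tensor([]))
save_file(state_dict, "kirazuri_lazuli_patched.safetensors")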
Recommended prompt structure:
Quality modifiers masterpiece, best quality, very aesthetic should be positioned at the end of the prompt.
Artist names can be prefixed with artist: to prevent token bleeding between artist names and concepts.
A1111 schedule prompting syntax is used in ComfyUI through the comfyui-prompt-control extension to combine artist styles, e.g.: artist:[artist1|artist2|artist3]
In some cases Regional Prompting is used with Attention Couple (example).
Positive prompt:
{{characters}}, {{copywrites}}, {{artists}},
{{tags}},
absurdres, masterpiece, best quality, very aesthetic
Training details
The kohya-ss/sd-scripts training configs used can be found on GitHub.
v2.1
This version now includes training on an updated personal aesthetic finetuning dataset (masterpieces), and some recent characters, outfits, and styles:
Dataset cutoff: 2025/07/15
Continued training from v2.0
Training images: 1,004
Regularization images: 314 (Generated from v2.0)
Optimizer: Adafactor
Training precision: Full-fat fp32
Batch size: 4
U-Net LR: 6e-6
TE LR: 2e-6
Epochs: 50
Steps: 25,950
v2.0
This version now has a much better representation of all the characters, concepts, and styles I hoped to train for this checkpoint.
Single training run on the full dataset, expanded with more recent data:
Dataset cutoff: 2025/06/13
Training images: 14,065
Regularization images: 7,056 (Generated from NoobAI-XL (NAI-XL) V-Pred 1.0-Version)
Optimizer: Adafactor
Training precision: Full-fat fp32
Batch size: 4
U-Net LR: 6e-6
TE LR: 2e-6
Epochs: 50
Steps: 352,600 (~344 GPU hours at 3.52s/it)
v1.1
Iterative checkpoint training approach inspired by PixelWave.
This involved training in dataset batches of ~1200 images, for 10 training sessions, before finishing with an 11th aesthetic finetune dataset of 267 images.
Dataset cutoff: 2025/05/25
Adafactor optimizer
Full-fat fp32 training precision
Batch size and LR were adjusted multiple times
Batch size 4, LR 6e-6 seemed most stable
TE trained for the 10th and 11th training sessions at Batch size 4, LR 2e-6
Regularization dataset generated from the 10th checkpoint and used in the final aesthetic training to preserve the previously learned characters (a generation sketch follows below)
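For illustration only, a minimal diffusers-based sketch of bulk-generating regularization images from existing caption .txt files (not necessarily the exact workflow used here; paths, step count, and CFG are placeholders):

from pathlib import Path
import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler

# Load the previous checkpoint and switch the scheduler to v-prediction with zero terminal SNR.
pipe = StableDiffusionXLPipeline.from_single_file(
    "previous_checkpoint.safetensors", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = EulerDiscreteScheduler.from_config(
    pipe.scheduler.config, prediction_type="v_prediction", rescale_betas_zero_snr=True
)

out_dir = Path("reg_images")
out_dir.mkdir(exist_ok=True)
for caption_file in Path("train_captions").glob("*.txt"):
    tags = caption_file.read_text().strip()
    image = pipe(prompt=tags, num_inference_steps=28, guidance_scale=5.0).images[0]
    image.save(out_dir / f"{caption_file.stem}.png")
    # Keep the same tags next to the regularization image so the trainer can pair them.
    (out_dir / f"{caption_file.stem}.txt").write_text(tags)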
List of new series/characters trained:
More previews of the trained concepts are shared on Version 2.0.
anime:
dandadan
dr. stone
gachiakuta
girumasu
gundam gquuuuuux
kaijuu no.8
kaoru hana wa rin to saku
kusuriya no hitorigoto
solo leveling
sono bisque doll wa koi wo suru
witch watch
yofukashi no uta
video-games:
elden ring nightreign
metaphor: refantazio
monster hunter wilds
fate/go (lilith)
genshin impact (citlali, escoffier, lan-yan, varesa, xilonen, yumemizuki mizuki)
honkai star rail (aglaea, castorice, cipher)
wuthering waves (carlotta, cartethyia, chisa, ciaccona, zani)
zenless zone zero (astra-zao, cipher, ju-fufu, luciana de montefio, pulchra fellini, sweety, trigger, vivian-banshee, yi xuan)
hololive:
flow glow (isaki riona, kikirara vivi, koganei niko, mizumiya su, rindo chihaya)
hoshimachi suisei (11th, caramel-pain, kireigoto, spectra-of-nova, supernova)
himemori luna (7th)
houshou marine (ahoy pirates)
natsuiro matsuri (jersey maid)
nekomata okayu (personya respect)
ookami mio (8th)
oozora subaru (police)
roboco san (oriental)
shirakami fubuki (fbkingdom)
usada-pekora (10th)
indie v-tubers:
amagai ruka
dooby
nimi nightmate
sameko saba
yuuki sakuna
other:
myaku-myaku (expo2025)
List of concepts trained:
clothing:
ancient greek clothes
chronopattern dress
jirai kei
water dress
holonatsu paradise (outfits)
concepts:
fourth wall
star trail
flower field
mechabare
monster girl
year of the snake
List of recommended style control tags:
Some intentionally tagged/curated style triggers, from 103 artist datasets:
blending
flat color
no lineart
impasto
painterly
chiaroscuro
impressionism
ink wash painting
pastel colors
pencil art
neon palette
dark
colorful
Traditional media group tags are also trained:
(some not supported by enough data)
traditional media
acrylic paint \(medium\)
ballpoint pen \(medium\)
brush \(medium\)
calligraphy brush \(medium\)
charcoal \(medium\)
colored pencil \(medium\)
color ink \(medium\)
crayon \(medium\)
gouache \(medium\)
graphite \(medium\)
ink \(medium\)
marker \(medium\)
millipen \(medium\)
nib pen \(medium\)
oil painting \(medium\)
painting \(medium\)
pastel \(medium\)
watercolor \(medium\)
Recognitions
Thanks to Laxhar Lab for the NoobAI-XL (NAI-XL) V-Pred 1.0-Version base model.
Thanks to narugo1992 and the deepghs team for open-sourcing various training sets, image processing tools, and models.
Thanks to kohya-ss for the sd-scripts trainer.
License
No modifications are made to the base model's NoobAI license, which is as follows:
This model's license inherits from https://huggingface.co/OnomaAIResearch/Illustrious-xl-early-release-v0 fair-ai-public-license-1.0-sd and adds the following terms. Any use of this model and its variants is bound by this license.
I. Usage Restrictions
Prohibited use for harmful, malicious, or illegal activities, including but not limited to harassment, threats, and spreading misinformation.
Prohibited generation of unethical or offensive content.
Prohibited violation of laws and regulations in the user's jurisdiction.
II. Commercial Prohibition
We prohibit any form of commercialization, including but not limited to monetization or commercial use of the model, derivative models, or model-generated products.
III. Open Source Community
To foster a thriving open-source community, users MUST comply with the following requirements:
Open source derivative models, merged models, LoRAs, and products based on the above models.
Share work details such as synthesis formulas, prompts, and workflows.
Follow the fair-ai-public-license to ensure derivative works remain open source.
IV. Disclaimer
Generated models may produce unexpected or harmful outputs. Users must assume all risks and potential consequences of usage.
Description
Trained on NoobAI-XL (NAI-XL) V-Pred 1.0-Version
Dataset cutoff: 2025/05/25
reForge or Forge should also be usable, as the detection issue from version 1.0 has been resolved (apologies if you ran into issues with that version).
*To be automatically detected as a v-pred model in Forge/reForge, the ztsnr and v_pred keys are added to the model's state dict using this script.
Comments
downloading now, can't wait to try it, your models are amongst the best and stay useful longer than pretty much anyone else's
Thank you, I appreciate it
@motimalu got it downloaded but it keeps spitting out pixelated messes with no coherence, I'm using 1.1 on forge
@stygianwizard42 could you try using this image's metadata? https://civitai.com/images/79448784
(generated with Forge)
Sorry, actually downloading it now to check; the CivitAI version seems to not have included the fix. Trying to remove the unpublished v1.0 and delete this model page's version before re-uploading.
Could have been a conflict with the model hash (since they are mostly identical)
Re-uploaded and checked that the version on CivitAI now matches my local fix, including the v_pred and ztsnr keys in the model state dict.
Hopefully that resolves it, and sorry for the wasted bandwidth - if you are familiar with sd-scripts and the command line, you can patch the local model yourself with the script I used.
incredible!!! so good!
What cfg settings and cfg rescale settings are recommended?
Your preferred settings for NoobAI-XL V-Pred should be fine; I haven't changed what I usually use, at least:
CFG 10 with DynamicThresholding (CFG-Fix), using the settings listed above.
Amazing work on this model! Also I have several questions about your training methodology:
Training Sessions & Continuity: You mentioned conducting 10 training sessions with ~1200 images each, plus an 11th aesthetic finetune. Could you clarify whether each "training session" represents a complete stop and restart of training, or did you save training states and continue with adjusted parameters? I'm trying to understand if these were separate fine-tuning runs or continuous training with checkpoints.
Regularization Dataset: You emphasize using a "well structured regularization dataset," but there's limited detail about this approach. Could you elaborate on:
-What constitutes a "well structured" regularization dataset in your context?
-Do regularization images require captions or tags, and if so, how do you structure them?
-How does this differ from typical class token approaches (like "man"/"woman") that seem less applicable to anime content?
Generating Regularization Images: For creating the regularization dataset:
-Is there a golden rule for the number of regularization images needed? Is it proportional to your training dataset size?
-What type of images do you generate - basic prompts like "1boy, 1girl" or specific characters that the model needs to remember?
-When generating these regularization images, do you use quality tags ("masterpiece, best quality") and hiresfix, or do you keep them as raw, unenhanced generations?
Text Encoder Training Strategy: Regarding your note about "training the text encoders when introducing novel character trigger tokens":
-Were these novel characters introduced progressively throughout training, or were they present from the beginning?
-If characters were introduced later, was text encoder training only enabled for those specific sessions?
-For characters already known by the base model but being retrained with new data, would you still recommend enabling text encoder training?
Tagging and Prompt Structure: Given your recommended prompt structure with quality tags ("masterpiece, best quality, very aesthetic") at the end:
-Did you include these aesthetic tags in your training image captions?
-If so, did you manually rate each image for these quality descriptors?
-Does tag shuffling during training affect model performance, and does your training tag structure mirror the recommended prompting structure?
Thank you!
As might be clear from my answers here, I was learning about the things you ask about as I progressed with training this checkpoint.
Different approaches and more care appear to be needed when training the full checkpoint weights.
- Each "training session" is referring to a complete stop and restart of training with adjusted parameters and datasets.
- A "well structured" regularization dataset in my context was one which includes generations from the previous checkpoint and including the concepts/characters which were trained.
- The regularization images included tags, using the tags of the previous dataset images minus tags which I did not care to preserve.
- I did not use the class token approach of "man"/"woman", and instead used .txt file with the full tags.
- The number of regularization images should be equal or fewer than the total training dataset size, I assume more is better.
- I generated the regularization images in bulk from the training dataset tags (minus some unwanted tags), if the image was tagged with "masterpiece, best quality", then the regularization image also would have received it in the tag file.
- No hirez-fix was used, but I curated the regularization dataset to only include only those without obvious faults or unwanted concepts - images with qualities that I wanted to preserve.
- Novel characters were trained more towards the final training sessions, the order matches the "List of x tags trained" sections I've added to the model card.
- TE training was introduced for the final two training sessions, in retrospect the reverse would be preferred.
- To train novel characters first with TE and then focus on stabilizing styles, other concepts, and finally quality/aesthetics without TE.
- I manually rated each image for the quality descriptors and included these in the training; the final training session uses an updated version of the dataset I used to train this LoRA: https://civitai.com/models/929497
- I did not use tag shuffling - the position of the quality modifier tags at the end of the prompt should be preserved, and my training tag structure mirrors the recommended prompting structure.
- I am not sure of the effect of tag shuffling on model performance; I do not generally use it, but theoretically it should improve generalization.
- I am considering trying it but may need to make some modifications to either sd-scripts or my dataset tags to preserve the quality modifier tags' position at the end of the prompt.
- (there is no option like keep_last_tokens and this can instead apparently be achieved with a keep_tokens_separator with the separator included in the caption files)
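- For illustration, assuming keep_tokens_separator is set to something like ||| and that the segment after a second separator is also excluded from shuffling (worth verifying against the current sd-scripts documentation), a caption file could be laid out as:
artist:artistname, 1girl, charactername ||| standing, smile, outdoors ||| masterpiece, best quality, very aesthetic
- Here the leading trigger tags and the trailing quality tags would stay fixed while the middle tags are shuffled; all tags shown are placeholders.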
@motimalu Thanks for the detailed explanation! I'd like to understand more about your quality descriptor approach:
-Do you use both positive quality tags ("masterpiece, best quality, very aesthetic") and negative quality tags (like "worst quality, low quality") in your training dataset?
-How extensive is your quality tag vocabulary - are there additional quality descriptors beyond the three you mentioned, and if so, what's the full range you work with?
-Is there a specific structure or hierarchy to how you apply these quality ratings?
@Rindo192 No problem!
Yes, I've also used negative quality tags in my training datasets; the full list I use now is "masterpiece, best quality, worst quality, low quality, very aesthetic".
In use, the order of those tags matches the list above.
Because I have limited compute for full checkpoint training, I've excluded images I would rate as "worst quality, low quality" for this checkpoint, but in larger datasets I include them sparingly for images with rare concepts/characters.
nice model.
but prompts for multiple characters (like 2girls + 1boy) are not followed well; it always combines the two girl characters together.
Thanks!
During this dataset's curation I avoided multi-person (>2) images to improve learning of novel characters in isolation, so multi-person prompting is somewhat degraded as a result.
It's something I can possibly preserve with some adjustments to regularization and training datasets in the future though.
Personally I generally do not attempt multi-person prompting in SDXL without regional prompting (ComfyUI example) or clip breaks.
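As a rough illustration of the clip-break approach (character and tag names are placeholders): the A1111-style BREAK keyword, also supported by some ComfyUI prompt extensions, splits the prompt into separately encoded CLIP chunks, which can reduce attribute bleeding between characters even though it does not assign regions:
2girls, 1boy, indoors BREAK 1girl, character a, red dress BREAK 1girl, character b, twintails BREAK 1boy, character c, black suit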
@motimalu hope to see it soon :) i like this model's aesthetic very much.
im a lazy guy, always choosing to draw cards rather than control the image accurately XD
I don't know why, but it doesn't seem to be properly recognized as v-pred in the webui dev version
Hello, the A1111 dev branch should also detect v-pred by checking the model state dict for a "v_pred" key - referencing this PR: https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/16567
I've added the "v_pred" state dict key in version 1.1 of this checkpoint, and confirmed it should be detected locally in A1111 with a test generation: https://civitai.com/images/81781967
All I can think of is that you might have a version without the state dict key.
You could add the "v_pred" key to the checkpoint yourself locally, or try downloading the 1.1 version from this model page in that case.
@motimalu It was my fault. I checked the model hash and it was probably the model before it was patched to v_pred
Thank you
@NaGaRi All good, thanks for confirming!
God, I want you to keep updating this model or make an improved variation sooo badly. Great model!
As it happens, I have been working on updating datasets and re-training this checkpoint over the past month or so. So I appreciate your comment! I'll release a new version very soon. :)
@motimalu That's fantastic. Is there any chance that you'll improve or update the Artist Collection - Study? Thank you very much!
@BUSANPERSON Thanks!
Regarding the Artist Collection - Study model,
I have tried to improve artist style learning for all new LoRAs/checkpoints since that experiment - I now use the artist:{artistname} syntax rather than by {artistname} for artist name tagging, which I think gave some small improvement when training from NoobAI v-pred, since they indicated a similar approach was used to prevent artist name and concept clashes.
@motimalu That's helpful, thank you for the tip
@motimalu Hey, I'm back and sorry to bother you. I was just wondering how you train your Lycoris with multiple artists! I just started making Loras and I published a first one and I'm eager to learn more.
Have you tagged them separately? Or is there another way to do it?
@BUSANPERSON Hello, yes - for training multiple artist styles, ensuring they have separate and unique trigger tokens should go a long way towards preventing style bleeding.
e.g. an identifier like "by " or "artist:" followed by the artist name.
@motimalu Thank you again! Props to your large lycoris dataset and how you made it possible! It would be shocking if you trained that locally rather than on civitai.
@BUSANPERSON No problem!
I do generally train locally - this checkpoint included.
This checkpoint is not trained with a LyCORIS rank-adaptation method, and the dataset is not necessarily tied to any particular training method.
It is a full-finetune using the kohya-ss/sd-scripts trainer.
The script used for training it can be found here: https://github.com/motimalu/diffusion-training-configs/tree/main/sd-scripts/SDXL/adafactor-01-full-fp32-v-pred
(Just sharing to avoid confusion on that point)
@motimalu Oh, that's just too much kindness. I would love to train locally, like you, but my 8GB VRAM disagrees. At least for what I want to do for my next lora anyway, which is combining multiple art styles into a single lora or lycoris! I would have to figure out something somehow. Thank you!