Raehoshi Illust XL
Raehoshi Illust XL is an enhanced iteration built upon the Illustrious XL model. It aims to elevate the visual style by addressing some of the original's limitations, such as oversaturation and artifact noise. While these issues are not entirely eliminated, noticeable improvements have been made. The goal is to deliver a more polished, balanced output while staying true to the strengths of the base model.
Why Early Access?
Early access helps keep the project going. I don’t have my own GPU, so all training is done on rented cloud GPUs, and that gets pretty expensive. By getting early access, you’re directly supporting the development of my models and helping me keep improving them. If you'd like to support me further, you can also buy me a coffee on Ko-fi! Every bit of help means a lot and keeps future updates coming.
Recommended setting
Positive prompt :
masterpiece, best quality, very aesthetic, absurdres
Negative prompt :
bad quality, worst quality, jpeg artifacts, sketch, bad anatomy, signature, watermark
Steps : 25+
CFG : 5-7
Sampler : euler a or dpm++2m karras (euler for vpred)
Standard resolution :
832 x 1216, 1216 x 832, 1152 x 896, 896 x 1152, 1344 x 768, 768 x 1344, 1024 x 1024
High resolution :
1024 x 1536, 896 x 1536, 1536 x 1024, 1536 x 896
Hires.fix Setting:
Upscaler : 4x Foolhardy Remacri
Hires step : 10-15
Denoise : 0.1-0.3
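If you start from a target aspect ratio rather than a fixed size, a small helper can pick the nearest recommended resolution. This is only a sketch: the resolution lists are copied from the settings above, while the function name closest_resolution and the log-ratio distance are my own assumptions, not part of the model or of any UI.

```python
import math

# Resolutions copied from the recommended settings above, as (width, height).
STANDARD_RESOLUTIONS = [
    (832, 1216), (1216, 832), (1152, 896), (896, 1152),
    (1344, 768), (768, 1344), (1024, 1024),
]
HIGH_RESOLUTIONS = [(1024, 1536), (896, 1536), (1536, 1024), (1536, 896)]


def closest_resolution(aspect_ratio, candidates=STANDARD_RESOLUTIONS):
    """Return the (width, height) pair whose width/height ratio is nearest.

    Hypothetical helper: distance is measured on a log scale so that
    portrait and landscape ratios are treated symmetrically.
    """
    if aspect_ratio <= 0:
        raise ValueError("aspect_ratio must be positive")
    target = math.log(aspect_ratio)
    return min(candidates, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))
```

For example, closest_resolution(16 / 9) selects 1344 x 768 from the standard list, and passing HIGH_RESOLUTIONS as the second argument searches the high-resolution list instead.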
Special Tags
Quality Tags:
masterpiece
best quality
good quality
average quality
bad quality
worst quality
Rating Tags:
safe
sensitive
nsfw
nsfw, explicit
Aesthetic Tags:
very aesthetic
aesthetic
displeasing
very displeasing
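The special tags can be assembled programmatically so that a typo in a quality, rating, or aesthetic tag is caught before generation. The sketch below is an assumption-laden illustration: the tag vocabularies come from the list above, but build_prompt, its defaults, and the special_first switch are hypothetical. The switch is there because a commenter further down reports that v5.0 expects quality tags at the front of the prompt, while v4 and v5.1 accept them at either end.

```python
# Tag vocabularies copied from the "Special Tags" list above.
QUALITY_TAGS = {
    "masterpiece", "best quality", "good quality",
    "average quality", "bad quality", "worst quality",
}
RATING_TAGS = {"safe", "sensitive", "nsfw", "nsfw, explicit"}
AESTHETIC_TAGS = {"very aesthetic", "aesthetic", "displeasing", "very displeasing"}


def build_prompt(subject_tags, quality="masterpiece", aesthetic="very aesthetic",
                 rating="safe", special_first=True):
    """Join subject tags with one quality, aesthetic, and rating tag.

    Hypothetical helper: the ordering of the three special tags is an
    arbitrary choice, not something the model card specifies.
    """
    checks = (
        (quality, QUALITY_TAGS, "quality"),
        (aesthetic, AESTHETIC_TAGS, "aesthetic"),
        (rating, RATING_TAGS, "rating"),
    )
    for value, allowed, kind in checks:
        if value not in allowed:
            raise ValueError(f"unknown {kind} tag: {value!r}")
    special = [quality, aesthetic, rating]
    subject = list(subject_tags)
    parts = special + subject if special_first else subject + special
    return ", ".join(parts)
```

Calling build_prompt(["1girl", "solo"]) puts the special tags first; special_first=False appends them at the end instead.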
Training Details
The model was developed using a two-stage fine-tuning process. In Stage 1, new series and characters were introduced into the model. Stage 2 focused on fixing issues and enhancing the overall style for improved output.
Stage 1
Dataset : v1: 31k, v2: 37k, v3: 34k, v4: 60k, v5/v5.1: 18k, v6: 15k, v7: 39k, v8: 41k, v9: 30k, v10: 30k (multi-resolution)
Hardware : 2x A100 80GB; v3, v4, v5, v5.1: 2x H100 80GB; v7, v8, v9, v10: RTX PRO 6000
Batch size : 32
Gradient accumulation steps : 2
Learning rate : 6e-6
Text encoder : 3e-6
Epoch : 15
Stage 2
Dataset : v1: 2.5k, v2 and v3: 2.3k, v4: 2.5k, v5: 2k, v5.1: 1.8k, v6: 1.5k, v7: 1.7k, v7.1 and v8: 4.1k, v9: 1.9k, v10: 2.4k
Hardware : 1x A100 80GB; v7, v7.1, v8, v9, v10: RTX PRO 6000
Batch size : 48
Gradient accumulation steps : 1
Learning rate : 3e-6 (v5.1: 2.5e-6)
Text encoder : disabled
Epoch : 15
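To get a feel for what these numbers imply, the effective batch size and total optimizer steps can be estimated from the figures above. This is a rough sketch under stated assumptions: the listed batch size is treated as the global batch (the card does not say whether it is per GPU), "31k" and "2.5k" are taken as exactly 31,000 and 2,500 images, the last partial batch of each epoch is kept, and multi-resolution bucketing is ignored. The function name is hypothetical.

```python
import math


def optimizer_steps(num_images, batch_size, grad_accum_steps, epochs):
    """Return (effective_batch, total_optimizer_steps) for one training run.

    Assumes batch_size is the global batch and the final partial batch of
    each epoch still triggers an optimizer step.
    """
    effective_batch = batch_size * grad_accum_steps
    steps_per_epoch = math.ceil(num_images / effective_batch)
    return effective_batch, steps_per_epoch * epochs


# Stage 1, v1 dataset: 31k images, batch 32, accumulation 2, 15 epochs.
stage1 = optimizer_steps(31_000, 32, 2, 15)   # (64, 7275)
# Stage 2, v1 dataset: 2.5k images, batch 48, accumulation 1, 15 epochs.
stage2 = optimizer_steps(2_500, 48, 1, 15)    # (48, 795)
```

Under these assumptions, Stage 1 runs roughly nine times as many optimizer steps as Stage 2, which fits its role as the pass that introduces new characters.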
List of New Series/Characters Trained:
Zenless Zone Zero
Wuthering Waves
Honkai: Star Rail
Genshin Impact
Arknights: Endfield
Umamusume
Azur Lane
Arknights
Fate/GO
Dandadan
Make Heroine ga Oosugiru!
Kusuriya no Hitorigoto
Hololive (Justice and DEV_IS)
Indie Vtuber Dooby, Yuuki Sakuna, Nimi Nightmare, and S***
100 girlfriends who really love you
Haite Kudasai, Takamine-san
Alina clover
Nikke: bready and little mermaid
Kpop Demon Hunters
The full character list is available in the article here
For character trait prompts, please refer to the Danbooru site for accurate tags and references.
License
Special thanks to Joe for supporting my works
Special thanks to Juno for supporting my works and helping me with early testing
Description
Improve stability
Improve anatomy
Fix duplicated characters/bodies in some generations
Improve high resolution generation (see recommended highres in the model detail)
Note:
Due to some issues in v5, which seem to stem from the Illustrious v2.0 base, this update rolls back to v4, which uses the Illustrious 1.1 base. This version therefore continues training from v4.
FAQ
Comments (33)
Do you have a list along with prompts (including default clothes) for the new characters you've added?
Since my dataset is from Danbooru, you can find all necessary character information there. To create a character prompt, simply search for the character on Danbooru, select an image featuring their default attire, and then copy all available tags directly into your prompt
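For anyone copying tags by hand, here is a minimal sketch of turning a copied Danbooru tag string into a prompt. It assumes the usual conventions of A1111-style UIs (underscores become spaces, parentheses are escaped because they otherwise act as attention syntax); the function name and the example tags in the usage note are illustrative only, so check the actual tag spelling on Danbooru.

```python
def danbooru_to_prompt(tag_string, escape_parens=True):
    """Convert a space-separated Danbooru tag string into a comma-separated prompt.

    Hypothetical helper. Limitation: emoticon-style tags such as ^_^ contain
    underscores that should be kept, and this simple version does not
    special-case them.
    """
    converted = []
    for tag in tag_string.split():
        text = tag.replace("_", " ")
        if escape_parens:
            # Backslash-escape parentheses so UIs do not read them as emphasis.
            text = text.replace("(", r"\(").replace(")", r"\)")
        converted.append(text)
    return ", ".join(converted)
```

With this sketch, an input such as "1girl hyacine_(honkai:_star_rail) long_hair" becomes "1girl, hyacine \(honkai: star rail\), long hair".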
@Raelina Oh okay so the characters are trained from Danbooru still, thanks for sharing this Checkpoint and training more characters.
I am very much looking forward to testing V5.1. V4 was already working really well and that with the added characters sounds very nice.
I sent you a tip on Ko-FI since i dont do crypto.
Thank you for your support. I have sent you a message on Ko-fi with the access link
How many characters does this model know?
I've listed all the characters I trained in the model details. The list only includes characters I personally trained, and does not include characters already present in the Illustrious base model.
For example, characters from Hoyo games like Zenless Zone Zero, Honkai: Star Rail, and Genshin Impact have been updated to include those available up to the current version.
Note: Some of the newest characters from the current version may not be 100% accurate in their original outfits, as there was still limited fan art available when I trained the model
Pros on v5.1 compared to v4
1. Better detailed,
2. Better ahegao, (idk if this matters, but yea, its weirdly better, not that I hate it, I love it ofcourse)
3. I love how every part has every detail in.
4. Everything looks better except for one part in Cons.
Cons on v5.1 compared v4
Eye detail on full body. Because it tries to cram in every detail, some detail on full body gets overbloated (just on the eyes), so it has a higher chance of looking bad. When it does manage, it looks good; it just has a higher fail rate on full body.
VERDICT :
As long as u dont use full body tags, its better than v4 by miles.
Thanks for the detailed review. You can try using ADetailer to fix the eye details in full body generations. It might not be a perfect solution, but it can help improve the eye quality
Nearly every model cramps the eyes in full body when no face fix is used, so ADetailer on the face is always a must if u want to get good results.
Thank you for including the training data! As someone looking to possibly get into training, I had some questions about it.
1. For the v5 dataset, does "18k" mean 18,000 images were used for training?
2. Did you handpick all images in the dataset, or did you make use of any preprepared datasets?
3. Does your dataset include any real-life images?
4. Do you initially use auto-tagging then manually edit tags for each image, or is everything manually tagged?
5. To your knowledge, is the way you train the typical method one would create a trained checkpoint?
Normally I would just test various methods and check the results myself, but since I would need to invest some money for training, I would like to limit unnecessary trial and error. Any advice you could give would be great, thanks!
1. Yes, they most likely used 18k images.
2. The dataset size is the strongest indicator it was hand-picked. When creators don't prune datasets, they typically use 200k to 1 million images and retrain the model completely.
4. Auto-tagging hallucinations are why most fine-tunes are difficult to prompt effectively. You must either manually tag everything or meticulously verify every auto-generated tag.
5. There's no true established standard method - everyone is experimenting. However, the two-step training approach works exceptionally well for teaching new characters since you can focus on the character first and reintroduce style in the second pass.
Training parameters like learning rates are determined through trial and error. Your hardware, base model, and dataset quality/size/style all significantly impact results. If renting GPUs, expect to spend days (and budget) just finding the optimal learning rate. This experimental requirement, with no definitive guidelines, is precisely why few people fine-tune successfully.
I myself have been working on a model and have the luxury of training on my own hardware and I've thrown away about 400 hours of training just because i wasn't feeling the vibe of the model, so i could imagine the losses of actually paying to rent a GPU
1. Yes, 18,000 images were used for the first-phase training.
2. My dataset went through preprocessing, I filtered out low-aesthetic images, bad tags, transparent images, and so on.
3. No real-life images were used. The dataset mainly comes from Danbooru.
4. Both, actually. The 18k dataset already had Danbooru tags. For the second training phase, I used auto-tagging.
5. I'm not sure if there's a "typical" method, since every creator has their own approach. But for reference, my method is the same as the one used in Animagine XL, since I was also one of the devs who trained it.
There’s no universal best method; you’ll need to go through trial and error. Full finetuning is expensive (I rent GPUs on RunPod), and that's one of the reasons not many creators do full finetunes. For starters, pick one GPU that you'll consistently use. Set your batch size as high as that GPU can handle, ideally using 80–90% of its VRAM. The learning rate also depends on batch size and on whether you're using multiple GPUs.
As Hysocs mentioned, it really takes a lot of experimenting to dial in the parameters for full finetuning. Don’t be afraid to iterate, it’s part of the process.
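As a starting point for that experimenting, two commonly cited heuristics adjust the learning rate when the effective batch size changes: linear scaling and square-root scaling. The sketch below only illustrates those generic rules; it is not the method used for this model, the function name is hypothetical, and any scaled value should be treated as a first guess to validate with short test runs.

```python
import math


def scale_learning_rate(base_lr, base_batch, new_batch, rule="linear"):
    """Scale a known-good learning rate to a different effective batch size.

    rule="linear": lr grows proportionally with the batch size.
    rule="sqrt":   lr grows with the square root of the batch-size ratio.
    Both are heuristics, not guarantees.
    """
    if base_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    ratio = new_batch / base_batch
    if rule == "linear":
        return base_lr * ratio
    if rule == "sqrt":
        return base_lr * math.sqrt(ratio)
    raise ValueError(f"unknown rule: {rule!r}")
```

For instance, doubling an effective batch of 64 to 128 with a base rate of 6e-6 gives 1.2e-5 under the linear rule and about 8.5e-6 under the square-root rule. With multiple GPUs, the effective batch is typically the per-device batch times the device count times the accumulation steps.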
@Raelina Thank you for the additional insight. I'm planning to use RunPod too and have done a couple of small-scale LoRA tests to confirm what the process might look like for a full finetune. Based on the info I've gathered so far, I'm going to be spending a large chunk of time just gathering a good number of images for now.
@Hysocs Thanks for explaining. My hardware is decent enough for maybe LoRA training, but realistically, I would need to make use of a rental service like RunPod unless I purchase a separate machine just for training. Honestly, I'm still a bit iffy on how I would like to approach this, or whether it would be worth doing at all, but for now I'll focus on getting the prerequisite steps done and decide from there. Worst case, maybe I'll just create some LoRAs I can share if I end up not going through with a full finetune.
Which version of Illustrious are you using on this checkpoint?
Each version uses a different base; please read "About this version" for more details
hmm doesn't mix as well to the style lora I have, but the updated character data is nice! Hyacine with lora came out well!
For some reason I still can't get Natasha from HSR correct at all tho...but the lora out there are also shit so I guess it is what it is...
Thanks for your feedback! I’ll include Natasha from HSR in my dataset for the next update to fix the character issue. If you come across any other characters with similar problems, feel free to let me know so I can improve them in future updates
Always like your model. May I know what is the exact problem of v5.0? I am currently using v5.0 and feel it is pretty good, just want to know if there is anything I have missed
it had some rules on the positioning of quality tags, and it sometimes ignored the prompt, but it's definitely good.
5.1 and 4, for example, can take quality tags at either the start or the end of the prompt.
but for 5.0 the quality tags must go in front, which makes it harder for some people using A1111/Forge, since they have to manually copy the quality tags to the front instead of letting the UI append them automatically at the end
Thank you for liking my model! The issue with v5.0 is that, in some cases, it can generate body doubles, like duplicated heads or cropped heads. It doesn't happen all the time, but it can occur under certain conditions. That’s why I released v5.1 as an alternative for those affected by this problem. If you haven’t encountered it, that’s great! Both versions share the same character knowledge, so feel free to use whichever one works best for you.
@Raelina Thanks for the quick reply! Got it so it is a minor fix. I heard that Illustrious 2.0 trained on ai generated image dataset so it has some potential issues when used as a base model for training. Anyway it is a great model!
Really hope Illustrious can further evolve with better prompt adherence and image quality!
i like these models with a small amount of downloads, unlike the popular models they update frequently and have the newest characters :D (loras dont like it when theres more than one person)
My new favorite model.
Simple prompts give good and varied results, and prompt understanding is also great.
No Ayase Seiko?
it's been there since v2. You can check the showcase image on v2 or v2.1 for an example prompt for Ayase Seiko
Will you do an anime finetune of Qwen Image?
Not sure, I don’t have enough resources to finetune a big model like the 20B parameter Qwen-Image
@Raelina what about Chroma?
@alternative_Universe Probably not; finetuning huge models with 8–20B parameters is super expensive, and I don’t have that kind of money. Maybe if someone sponsors me or donates to help with the development and training costs
@Raelina I've seen that there is now a distilled and even a lightning version of Qwen. I think I even read that it's easier to finetune now, I'm not sure, but just letting u know if u wanna take a look
@Raelina got it, thanks for focusing on the Nikke characters tho🙌🏻









