This is my experiment in creating an SD1.5 merged style model for i2i. I am satisfied with the current state, but I will update it if I feel anything is needed, or I might share ways to enhance this model rather than updating it directly.
■Since the model now has both anime and real versions, the detailed explanations have been moved to each model’s tab.
Both are merged models created by selecting multiple high-quality models with minimal artifacts.
■1024px_model
●Additionally, due to merging with NAI v2 and DoRA fine-tuning, the 1024px model has largely solved the problems associated with the 512px model. It features high resolution and much better tag adherence. Therefore, none of the drawbacks typically pointed out regarding SD1.5 apply to this specific model.
Personally, I see it as a game-changer that significantly extends the limits of SD1.5. I strongly suggest giving it a try.
●That being said, even at 1024px, fine details like the eyes are often insufficient. I highly recommend upscaling further using i2i.
●kohya_deep_shrink is also effective for t2i, so it might be a good idea to try using it.
Doing so can sometimes reduce the breakdown of backgrounds and fingers, leading to more stable results.
■512px_model
With the three models—asian, real, and anime—now available, it could be fun to adjust their mix to find your ideal style.
●asian 0.5 + real 0.5 might yield a more mixed, half-and-half look.
●asian 0.5 + anime 0.5 might produce a cute, 2.5D-style appearance.
Feel free to experiment with different ratios.
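Ratio mixing like this amounts to linear interpolation of the two checkpoints' weights. As a rough illustration (plain floats stand in for the real tensors, and a merging UI would handle the actual state dicts; the function name is my own):

```python
def merge_weights(model_a, model_b, ratio_b):
    """Linearly interpolate two checkpoints' weights.

    model_a / model_b: dicts mapping parameter names to values
    (plain floats here stand in for real tensors).
    ratio_b: contribution of model_b (0.0 = pure A, 1.0 = pure B).
    """
    return {
        key: (1.0 - ratio_b) * model_a[key] + ratio_b * model_b[key]
        for key in model_a
    }

# "asian 0.5 + real 0.5" corresponds to ratio_b = 0.5:
asian = {"unet.block.weight": 0.8}
real = {"unet.block.weight": 0.2}
mixed = merge_weights(asian, real, 0.5)
```

The same function with ratio_b = 0.5 over the asian and anime checkpoints would give the 2.5D-style mix mentioned above.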
■Since this is just a merge, it shares a common SD1.5 limitation where NSFW tags may not be fully understood or followed.
I have decided to manage the concept-enhancing LoRA separately.
https://civarchive.com/models/1253884/sd15loralab
Of course, it can be used on its own, but it is designed for i2i processing with the models below.
https://civarchive.com/models/505948/pixart-sigma-1024px512px-animetune
■Depending on the situation, this extension may also improve colors and contrast.
https://github.com/Haoming02/sd-webui-diffusion-cg
https://github.com/Haoming02/comfyui-diffusion-cg
■Using external tools for level adjustment is also a good option.
Reducing gamma slightly while enhancing whites can improve contrast even further.
Using these should help achieve color rendering closer to that of SDXL.
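As a concrete sketch of that levels adjustment on a single 0-255 channel value (gamma conventions differ between editors, so the direction and defaults here are my assumptions, not the author's exact settings):

```python
def adjust_levels(v, gamma=0.9, white_in=240):
    """Simple levels adjustment for one channel value in 0-255.

    white_in < 255 enhances whites: values at white_in and above
    become pure white. gamma < 1.0 darkens midtones slightly
    (Photoshop-style 1/gamma exponent). Conventions vary between
    tools, so treat this as a sketch, not a fixed recipe.
    """
    x = min(v / white_in, 1.0)   # stretch highlights toward white
    x = x ** (1.0 / gamma)       # reduced gamma -> deeper midtones
    return round(x * 255)
```

Applying this per channel (or via a curve/levels tool with equivalent settings) should push contrast in the direction described above.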
■Surprisingly, generating at 768px or 1024px sometimes works fine. If you want more stability, merging with Sotemix could help. But since most LoRAs are trained at 512px, high resolutions can break the output, so it's safer to use highres.fix or kohya_deep_shrink when using LoRAs.
Personally, I prefer i2i upscaling over highres.fix, as it tends to produce fewer artifacts.
■Please feel free to ask if you have any questions!
Questions in Japanese are also welcome, so please feel free to reach out!
Description
■This is a 1024px merged model for SD1.5.
Pruned Model fp16 (1.99 GB): main_merge_model
Pruned Model fp16 (5.44 GB): another_merge_model
Training Data (7.81 GB): merge_workflow_sample
■Please set Clip Skip: 2.
■The base resolution is 1024px.
The concept of this model is to enable high-resolution generation while preserving the traditional SD1.5 anime style more closely than v1.
Just like v1, the standard resolution for inference is 832x1216.
896x1344 isn't a bad choice either.
Generating at 1024x1536 is also possible, though I feel it lacks a bit of stability.
However, it can still produce some great results at that size.
kohya_deep_shrink is also effective for t2i, so it might be a good idea to try using it.
Doing so can sometimes reduce the breakdown of backgrounds and fingers, leading to more stable results.
■With i2i, you can use almost limitless resolutions; 2048x3072 provides a good balance of speed and quality. Using i2i at this resolution allows for the generation of highly detailed images.
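When picking i2i sizes, a small helper that scales the base resolution up while snapping to multiples of 64 (a safe granularity for SD latents) can save trial and error. The function name and defaults here are my own, not part of any tool:

```python
def i2i_target_size(width, height, long_side=3072, multiple=64):
    """Scale (width, height) so the long side reaches long_side,
    keeping the aspect ratio and snapping each dimension to the
    nearest multiple of 64 (SD-friendly sizes)."""
    scale = long_side / max(width, height)
    snap = lambda n: max(multiple, round(n * scale / multiple) * multiple)
    return snap(width), snap(height)

# e.g. upscaling the 832x1216 base resolution for an i2i pass:
w, h = i2i_target_size(832, 1216)
```

For 832x1216 this lands in the same ballpark as the 2048x3072 size mentioned above.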
Differences from anime_1024px_v1.0
There are various differences from v1.0.
Roughly speaking, v2.0 is more heavily specialized for anime and sits much closer to the familiar SD1.5 style. It might also be easier to use, even with simple or rough prompts.
The Finer Details:
I came to realize that it's the OUT blocks that primarily determine the style in a merge.
Because of this, I noticed that in v1.0, since I merged the OUT blocks of nai_v02 at a 0.5 ratio, the actual stylistic influence of nai_v02 was quite high.
As a result, v1.0 has a very smooth anime style that leans towards modern aesthetics, which can sometimes feel almost too clean. Consequently, the user experience can feel a bit different from your typical SD1.5.
While nai_v2 is highly flexible, its style and quality change drastically depending on whether or not you use aesthetic prompts.
nai_v1 is similar, but thanks to heavy community merging, it has often been conditioned to forcefully output a good style regardless of the prompt. Therefore, when merging the OUT blocks of these two, the "forcing" effect is reduced by nai_v2's influence, leaving you with a smoother anime style.
If you are highly familiar with how to use aesthetic prompts for both nai_v2 and nai_v1, you can get great results through their synergy. However, if you treat it with a standard SD1.5 mindset, you might not be unlocking its full potential.
How anime_1024px_v2.0 improves this:
For v2.0, I did not merge the OUT blocks of nai_v2.
This means the IN and MID blocks gained the flexibility of nai_v2, but the OUT blocks retain the existing nai_v1 style.
Because of this, it functions like a standard SD1.5 model but with 1024px support and broad tag recognition, all while maintaining the diverse and artistic generation capabilities that make SD1.5 so great.
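The v2.0 recipe described above (blend the IN and MID blocks, keep the base model's OUT blocks) can be sketched as a key-filtered merge. Real SD1.5 checkpoints prefix these keys with model.diffusion_model. (omitted here for brevity), and plain floats stand in for tensors:

```python
def block_merge(base, other, ratio, merge_out_blocks=False):
    """Merge 'other' into 'base' per UNet block group.

    SD1.5 UNet parameter names fall under input_blocks.,
    middle_block., or output_blocks. The v2.0 idea: blend IN
    and MID blocks, but leave the base model's OUT blocks
    (which largely determine style) untouched.
    """
    merged = {}
    for key, value in base.items():
        if key.startswith("output_blocks.") and not merge_out_blocks:
            merged[key] = value  # keep the base model's style blocks
        else:
            merged[key] = (1.0 - ratio) * value + ratio * other[key]
    return merged
```

Setting merge_out_blocks=True instead would reproduce the v1.0-style result, where the other model's OUT blocks also shape the style.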
v1 vs. v2:
Please note that the older v1 is not inferior to v2. They just have different goals.
The goal of v2 is specifically to preserve the traditional aesthetic anime style of SD1.5.
With v1, I was aiming for a more modern look by merging a wider variety of models (such as realistic and anime styles) and training with high-quality AI images. Because of that, v1 might sometimes give you a richer, semi-realistic feel.
Please use whichever fits your preference! Or, merging the two together might be an interesting experiment.
Bonus Models:
As a bonus, I’ve also provided merged models with different ratios.
These do include nai_v2 in the OUT blocks, so their outputs are smoother. In some cases, they might even offer better stability. Feel free to try them out and see what you like.
Resources:
Just like with v1, I am sharing the model assets and the workflow I used for these merges, so please use them as a reference for your own merging projects.
The beautiful style of this merge, for both characters and backgrounds, is largely thanks to the model below. So, I just wanted to say a huge thank you to its creator here!
https://civitai.com/models/1490223/2d2d
■Using the quality prompts below might help stabilize your output quality. On the other hand, not using them reduces style fixation, allowing for more flexible generation.
I have set the weight of these quality prompts to 1.2. This is a combination of the quality prompts from both nai_v2 and nai_v1. If you feel the effect is too strong, please feel free to lower the weight.
Prompt
(very aesthetic, aesthetic, best quality, amazing quality, masterpiece:1.2), absurdres
Negative prompt
(very displeasing, displeasing, worst quality, bad quality, low quality:1.2), dated, deformed, bad anatomy, disfigured, flat color, lack of texture, flat shading, comic, error, bad, extra, fewer,