CivArchive

    [Proudly introducing AnySomniumXL v3, an SDXL model]

    You can support me on Ko-Fi

    AnySomniumXL v3 is an SDXL model with a 2D (cartoonish) style. It is trained on top of the base SDXL model (SDXL Base v1.0), with additional text encoder training, so that it generates a 2D style from natural-language prompts and is unlikely to fall back to the realistic style inherent in SDXL Base.

    The model is trained on 133,000+ images curated from hundreds of thousands of images from various sources. An image is kept only if its aesthetic score is at least 17 and at most 50, and it contains no text or watermarks such as signatures or comic/manga images. The upper bound keeps the model cartoonish and not too realistic; the scale comes from our proprietary aesthetic scoring mechanism. Images that score below 17 or above 50, or that contain watermarks or text, are discarded.
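
    The curation rule above can be sketched in a few lines. This is a minimal illustration, not our actual pipeline: the `ImageRecord` type, its field names, and the `curate` helper are hypothetical, and only the thresholds (17 and 50) and the text/watermark exclusion come from the description.

```python
from dataclasses import dataclass

# Thresholds taken from the description; everything else is hypothetical.
MIN_AESTHETIC = 17
MAX_AESTHETIC = 50


@dataclass
class ImageRecord:
    """Hypothetical per-image metadata produced by an earlier scoring pass."""
    path: str
    aesthetic_score: float
    has_text: bool
    has_watermark: bool


def keep_image(rec: ImageRecord) -> bool:
    """Return True if the image passes the curation rule."""
    in_range = MIN_AESTHETIC <= rec.aesthetic_score <= MAX_AESTHETIC
    return in_range and not rec.has_text and not rec.has_watermark


def curate(records):
    """Keep only the records that pass keep_image, preserving order."""
    return [rec for rec in records if keep_image(rec)]
```

    Both bounds are inclusive here, matching "at least 17 and a maximum of 50".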

    AnySomniumXL v3 Technical Specifications:

    • Training: 16 epochs, in steps of 1 epoch (AnySomniumXL results use epoch 16)

    • Captioned by a proprietary multimodal LLM (better than LLaVA)

    • Trained with a bucket size of 1280x1280

    • Shuffle Caption: Yes

    • Clip Skip: 2

    • Trained with 2x NVIDIA A100 80GB

    The dataset pipeline combines the CLIP model with the MLP scoring method by christophschuhmann, which we modified. It uses ViT-L/14 to produce aesthetic scores on a scale of -1 to 100, and we added our own watermark detection.
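
    As a rough sketch of how such a scoring pass fits together: an image is embedded, the embedding is passed to a scoring head, and a separate detector flags watermarks. This is an illustration under stated assumptions, not our implementation. The names `score_image`, `embed`, `head`, and `detect_watermark` are hypothetical; the real embedder (CLIP ViT-L/14), MLP weights, and watermark detector are not shown; and both the L2 normalization and the clamping to the -1 to 100 scale are assumptions.

```python
import math

SCORE_MIN, SCORE_MAX = -1.0, 100.0  # scale stated in the description


def l2_normalize(vec):
    """Scale a vector to unit length (assumed preprocessing step)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def score_image(image, embed, head, detect_watermark):
    """Combine embedding, scoring head, and watermark check for one image.

    embed, head, and detect_watermark are caller-supplied callables that
    stand in for the CLIP encoder, the MLP, and the watermark detector.
    """
    embedding = l2_normalize(embed(image))
    raw_score = float(head(embedding))
    # Clamping is an assumption; it only keeps the value on the stated scale.
    score = max(SCORE_MIN, min(SCORE_MAX, raw_score))
    return {
        "aesthetic_score": score,
        "has_watermark": bool(detect_watermark(image)),
    }
```

    Passing the three components in as callables keeps the sketch independent of any particular CLIP or detector library.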

    Achievements:

    ✓ Produces a more 2D style from natural-language prompts by default, without the need for excessive negative or positive prompts

    ✓ Likely to produce better fingers than the average Stable Diffusion model, without ADetailer or inpainting

    ✓ Produces a more authentic 2D style without the need for negative prompts

    ✓ Does not produce images with random watermarks or text

    Limitations:

    ✓ Slightly struggles with characters holding objects such as weapons or items correctly

    ✓ Still requires broader dataset training

    ✓ There are still some gaps in the text encoder. There is room for improvement

    ✓ Text cannot be generated correctly

    ✓ The model is optimized for generating humans or mutated humans. Non-human subjects such as SCPs, ponies, and others may not turn out as you expect

    AnySomniumXL v3 Pro tips:

    Because AnySomniumXL v3 was trained at 1280x1280, the best resolution for many aspect ratios may differ from that of the standard SDXL model.

    Best resolutions (swap the two numbers to switch between landscape and portrait):

    • 1280x1280

    • 1472x1088

    • 1152x1408

    • 1536x1024

    • 1856x832

    • 1024x1600
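
    To choose among the resolutions above for a given aspect ratio, a small helper like the one below can pick the closest match. It is only a convenience sketch: the names `all_orientations` and `closest_resolution` are hypothetical, and the list simply mirrors the resolutions above plus their flipped counterparts.

```python
import math

# Recommended resolutions from the list above, as (width, height).
RECOMMENDED = [
    (1280, 1280),
    (1472, 1088),
    (1152, 1408),
    (1536, 1024),
    (1856, 832),
    (1024, 1600),
]


def all_orientations(resolutions):
    """Return each resolution plus its flipped (landscape/portrait) variant."""
    seen = []
    for width, height in resolutions:
        for pair in ((width, height), (height, width)):
            if pair not in seen:
                seen.append(pair)
    return seen


def closest_resolution(aspect_ratio):
    """Pick the (width, height) whose width/height is nearest to aspect_ratio.

    Distances are compared in log space so that portrait and landscape
    ratios are treated symmetrically.
    """
    if aspect_ratio <= 0:
        raise ValueError("aspect_ratio must be positive")
    target = math.log(aspect_ratio)
    return min(
        all_orientations(RECOMMENDED),
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - target),
    )


print(closest_resolution(1.0))     # (1280, 1280)
print(closest_resolution(3 / 2))   # (1536, 1024)
print(closest_resolution(16 / 9))  # (1600, 1024)
```

    Every resolution in the list has dimensions that are multiples of 64 and a pixel count no larger than 1280x1280.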

    More versions will follow, with broader datasets and a trained text encoder. Our target is to produce the largest clean dataset for our training. We recommend using this model with the Automatic1111 web UI.

    Description

    • Higher resolution 1280x1280 (HD) vs 1024x1024

    • More concepts trained

    • Trained on 133,000+ images from various sources

    • Better prompt understanding

    • Captioned with a more robust proprietary multimodal LLM

    • Better at holding objects

    • Better detail

    • More variety of characters

    • Dataset cutoff is December 2023

    • Better at understanding text

    Checkpoint
    SDXL 1.0
    by NCAI

    Details

    Downloads: 199
    Platform: CivitAI
    Platform Status: Available
    Created: 1/27/2024
    Updated: 9/27/2025
    Deleted: -

    Files

    anysomniumxl_v3.safetensors

    Mirrors

    CivitAI (1 mirror)