CivArchive
    natvis-natural-vision - v2.0
    NSFW

    Description

    NatViS (Natural Vision) is a photorealistic full-parameter fine-tune of SDXL that uses natural language prompting to generate high-quality SFW/NSFW images. It was trained on 1M+ image-caption pairs from a dataset that has been expanded and refined for over a year.

    Changelog

    9-25-24 NatViS v2.0

    What's New?

    Prompting: This update focuses primarily on the text encoders. Natural language prompting has been improved to follow less strict formats and to rely less on specific tokens.

    Ethnicity and Demonym: Increased accuracy of phenotypes for various ethnicities and demonyms. This is not limited to body structure; it also includes clothing, hair, landscapes, etc. See here for small examples.

    Camera EXIF: Camera EXIF data for popular modern and analog cameras can now be prompted. This includes camera name, focal length, f-stop, ISO, shutter speed, and lens type, as well as attachments such as ND filters and polarizers.

    Analog: Improvements to analog and vintage photograph generations.

    Lighting and shadow: Prompt how light (or the lack thereof) interacts with objects/subjects in the scene, among other general lighting-related modifiers. More info soon.

    Skin Textures: Small improvements to the detail of skin textures, with fewer or no explicit tokens related to skin detail needed.

    Implementation of Pseudo Instruction: This will require a lengthier write-up.

    Better male anatomy.

    Lesbians.

    Will NatViS Understand Everything I Tell It?

    Absolutely not. Due to limitations in both the architecture and the amount of data I'm able to fine-tune on as one person, there will be instances where the model simply will not generate what you want. Often, you will need to experiment with different wording, placement of tokens (i.e., moving a sentence or individual token closer to the start or end of a prompt), removing potentially conflicting tokens, etc. There really is no definitive solution I can offer, as it varies from prompt to prompt.
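
The trial-and-error described here (moving a sentence toward the start or end of a prompt, or removing a possibly conflicting piece) can be made a little more systematic. What follows is only a minimal sketch in plain Python, not something that ships with NatViS: the helper names `placement_variants` and `drop_one_variants` are hypothetical, and the example sentences are made up for illustration. It only builds candidate prompt strings; generating and comparing the images is still up to you.

```python
from typing import List


def placement_variants(sentences: List[str], focus_index: int) -> List[str]:
    """Build prompts with one sentence left in place, moved to the start, and moved to the end."""
    focus = sentences[focus_index]
    rest = sentences[:focus_index] + sentences[focus_index + 1:]
    candidates = [sentences, [focus] + rest, rest + [focus]]

    prompts: List[str] = []
    for parts in candidates:
        prompt = " ".join(parts)
        # Skip duplicates, e.g. when the focus sentence is already first or last.
        if prompt not in prompts:
            prompts.append(prompt)
    return prompts


def drop_one_variants(sentences: List[str]) -> List[str]:
    """Build prompts that each omit exactly one sentence, to help spot a conflicting one."""
    return [
        " ".join(sentences[:i] + sentences[i + 1:])
        for i in range(len(sentences))
    ]


# Made-up example prompt, split into sentences by hand.
example = [
    "A candid photo of a woman reading in a cafe.",
    "Soft window light falls across the table.",
    "She wears a red wool coat.",
]

for candidate in placement_variants(example, focus_index=1):
    print(candidate)
```

Running each candidate with the same seed and settings makes it easier to tell whether a change in the image comes from the wording or placement rather than from random variation.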
    Unfortunately, there will be times when no solution/workaround is successful.

    Can I still use Tags?

    Short answer: Yes

    SDXL's dual text-encoder/tokenizer architecture can process tokens/sequences with both encoders in parallel. This means you don't have to use natural language prompting.

    Note: Since the training data was captioned purely with natural language descriptions, not all the common descriptive tags people are familiar with will be understood by the model, especially Booru and Booru-style tags.

    I found a hybrid system works well