CivArchive

    A model I originally finetuned for making my cartoon/anime LoRAs - but I accidentally ended up just making a model and kept iterating on it. Be careful -- the Danbooru images I used tend to move it into NSFW territory. Use Danbooru tags. All images are straight from the model -- no LoRA or inpainting. Use Art by __________ as the first tag. Can do SFW or NSFW. A list of artist styles for Dynamic Prompts can be found at https://files.catbox.moe/w413ru.txt and comparisons at https://mega.nz/folder/gf1xlKBI#FNRxuVUAU-fjx_khegWYhA

    Blah blah blah buy me a kofi https://ko-fi.com/digimdkofi

    Description

    Continued training.

    A model I finetuned for making my cartoon/anime models - not really designed to be used by itself, but a branch model for making LoRAs. Be careful -- the Danbooru images I used tend to move it into NSFW territory. Use Danbooru tags. All images are straight from the model -- no LoRA or inpainting. Use Art by __________ as the first tag. Can do SFW or NSFW. A list of artist styles for Dynamic Prompts can be found at https://1fichier.com/dir/xazkNCbB and comparisons at https://mega.nz/folder/gf1xlKBI#FNRxuVUAU-fjx_khegWYhA.

    Blah blah blah buy me a kofi https://ko-fi.com/digimdkofi

    FAQ

    Comments (12)

    Korewaai · Jan 31, 2024 · 2 reactions

    What is the base model you trained on top?

    spadira272
    Author
    Feb 1, 2024 · 1 reaction

    Chimera. I merged my CatEar and SDXL, then merged that with hakurei/waifu-diffusion-xl · Hugging Face, then trained forever for 1.0. Then I merged in Blue-Pencil @ .25 and trained a lot more. Every so often I'll fold base SDXL or WDXL back in and start training again. The dataset is a curated and growing selection of images from various artists, plus 30 or so of the top 100 rated pictures for each of the top 100 characters from Danbooru. I just recently had to restart from v.2 since I tried to up the learning rate and ended up with completely fried images (identical tags would produce near-identical images. :/)

    Korewaai · Feb 1, 2024 · 1 reaction

    @spadira272 Could you consider training on top of Pony? Base Pony is hard to steer style-wise. By the way, there is a merge of your model with a Pony derivative - very good results.

    spadira272
    Author
    Feb 1, 2024

    @Korewaai Thanks! - I'll have to check out the merge so I can heart them~

    D00derino584 · Feb 1, 2024 · 1 reaction

    I'm curious, how many images are in the training dataset?

    spadira272
    Author
    Feb 2, 2024 · 2 reactions

    @D00derino584 Not that many - I think the first run was ~16,000? Now it's up to 32,694. My thought process is that finetuning should be done with well-captioned, curated, high-quality images, not just tons and tons of images from Danbooru thrown at it (thankfully, WD and SDXL did that part). I do ~30-50 of the best images from each artist (with more from ones I like or find interesting), well captioned with Art by Artist (SD upscaled to at least 2160*2160 -- the trainer downscales them, of course). Then a bunch of the best-quality anime, retro anime, or cartoon images with generic captions. Then run and hope the RNG gods smile upon me. I train text encoder 2 at 1/4 the learning rate.

    Korewaai · Feb 2, 2024

    @spadira272 "SD upscaled to at least 2160*2160" - why?

    spadira272
    Author
    Feb 2, 2024

    @Korewaai Mainly to be dummy-proof -- I didn't want to do the math for what size I needed for each aspect ratio, and I wanted some leeway in case I had to crop a face, etc. I just made a quick script, below:

    # Add the necessary assembly for System.Drawing
    Add-Type -AssemblyName System.Drawing

    # Define the directory path, new folder for small images, and report file path
    $directoryPath = "C:\path\to\your\images\directory"
    $newFolderPath = Join-Path $directoryPath "SmallImages"
    $reportFilePath = "C:\path\to\your\sortedImagesReport.txt"

    # Create the new folder if it doesn't exist
    if (-not (Test-Path -Path $newFolderPath)) {
        New-Item -Path $newFolderPath -ItemType Directory
    }

    # Get all image files in the directory
    $imageFiles = Get-ChildItem -Path $directoryPath -File | Where-Object { $_.Extension -match 'jpg|jpeg|png|gif' }

    # Move images with fewer than 4,665,600 (2160 * 2160) pixels to the new folder
    foreach ($file in $imageFiles) {
        $image = [System.Drawing.Image]::FromFile($file.FullName)
        $pixelCount = $image.Width * $image.Height
        $image.Dispose()
        if ($pixelCount -lt 4665600) {
            Move-Item -Path $file.FullName -Destination $newFolderPath
        }
    }

    # Generate a report of remaining images sorted by pixel count
    $imageDetails = Get-ChildItem -Path $directoryPath -File | Where-Object { $_.Extension -match 'jpg|jpeg|png|gif' } | ForEach-Object {
        $image = [System.Drawing.Image]::FromFile($_.FullName)
        $pixelCount = $image.Width * $image.Height
        $image.Dispose()
        [PSCustomObject]@{ Name = $_.Name; PixelCount = $pixelCount }
    }
    $sortedImages = $imageDetails | Sort-Object -Property PixelCount
    $sortedImages | Out-File -FilePath $reportFilePath

    # Display the report file content (optional)
    Get-Content -Path $reportFilePath

    to filter all images smaller than 2160*2160 into a separate folder so I could SD upscale them at .01 if they were too small. I didn't want to run into the issue where it tried to bucket something like a 256*1024 image or something crazy.
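The 4,665,600-pixel cutoff in the script is just 2160 * 2160. A minimal Python sketch of the same check (the function name here is illustrative, not from the original script):

```python
# Minimal Python sketch of the size check used in the PowerShell script above:
# anything below 2160 * 2160 pixels gets set aside for SD upscaling.
MIN_PIXELS = 2160 * 2160  # 4,665,600

def needs_upscale(width: int, height: int) -> bool:
    """True if an image falls below the 2160*2160 pixel cutoff."""
    return width * height < MIN_PIXELS
```

For example, a 256*1024 image (262,144 pixels) is far below the cutoff, while a 2160*2160 image exactly meets it.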

    fikxzer · Jun 4, 2024

    @spadira272 When you said "Then a bunch of the best-quality anime, retro anime, or cartoon images with generic captions," what do you mean by generic captions - no WD ViT tagging? Just a few hand-written tags? I find all styles of training interesting, and I find your models very creative. Probably the most underrated models on Civitai (Confetti3 is probably my fav, but I like the others as well), and having many images trained on a few tags might help with the creativity.

    spadira272
    Author
    Jun 4, 2024 · 1 reaction

    @fikxzer Originally, I used WD ViT with a .75 confidence level. The idea was that this gave the non-artist-tagged images a loose concept and gave SD a broad understanding of those concepts. In my current tests, I duplicate the image, then have one version tagged with WD-vit-3 at .7 and the duplicate tagged at .55. Then I caption the image tagged at .55 using CogVLM with the following prompt: "Describe the image using the following tags as guidance: TAGS TAGS TAGS. Ensure the background is well described." Followed by human correction. I use no quality tags -- I never understood using quality tags, as I would assume they cause concept bleed between prompts, and what is the difference between a "good" image and a "masterpiece"?

    I used Python and a vision LLM to do this at first -- but taggui meets all of those needs now and is less hassle. GitHub - jhc13/taggui: Tag manager and captioner for image datasets
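The two-threshold scheme described above could be sketched roughly as follows. Only the prompt text is taken from the comment; the tagger output format and both helper functions are illustrative assumptions (the actual WD-vit-3 and CogVLM calls happen in external tools):

```python
# Rough sketch of the duplicate-and-tag scheme: one copy of each image keeps
# a high-confidence (.7) tag list, the duplicate keeps a looser (.55) list,
# and the loose list is folded into a CogVLM caption prompt. The scored-tags
# dict below stands in for whatever the WD-vit-3 tagger wrapper returns.

def filter_tags(scored_tags: dict[str, float], threshold: float) -> list[str]:
    """Keep only tags whose confidence meets the threshold."""
    return [tag for tag, score in scored_tags.items() if score >= threshold]

def build_caption_prompt(tags: list[str]) -> str:
    """Assemble the CogVLM prompt quoted in the comment above."""
    return ("Describe the image using the following tags as guidance: "
            + ", ".join(tags)
            + ". Ensure the background is well described.")

# Hypothetical tagger output for one image:
scored = {"1girl": 0.95, "outdoors": 0.72, "retro artstyle": 0.60}
strict_tags = filter_tags(scored, 0.7)   # kept on the first copy
loose_tags = filter_tags(scored, 0.55)   # used to caption the duplicate
prompt = build_caption_prompt(loose_tags)
```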

    fikxzer · Jun 4, 2024

    @spadira272 very interesting!