Added ~60k more images
Lots more image classifiers trained, and existing classifiers improved
Training Notes:
~300k images from Danbooru, r-34, e621, etc.
LR 2e-4
TE_LR 5e-5
batch 4
GA 32
dim 64 (resized with kohya, with sv_ratio=10, and max dim 64)
conv_dim 32
alpha 32
conv_alpha 32 (I meant to set this to conv_dim/2, but oh well)
scheduler: cosine_with_restarts, 33 restarts
base model NAI
No --vae
flip aug
clip skip 2
225 token length
bucketing at 1024 max 1024 (I should have stuck with max_bucket_size=768 for reduced training time)
tag drop chance 0.1
network_dropout 0.33
scale_weight_norms 1 (either this or dropout resulted in less blown-out results, which allowed for longer training at a higher LR)
tag shuffling
50,000 unique tags, but about 500 frequently occurring ones (see attached training data for tag list)
about 14 days training time (slower than the last run because of the bucket size choice)
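
For anyone trying to reproduce the settings above, here is a rough sketch of what batch 4 with GA 32 and the cosine_with_restarts schedule work out to. It is not the trainer's actual code. It assumes "33 restarts" means 33 cosine cycles, uses the common hard-restart cosine formula with no warmup, and ignores bucketing and dataset repeats. The total step count in the example is a made-up number, since the real one is not listed in these notes.

```python
import math

# Values taken from the training notes.
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 32
BASE_LR = 2e-4
NUM_CYCLES = 33  # assumption: "33 restarts" maps to 33 cosine cycles


def effective_batch_size(batch_size=BATCH_SIZE, grad_accum=GRAD_ACCUM_STEPS):
    """Images seen per optimizer step."""
    return batch_size * grad_accum


def optimizer_steps_per_epoch(num_images, batch_size=BATCH_SIZE,
                              grad_accum=GRAD_ACCUM_STEPS):
    """Approximate optimizer steps per pass over the data.

    Ignores bucketing and repeats, so treat the result as a ballpark.
    """
    return math.ceil(num_images / effective_batch_size(batch_size, grad_accum))


def cosine_with_restarts_lr(step, total_steps, base_lr=BASE_LR,
                            num_cycles=NUM_CYCLES):
    """Hard-restart cosine schedule without warmup (assumed formula).

    The LR decays from base_lr toward 0 within each cycle, then jumps
    back to base_lr at the start of the next cycle.
    """
    if step >= total_steps:
        return 0.0
    # Position inside the current cycle, in [0, 1). Integer arithmetic
    # keeps the cycle boundaries exact.
    cycle_pos = ((step * num_cycles) % total_steps) / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * cycle_pos))


if __name__ == "__main__":
    print("effective batch:", effective_batch_size())
    print("steps per epoch (~300k images):", optimizer_steps_per_epoch(300_000))

    total_steps = 3300  # hypothetical, chosen so each cycle is 100 steps
    for step in (0, 50, 99, 100):
        print(step, cosine_with_restarts_lr(step, total_steps))
```

With these numbers, each optimizer step averages gradients over 128 images, so one pass over ~300k images is only a couple of thousand optimizer steps. The LR returns to its full 2e-4 at the start of every cycle.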