Going back to my original LoRA concept. V1 is currently overfitted. It's very difficult to get it to be flexible, and it takes some creative prompting to get much more than a closeup.
V1.1 is much more flexible, albeit not as detailed. Still a work in progress. Please share what you create with it! I'd love to see it!
Did some more tweaking with V1.2, and I really like how it came out.
Comments (12)
Try 10 repeats for the amount of images in your dataset.
batch 2.
cosine scheduler
prodigy optimizer
dim/rank 64/32
unet learning rate 1
text encoder learning rate 0 ***
min snr gamma 5
*** so by setting the text encoder learning rate to zero, you are basically teaching the lora to rely on the text encoder from the checkpoint you use the lora with.
sometimes when you manually tag images, the checkpoint doesn't really understand the words you tag with, since it has its own language. by forcing your tags into the lora, the lora's version of those words overrides what the checkpoint's text encoder knows about them, so the lora always overpowers your output. a tag like "pussy" will then draw on your lora's unet training data, forcing closeups.
with text LR 0 in training, your trained lora instead uses the checkpoint's encoded text knowledge of pussy together with your lora's unet image knowledge of pussy, and applies both to your final generated image. that gives you a more flexible lora that is less likely to overfit.
that said, still use your tags. it is better to have them even with text LR 0
hope this helps with the overfitting.
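For anyone running a local trainer, here is a rough sketch of how the settings above could be written down as command-line arguments. The flag names are the ones I recall from kohya-ss sd-scripts, so treat them as assumptions and check them against your installed version. Reading "dim/rank 64/32" as network dim 64 with alpha 32 is also my assumption. Repeats are not a flag here because they are normally set on the dataset side.

```python
# Sketch only: flag names follow kohya-ss sd-scripts as best I recall.
# Verify them against your own trainer before launching a real job.
settings = {
    "train_batch_size": 2,
    "lr_scheduler": "cosine",
    "optimizer_type": "Prodigy",
    "network_dim": 64,       # assumed reading of "dim/rank 64/32"
    "network_alpha": 32,     # assumed reading of "dim/rank 64/32"
    "unet_lr": 1.0,          # Prodigy adapts the step size, hence LR 1
    "text_encoder_lr": 0.0,  # the *** setting: do not train the text encoder
    "min_snr_gamma": 5,
}


def to_cli_args(cfg):
    """Flatten a settings dict into a ['--flag', 'value', ...] list."""
    args = []
    for key, value in cfg.items():
        args += [f"--{key}", str(value)]
    return args


cli_args = to_cli_args(settings)
print(" ".join(cli_args))
```

The list can be appended to whatever launch command your trainer uses. The on-site trainer exposes most of the same values as form fields instead.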
for anything sdxl, pony or illustrious, it's best to work with 200-400 steps per epoch, counted as images x repeats. so if you've got 20 images, i would say use 12 repeats. that's 240 steps per epoch, and with 10 epochs that's 2400 steps total for your entire training. your batch size then divides that amount, so if you're using batch size 3, your lora will effectively be trained for 800 total steps after 10 epochs, at 80 steps per epoch. with this setup the 10th and final epoch usually produces the best results.
with 30 images you can use 10 repeats, or 9, or even 8, as long as your dataset size x repeats lands between 200 & 400
It's just a bit frustrating reading reddit when people say "yeah nah I do between 2000-4000 steps training to get good lora". like what's the breakdown, you know?
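To make that breakdown concrete, here is a small sketch of the arithmetic. The function names are made up for illustration, and it assumes steps per epoch is simply images x repeats divided by batch size, rounded up. A real trainer with bucketing may come out slightly different.

```python
import math


def steps_breakdown(num_images, repeats, epochs, batch_size=1):
    """Return samples per epoch, optimizer steps per epoch, and total steps."""
    samples_per_epoch = num_images * repeats
    # Each optimizer step consumes one batch, so round up partial batches.
    steps_per_epoch = math.ceil(samples_per_epoch / batch_size)
    return {
        "samples_per_epoch": samples_per_epoch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * epochs,
    }


def repeats_in_range(num_images, low=200, high=400):
    """List the repeat counts that keep images x repeats within [low, high]."""
    return [r for r in range(1, high + 1) if low <= num_images * r <= high]


print(steps_breakdown(20, 12, 10))                # 240 per epoch, 2400 total
print(steps_breakdown(20, 12, 10, batch_size=3))  # 80 per epoch, 800 total
print(repeats_in_range(30))                       # includes 8, 9 and 10
```

So when someone quotes "2000-4000 steps", it only means something once you know the image count, repeats, epochs and batch size behind it.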
@emerycum Thanks so much! I gave that a go, I'll see how it turns out after work!
@LylahLuna 🤗🤗
could this be tested by setting lora TE weight to zero after the fact?
@emerycum Used those settings. It is soooo much more flexible. I did notice the intricate details are lessened a bit though. Gonna keep messing around, do some reading and keep working on it till it's perfect!
I really appreciate the help, I never would have figured that out on my own.
@LylahLuna Hey! You're welcome! glad it's working for you too! when I experienced detail loss, I made some changes to the settings I use in the KohyaSS gui. not sure if you're using the on-site trainer? but I can check what settings I used and let you know. there are also two ways to mitigate detail loss that work in any trainer, without advanced settings: I used slightly more images in the training data set (for example, where I had a set of 45 imgs, I used 60-80 and reduced repeats to 6-7), and I trained at 1216 resolution instead of 1024
@r5k I wish I could answer your question, but I do not know how to zero a text encoder on a trained lora. I think there are ways using scripts for finetuning checkpoints, but I haven't thought about it as far as loras are concerned.
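One possible way to try it, as a sketch rather than a tested recipe: drop the text encoder tensors from the lora file so that only the unet part gets applied. This assumes the lora uses kohya-style key names, where text encoder keys start with `lora_te` and unet keys start with `lora_unet`. The function names are hypothetical. Also note that this is not the same as training with text LR 0, because the unet weights in an existing lora were learned alongside the modified text encoder.

```python
# Assumed kohya-style prefixes for text encoder tensors (SD1.x and SDXL).
TE_PREFIXES = ("lora_te_", "lora_te1_", "lora_te2_")


def drop_text_encoder_weights(state_dict):
    """Return a copy of the lora weights without any text encoder tensors."""
    return {k: v for k, v in state_dict.items() if not k.startswith(TE_PREFIXES)}


def strip_lora_file(src_path, dst_path):
    """Load a .safetensors lora, remove TE tensors, save the rest.

    Needs the safetensors and torch packages. File metadata is not copied.
    """
    from safetensors.torch import load_file, save_file

    weights = load_file(src_path)
    save_file(drop_text_encoder_weights(weights), dst_path)


# Tiny demo with fake keys and placeholder values instead of real tensors.
demo = {
    "lora_unet_down_blocks_0.lora_down.weight": 1,
    "lora_te1_text_model_encoder.lora_up.weight": 2,
    "lora_te2_text_model_encoder.alpha": 3,
}
print(sorted(drop_text_encoder_weights(demo)))
```

In ComfyUI, setting `strength_clip` to 0 on the Load LoRA node should give a comparable test without editing the file.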
@emerycum Yea, I use the onsite trainer. I tried training on comfy and flux gym, but I've only got 8G of VRAM so it's reeeeeally slow.
@LylahLuna I know the struggle. I got a larger card just to try flux, ended up not liking flux at all, but it's a huge bonus for training. and games, i guess 😂 the on-site trainer is sufficient, I just wish you could change the prodigy args, because setting weight_decay=0.2 instead of weight_decay=0.5 makes a huge difference to the way weights are normalized during training as well. the other settings are rank norm and rank dropout, which I'm not sure can be adjusted in the on-site trainer
@LylahLuna thank you for the buzz 🤗🤗
@emerycum Used these settings for V1.2. I like how it came out so far.
"engine": "kohya",
"unetLR": 0.0001,
"clipSkip": 1,
"loraType": "lora",
"keepTokens": 1,
"networkDim": 32,
"numRepeats": 10,
"resolution": 1024,
"lrScheduler": "cosine",
"minSnrGamma": 5,
"noiseOffset": 0.05,
"targetSteps": 2160,
"enableBucket": true,
"networkAlpha": 18,
"optimizerType": "AdamW8Bit",
"textEncoderLR": 0.00002,
"maxTrainEpochs": 12,
"shuffleCaption": true,
"trainBatchSize": 1,
"flipAugmentation": true,
"lrSchedulerNumCycles": 1

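As a quick sanity check, the V1.2 settings above can be compared against the 200-400 steps per epoch rule of thumb from the comments. This sketch assumes `targetSteps` equals images x repeats x epochs divided by batch size, which is my guess at how the on-site trainer derives it. Only the four fields needed for the arithmetic are copied in.

```python
import json

# Subset of the settings fragment above, wrapped in braces so it parses as JSON.
fragment = """
"numRepeats": 10,
"targetSteps": 2160,
"maxTrainEpochs": 12,
"trainBatchSize": 1
"""
cfg = json.loads("{" + fragment + "}")

# Assumption: targetSteps = images * repeats * epochs / batch size.
steps_per_epoch = cfg["targetSteps"] // cfg["maxTrainEpochs"]
implied_images = steps_per_epoch * cfg["trainBatchSize"] // cfg["numRepeats"]

print("steps per epoch:", steps_per_epoch)
print("implied dataset size:", implied_images)
```

Under that assumption the run works out to 180 steps per epoch, just under the 200 floor suggested earlier, from a dataset of about 18 images.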
