This is my second attempt at a character LoRA. This one was trained on top of CyberRealistic PONY v12.7.
Used: OneTrainer, 2x RTX3090
Original pictures: 8
Wan2.2 generated training pictures: 255
CyberRealistic generated generic negative examples: 240
Since only a handful of original pictures were available, I used Wan2.2 to generate short videos (81 frames each) from them with some simple prompts, which gives the initial pictures some variation. The frames were then upscaled and evaluated by a vision LLM to filter out blurry or otherwise bad pictures (this part did not work very well). Then a face recognizer was used to weed out outlier faces that differed too much from the originals. Manual filtering is still needed to remove partially blurred pictures, bad hands, bad teeth, etc. Roughly 1/4 to 1/3 of the Wan-generated pictures could be used.
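As a rough illustration of the face-outlier step, here is a minimal Python sketch. It assumes every picture has already been turned into a face embedding (a list of floats) by some face recognition model. The function names, the 0.6 threshold, and the toy vectors are made up for illustration and are not the actual pipeline used for this LoRA.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_embedding(embeddings):
    """Element-wise mean of a non-empty list of embeddings."""
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]


def filter_face_outliers(reference_embeddings, candidates, min_similarity=0.6):
    """Split candidate pictures into (kept, rejected) lists of names.

    reference_embeddings: embeddings of the original pictures.
    candidates: dict mapping picture name -> embedding.
    min_similarity: hypothetical cutoff; tune it for the embedding model in use.
    """
    centroid = mean_embedding(reference_embeddings)
    kept, rejected = [], []
    for name, embedding in candidates.items():
        if cosine_similarity(centroid, embedding) >= min_similarity:
            kept.append(name)
        else:
            rejected.append(name)
    return kept, rejected


# Toy 3-dimensional vectors; real face embeddings are much longer.
reference = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0]]
candidates = {
    "frame_001.png": [0.95, 0.05, 0.05],  # close to the originals
    "frame_002.png": [0.0, 1.0, 0.2],     # drifted face
}
kept, rejected = filter_face_outliers(reference, candidates)
print("kept:", kept)
print("rejected:", rejected)
```

Comparing against the mean of the original embeddings keeps the check cheap. With only 8 originals, comparing against each original and taking the best match would be an equally reasonable choice.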
The picture descriptions were generated automatically by BLIP2. I think there is room for big improvements here.
Training ran for 50 epochs, with 30 epochs of text training.
I'd appreciate advice, comments, and reviews on the model and the training. Have fun! And if you make some nice shots with this LoRA, post them here as well.
Description
Trigger word: Mitsuki
Base model: CyberRealistic PONY v12.7