Balanced CLIP (1M)
Training CLIP-G took over 15 kWh of energy; CLIP-L took far less, under 1 kWh.
The full negative-reinforcement objective (cosine dissimilarity) is available on my Hugging Face; it was paired with a positive-reinforcement objective (contrastive loss) computed against the full frozen vision model in latent space.
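The pairing described above can be sketched in plain Python. This is a minimal illustration, not the author's actual training code: the function names, the temperature value, and the InfoNCE-style form of the contrastive term are my assumptions; the negative term simply penalizes cosine similarity (i.e. rewards dissimilarity), and the positive term pulls each text embedding toward its matching embedding from a frozen vision encoder.

```python
import math

def cosine_similarity(a, b):
    # Plain cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def cosine_dissimilarity(a, b):
    # Negative reinforcement: 1 - cos(a, b). Maximizing dissimilarity
    # (equivalently, minimizing similarity) pushes the embeddings apart.
    return 1.0 - cosine_similarity(a, b)

def contrastive_loss(text_embs, image_embs, temperature=0.07):
    # Positive reinforcement: InfoNCE-style cross-entropy. Each text
    # embedding is pulled toward its matching frozen-vision image
    # embedding and away from the other images in the batch.
    losses = []
    for i, t in enumerate(text_embs):
        logits = [cosine_similarity(t, v) / temperature for v in image_embs]
        m = max(logits)  # subtract max for numerical stability
        log_norm = math.log(sum(math.exp(l - m) for l in logits))
        losses.append(-((logits[i] - m) - log_norm))
    return sum(losses) / len(losses)
```

With matching text/image pairs the contrastive loss is near zero; with mismatched pairs it grows large, which is what drives the alignment.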
PONY CLIP-L was trained for a further 10 epochs using ASGD (Averaged SGD) to very finely tune the loss.
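For readers unfamiliar with ASGD: it runs ordinary SGD but also keeps a running (Polyak) average of the iterates, and the averaged weights are what you keep, which smooths out late-training noise. Below is a minimal sketch on a 1-D quadratic, not the actual fine-tuning setup; the learning rate, step count, and objective are illustrative assumptions.

```python
def asgd_step(w, grad, avg_w, lr, step):
    # Ordinary SGD update on the "live" weight...
    w = w - lr * grad
    # ...plus a running mean of all iterates seen so far (Polyak averaging).
    avg_w = avg_w + (w - avg_w) / (step + 1)
    return w, avg_w

# Toy objective: f(w) = (w - 3)^2, so grad f = 2 * (w - 3).
w, avg_w = 0.0, 0.0
for step in range(200):
    grad = 2.0 * (w - 3.0)
    w, avg_w = asgd_step(w, grad, avg_w, lr=0.05, step=step)
# Both the live weight and the average approach the minimum at w = 3;
# the average lags slightly but varies much less step to step.
```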
Comments (4)
Hey just discovered this and saw you have some pretty rad projects on HF. Nice. Could you lay out the pros and cons of these for particular types of use cases?
Thanks! If you're referring to the latest update of the CLIP here, it was trained on a smaller dataset (around 10 GB), but the goal was high accuracy and balance without training out the Pony triggers.
The Pony Clip-G is really good, but I'm having mixed results with the Pony Clip-L. V2 also seems to be pretty screwed up; V1 works much better. However, sometimes I am definitely still preferring your older Pony Long Clip-L (distilled), whereas I do prefer Balanced Clip-G over NoMerge and JoyCLIP Clip-G!
Overall great work <3
Thanks for the input. V2 might be more closely aligned with SDXL; I might rename it.

