Overview
I am an artist with complete Aphantasia. That means I do not have a mind's eye. I used to think the phrase "picture it in your mind" was a figure of speech. When I discovered the vivid world inside the minds of others, I was heartbroken. Around the same time, a friend introduced me to Stable Diffusion. Here was something that could turn my mental narratives back into images. It felt like a gift.
I set out to really explore the possibilities of generative AI as a concept visualization assistive technology. I vowed to create and share resources to enable this for others. That's how this checkpoint began. For this purpose, I believe a successful checkpoint should be:
General purpose
Efficient
Accessible
I set about finding out how to define and accomplish that. I found Civitai. By looking around, I arrived at definitions for those terms, which became goals for this checkpoint.
A general-purpose checkpoint can generate images in many different styles, is based on a model that is popular and shows promise of remaining so, and works well with LoRAs.
An efficient model generates usable images often and in few steps. After looking at LCM and Turbo models, I made four steps my target.
An accessible model is readily available at a popular destination and easy to prompt for.
I then began experimenting with block merging. I decided on the SDXL family, with the Turbo version being particularly attractive for its speed. Then I merged in the SDXL DPO U-Net to increase output quality. I was just about satisfied with the result, 1024x1024 images at four steps, when SDXL Lightning was introduced. I could not ignore it because it aligned so closely with my goals, so I postponed uploading the prior version in order to incorporate any gains SDXL Lightning could provide. That formed the basis for the first version.
Version 1
Based on SDXL Lightning, version 1 performs well at the target size of 1024x1024 and in four steps with DPM++ SDE Karras at CFG Scales 1-2.5.
Version 2
Based on SDXL Hyper and finetuned with a dataset of over a thousand curated images, version 2 performs well at the target size of 1024x1024.
In an efficiency improvement over version 1, great results can be achieved in three to four steps with DPM++ SDE Karras at CFG Scales 1 through 2.5.
Additionally, amazing images can be produced in five to six steps with Euler Ancestral Simple at CFG Scales 1 through 2.5.
Version 2 + PCM
This version incorporates the best parts of version 2 and a Phased Consistency Model (PCM). Generated images are similar to those from version 2, but perhaps a bit more vivid. At the same settings as version 2, it also produced more visually pleasing results with AnimateDiff SDXL.
Like version 2, great results can be achieved in three to four steps with DPM++ SDE Karras at CFG Scales 1 through 2.5.
Additionally, amazing images can be produced in five to six steps with Euler Ancestral Simple at CFG Scales 1 through 2.5.
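The recommended settings listed for each version above can be collected into a small lookup table. The following is only a minimal sketch in plain Python: the names `Preset`, `PRESETS`, and `matching_presets`, and the version keys `"v1"`, `"v2"`, and `"v2+pcm"`, are hypothetical and do not belong to any particular UI or library. The numbers are taken directly from the version notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    """One recommended sampler configuration (hypothetical helper type)."""
    sampler: str
    min_steps: int
    max_steps: int
    min_cfg: float = 1.0   # CFG scale range quoted in the version notes
    max_cfg: float = 2.5
    size: tuple = (1024, 1024)  # target image size

    def allows(self, steps: int, cfg: float) -> bool:
        """Return True if steps and CFG fall inside the recommended range."""
        return (self.min_steps <= steps <= self.max_steps
                and self.min_cfg <= cfg <= self.max_cfg)


# Version 2 and Version 2 + PCM list the same sampler settings.
_V2_PRESETS = (
    Preset("DPM++ SDE Karras", 3, 4),
    Preset("Euler Ancestral Simple", 5, 6),
)

# Keys are illustrative labels, not official version identifiers.
PRESETS = {
    "v1": (Preset("DPM++ SDE Karras", 4, 4),),
    "v2": _V2_PRESETS,
    "v2+pcm": _V2_PRESETS,
}


def matching_presets(version: str, steps: int, cfg: float) -> list:
    """List the presets for a version that cover the given steps and CFG."""
    return [p for p in PRESETS[version] if p.allows(steps, cfg)]


if __name__ == "__main__":
    for preset in matching_presets("v2", steps=5, cfg=1.5):
        print(preset.sampler, preset.size)
```

A lookup like this only checks whether a chosen step count and CFG scale fall inside the ranges described above. Settings outside those ranges may still work, but they are not covered by the version notes.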
Description
This version incorporates a Phased Consistency Model (PCM) for an efficiency increase over version 2.0. In testing, it provided a very slight improvement in the aesthetics of generated images, but it seemed to consistently outperform version 2.0 in AnimateDiff SDXL video output quality at the same settings while maintaining RePhantasia 2.0's signature look.
Comments (6)
I have been using it for two months and love it more every day. I have tried several models since and keep coming back to this one. It is by far the best tool I have.
Thank you so much for sharing that! I am very proud of it and will continue to work on it as long as it inspires and enables artists to explore creative concepts and visualize their ideas. Your support means a lot.
Insanely Fast and interesting results. Surprisingly consistent, but manageable for tweaking to the desired result. This will likely be a favorite model for me.
Well done.
Thank you for the kind words!
It is hard to believe that anything could top the last RePhantasia, but this does and more!
It is now my go-to model and will remain so until something better comes along (SD 3.0?).
Thanks for all your work!
Aphantasia? So it has a name. I've always been unable to visualize things, although not entirely, and it comes and goes. I'm rather colorblind, too, yet I enjoy art, though it is frequently frustrating. My encounters with doctors have been entirely negative, which would be a long rant in itself, but for now, the tl;dr of this is 500 buzz for moral support, and an additional 500 buzz for immoral support!
Cheers!


