SDXL 1.0 (Distilled/Predictive)
Full FP32 Model
CLIP-G and CLIP-L are both trained and distilled.
Distilled: The CLIP models were distilled from larger text/token projections in a teacher/student setup, where the goal is for the smaller model to learn the latent shape of the larger one.
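As a rough illustration of the teacher/student idea, here is a toy sketch in which a "student" embedding is nudged toward a "teacher" embedding by minimizing mean squared error. The function names, the use of MSE, and the idea of treating the student embedding directly as the trainable parameters are all assumptions for illustration; the page does not say which objective or training setup was actually used.

```python
def mse(student, teacher):
    """Mean squared error between two equal-length embedding vectors."""
    if len(student) != len(teacher):
        raise ValueError("embeddings must have the same length")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def sgd_step(student, teacher, lr=0.1):
    """One gradient-descent step on the MSE with respect to the student values.

    d/ds_i MSE = 2 * (s_i - t_i) / n
    """
    n = len(student)
    return [s - lr * 2.0 * (s - t) / n for s, t in zip(student, teacher)]


# Hypothetical vectors, not real CLIP outputs.
teacher = [1.0, 2.0, 3.0, 4.0]
student = [0.0, 0.0, 0.0, 0.0]

before = mse(student, teacher)
student = sgd_step(student, teacher)
after = mse(student, teacher)
```

In a real distillation run the gradient would flow into the student model's weights rather than into the embedding itself, but the loss is measured against the teacher's output in the same way.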
Predictive: The CLIP model received separate training in which the padding token was used as a mask, allowing the CLIP model to predict some additional context when the 75-token limit is not fully used.
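To make the "unused slots" idea concrete, the sketch below pads a short prompt to CLIP's 77-position context (a start token, up to 75 prompt tokens, and an end token) and marks which positions are padding. The function name and the token ids in the example are placeholders, and how the author actually used the padding positions during training is not described on this page; this only shows where those positions are.

```python
CONTEXT_LENGTH = 77      # start token + 75 prompt tokens + end token
MAX_PROMPT_TOKENS = 75


def pad_with_mask(prompt_ids, bos_id, eos_id, pad_id):
    """Return (padded_ids, pad_mask); pad_mask is 1 where a slot is padding."""
    ids = list(prompt_ids)[:MAX_PROMPT_TOKENS]   # truncate overly long prompts
    sequence = [bos_id] + ids + [eos_id]
    n_pad = CONTEXT_LENGTH - len(sequence)
    padded = sequence + [pad_id] * n_pad
    pad_mask = [0] * len(sequence) + [1] * n_pad
    return padded, pad_mask


# Placeholder ids for illustration only (not real tokenizer output).
padded, pad_mask = pad_with_mask([10, 11, 12], bos_id=1, eos_id=2, pad_id=0)
unused_slots = sum(pad_mask)
```

A three-token prompt leaves most of the context as padding, which is the region the description says the model was trained to make use of.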
Comments (24)
but can it make spiders?
Likely with 7-9 legs
This one turned out well for an HD image, with a little bit of jumping spider mixed in: https://civitai.com/posts/21891370
@Felldude yeah, legs, fingers, toes: always going to be an SDXL issue, sadly.
can you send the half size? thanks
I will likely do a 7GB version. Unless you're using the full FP32 command, the model is still only loading 5GB on the GPU, with 1.3GB offloaded to the CPU for CLIP.
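For a sense of where a "7GB" half-size file comes from, here is a back-of-the-envelope size estimate. The parameter counts are approximate, commonly cited figures for SDXL and are assumptions here, not numbers taken from this page or measured from this checkpoint.

```python
# Approximate SDXL parameter counts (assumed, rounded).
PARAMS = {
    "unet": 2.6e9,
    "text_encoders": 0.817e9,   # CLIP-L and CLIP-G text encoders combined
    "vae": 0.084e9,
}

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def checkpoint_size_gb(precision):
    """Estimated file size in decimal gigabytes, ignoring file metadata."""
    total_params = sum(PARAMS.values())
    return total_params * BYTES_PER_PARAM[precision] / 1e9


fp32_gb = checkpoint_size_gb("fp32")
fp16_gb = checkpoint_size_gb("fp16")
```

Under these assumptions the FP16 file comes out at roughly half the FP32 size, in line with the 7GB figure mentioned above.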
i just want a smaller download if i'm trying it out
@Felldude Why doesn't Forge utilize more VRAM instead of offloading the CLIPs? It's exactly as you wrote: 4897 MB of VRAM for the model, and the rest is offloaded. I have 8GB of VRAM.
@macoyedgardo558 Actually, Forge does use the GPU by default for CLIP, and that is one of the biggest issues I have with it. Putting the CLIP in FP32 should default the CLIP to CPU only, doubly so for models like Flux, which would not fit even in a 24GB GPU if using an FP32 T5.
@Felldude You mean it always unloads and then reloads everything?
@macoyedgardo558 The model is loaded into RAM either way, and the CPU is responsible for loading, but the GPU does not need to handle the TE/CLIP. With small CLIPs like G and L, you're talking fractions of a second saved by processing on the GPU, and for a large TE like T5, while you might save a few seconds, the time spent loading and unloading the GPU is wasted. The exception might be if you could fit the T5 and the FLUX diffusion model all in VRAM without offloading, which requires a 24GB card and everything in NF4.
@Felldude Thanks. So for bigger models, Comfy is a must.
@macoyedgardo558 I need to look at the scripts for Auto and Forge again. They used to set the CLIP to CPU only if the FP32 command was passed, but I don't know if they still do. I once generated 2000 images in Flux and timed the difference between full-CPU CLIP/TE and offloading every time, and it would have been hours.
Is the prompt logic of this model different from other SDXL variants? Or does it require a special plugin or workflow?
https://civitai.com/images/97295458 For example, I wanted to reproduce this image, but the generated result was very abstract and blurry.
It would be SDXL with some enhancement to natural language
thanks for the great model, and for the buzz. it's my first pass playing with the model, and I got some gorgeous images out.
👍
Are you using comfyui? If so, are you using the basic sdxl workflow? The custom workflow I used before had very blurry output. Is it because I added too many negative prompt words?
@1q2w3e4rQAZ I'm using SDW-Forge, sorry
I want to see a single challenging picture that shows the model's benefit, but I still can't find one. Please post a comparison picture that shows the advantage, so we can determine what this actually is before downloading!
I do have an article, and I can always find a seed in a seed-to-seed comparison where a given CLIP "wins". Based on cosine similarity, the training was successful; of course, that means nothing if it didn't translate visually.
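For readers unfamiliar with the metric mentioned above, cosine similarity measures how closely two embedding vectors point in the same direction, regardless of their length. This is a minimal sketch with made-up vectors; the function name is illustrative, and how the author actually computed the comparison (which layers, which prompts) is not stated here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, not real CLIP outputs.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

A score near 1 means the two embeddings agree in direction, which says the text encoders produce similar representations but, as noted above, does not by itself guarantee a visible difference in the generated images.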