Progress update 2025-05-15 again: I think it's good enough. It still sometimes gives dicks when you didn't ask for them, but not as often. The models are finalized and I'll post them tomorrow morning. After this, I'm going to be drastically expanding the dataset.
Progress update 2025-05-15: It was looking pretty good so I ran some tests and it's putting dicks on everyone, so I'm trying to tone that down a bit.
Progress update 2025-05-11: Half the dicks actually look like dicks now, and I'm getting fewer eldritch abominations. Probably only a few days more in the oven before the next release.
Progress update 2025-05-09: Results are improving, but it clearly thinks that dicks are large, specialized thumbs. Still getting occasional lovecraftian abominations as well, particularly when rendering dicks on ladies. On the upside, it doesn't seem to have forgotten how to make non-lewd images in different styles, and it seems to have shed the irritating habit of putting buttons, weird little bumps, etc, on top of clothing where nipples should be.
Progress update 2025-05-08: I'm getting somewhat dick-like body horror now. It'll take some time.
I will not be locking this model behind an early access fee, but buzz donations are greatly appreciated. Or, just post your generations to this page!
UPDATE: v0.2 is a set of FP8 models that can reliably render nude women and shirtless men. I've figured out some better training techniques for HiDream, and I've set it up to preserve as much of HiDream's incredible style variety as possible. There are fewer nudes in the example images this time around to demonstrate that this deserves to be your base model. :)
It can do:
Photos
Many art styles
Boobies
Vaginas
Pubic hair, sometimes.
All of the other stuff base HiDream can do.
It can't (yet) do:
Dicks
Sex acts
A number of different poses
Please note that I used comfyui-dynamic-prompts for prompt generation, so the crazy long prompts you're seeing here on civit aren't actually the prompts being sent to the encoder.
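To illustrate why the displayed prompts are longer than what the encoder sees: a dynamic-prompt template carries every alternative, and only one resolved combination is sent on. The sketch below is a minimal stand-in for the `{a|b|c}` variant syntax only, not the actual comfyui-dynamic-prompts implementation (which supports much more, such as wildcards and weights); the function name and the example template are made up for illustration.

```python
import random
import re

# Matches an innermost {a|b|c} group (no nested braces inside it).
VARIANT = re.compile(r"\{([^{}]*)\}")

def resolve_variants(template: str, rng: random.Random) -> str:
    """Replace each {a|b|c} group with one randomly chosen option."""
    while True:
        match = VARIANT.search(template)
        if match is None:
            return template
        choice = rng.choice(match.group(1).split("|"))
        template = template[:match.start()] + choice + template[match.end():]

# Hypothetical template; the resolved prompt is what an encoder would receive.
template = "a {watercolor|oil|ink} painting of a {cat|dog}, {soft|harsh} lighting"
prompt = resolve_variants(template, random.Random(0))
print(prompt)
```

The resolved string contains exactly one option per group, so it is much shorter than the template shown on the page.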
To really go all-in on art styles the way I did, you need my comfyui custom nodes. I also very strongly recommend merging my pull request (see READ THIS FOR BEST RESULTS below).
Original text:
v0.1 is a (partially) uncensored set of FP8 models (full, dev, and fast) that can reliably render topless characters (men and women, although HiDream renders shirtless men just fine on its own). This version doesn't yet render full nudity very well, but I'm working on that.
I'm working on expanding this in order of increasing difficulty. Next up are lady parts, then man parts (penises are hard. to train.), then possibly sex acts. v0.2 is still training, but the results are very promising.
READ THIS FOR BEST RESULTS:
CLIP and t5 don't do HiDream any favors (I have a strong suspicion that t5 actively sabotages NSFW gens), particularly when rendering NSFW images. Update your ComfyUI to the latest master and load ONLY llama as your clip encoder (set the pulldown to hidream). If you're technically inclined, merge my pull request, which replaces the tensors full of zeros it sends as CLIP and t5 encodings with cached encoded empty prompts, as the zeroed tensors make HiDream a bit janky and unstable.
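The caching idea can be sketched roughly like this: encode the empty prompt once, store the result, and reuse the stored encoding afterwards so the encoder never has to be loaded again. This is not the code from the pull request. The encoder here is a made-up stand-in returning made-up numbers, and a real version would store tensors rather than a JSON list.

```python
import json
import tempfile
from pathlib import Path

def load_or_build_empty_encoding(cache_file: Path, encode_fn) -> list:
    """Return the cached encoding of the empty prompt, encoding it only once."""
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    encoding = encode_fn("")  # the only time the encoder itself is needed
    cache_file.write_text(json.dumps(encoding))
    return encoding

calls = []

def fake_t5_encode(prompt: str) -> list:
    """Hypothetical stand-in for a real text encoder."""
    calls.append(prompt)
    return [0.25, -0.5, 0.125]  # made-up values, NOT real T5 output

with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "t5_empty.json"  # hypothetical cache location
    first = load_or_build_empty_encoding(cache, fake_t5_encode)
    second = load_or_build_empty_encoding(cache, fake_t5_encode)

print(first == second, len(calls))
```

The second call is served from the cache file, so the stand-in encoder runs only once, and the values handed to the model are a real (non-zero) empty-prompt encoding rather than zeros.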
Comments (87)
This is amazing work! Can I please ask if you are planning to back the models up to Hugging Face or Tensor.Art? Thank you greatly for your work.
I'll push them out to HF shortly.
+1 for tensor art (slowly moving over to there and loaded civitai out of muscle memory, glad I did as this looks good)
Uploading to HF now. It'll be done in 20m as of this comment, so it'll probably be there when you see this link:
Using the ClipTextEncodeHiDream node on current ComfyUI nightly with only the Llama textbox populated has the same effect as you are suggesting with using precomputed empty prompts from the other encoders while just loading and using the Llama encoder, right?
It looks that way. I wasn't aware that node existed.
(The difference I can see is that my patch has the empty prompts cached so that you don't need to load the other encoders at all, which could decrease RAM usage by a few gigabytes and prevent time spent swapping things in and out of VRAM)
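For anyone who prefers to see it written out, this is roughly what "only the Llama textbox populated" looks like as a node entry in a ComfyUI API-format workflow. Treat the input names (`clip_l`, `clip_g`, `t5xxl`, `llama`) as my understanding of that node rather than a guarantee, and check them against your ComfyUI build. The helper function, the node id "1", and the example prompt are made up for illustration.

```python
def hidream_llama_only_inputs(prompt: str, clip_ref: list) -> dict:
    """Build inputs for the HiDream text-encode node with only Llama populated."""
    return {
        "clip": clip_ref,  # link to the loader node's output: [node_id, output_index]
        "clip_l": "",      # left blank on purpose
        "clip_g": "",      # left blank on purpose
        "t5xxl": "",       # left blank on purpose
        "llama": prompt,   # the real prompt goes here only
    }

# Assumes the loader is node "1" in the workflow (hypothetical id).
node = {
    "class_type": "CLIPTextEncodeHiDream",
    "inputs": hidream_llama_only_inputs("a watercolor fox in the snow", ["1", 0]),
}
print(node)
```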
when i come across new nodes like the one you mentioned, how do i go about learning how to use it? im new to comfyui and am still learning the different nodes everyday haha
@MistahSwick I usually just go to the node's github page and read it, most nodes also have workflows you can download to see how it works.
If you're using ComfyUI Manager you can just click the name of the node and it'll bring you there.
Edit: didn't realize the node mentioned was a node included with ComfyUI, as i just recently updated. I was referring to custom nodes. Not sure about core nodes, i guess google it?
@Genie123 i figured it out but thank you!
Could you provide a ComfyUI workflow link?
Or several if you want.
I think it would be useful to know how you utilise it to get the most out of it.
Just save any of the example images and drag them into comfy.
@_Envy_ It says "unable to find workflow in image"
@azeli Make sure you're saving them from the image post. I'm guessing that the scaled down ones on the model page don't have it.
If that still doesn't work, let me know and I'll put the workflow up on pastebin.
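If you want to check whether a saved image still carries a workflow before dragging it in, something like the sketch below can list the text chunks in a PNG. As far as I know ComfyUI stores the graph in PNG text metadata under keys such as "workflow" and "prompt", but treat that as an assumption. This only reads uncompressed `tEXt` chunks, and the demo bytes are a fabricated stub, not a displayable image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_text_keys(data: bytes) -> list:
    """List the keywords of all tEXt chunks in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    keys, pos = [], len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # A tEXt body is: keyword, a NUL byte, then the text.
            keys.append(body.split(b"\x00", 1)[0].decode("latin-1"))
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC
    return keys

def _chunk(ctype: bytes, body: bytes) -> bytes:
    """Build one PNG chunk (used only to fabricate the demo bytes)."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)

# Fabricated stubs: header chunk, optional text chunk, end marker.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
with_workflow = (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
                 + _chunk(b"tEXt", b"workflow\x00{}") + _chunk(b"IEND", b""))
without_workflow = PNG_SIGNATURE + _chunk(b"IHDR", ihdr) + _chunk(b"IEND", b"")

print("workflow" in png_text_keys(with_workflow))
print("workflow" in png_text_keys(without_workflow))
```

On a real download you would pass the file's bytes instead, e.g. `png_text_keys(open("example.png", "rb").read())` with your own file name. An empty list would be consistent with the "unable to find workflow in image" message.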
what about the NSFW t5?
I'm surprised that didn't gain more traction.
I think it has a chicken and egg problem with generation systems not supporting it and people not training with it; needs someone committed enough to it to both train something that will get attention and build support for one of the big frontends to use it without breaking support for existing T5.
@StrugglingSorceror Even with SFW generations, with all my experiments, the conclusion I came to is that prompt comprehension alone was the same or better with just Llama. Llama is just a smarter model, even without aggressive censorship dragging t5 down.
@_Envy_ and llama is an open source LLM, so I guess it's a lot easier to train?
@_Envy_ so instead of the quad loader, you just use a single clip loader and load just the llama?
I've been trying with just the llama, but results are kinda faded and noisy.
@ForeverNecessary737716 I recommend using the scheduler I'm using in my workflow (drag any full sized example image into comfy) if you're using fast or dev.
Do you have a workflow for trying your model? I'm getting weird results.
Download one of the images (the expanded one from the image post, not the scaled down one from the model page) and drag that into comfy. If that doesn't work for you, let me know and I'll drop a workflow on pastebin.
@_Envy_ Thanks! It works :) Also, about your fork of ostris ai tool kit, is it worth using it to train flux? Have a great day
@_Envy_ Ah, do I have to activate the deactivated nodes (Con Delta) or is it not needed? Thanks
did anyone have luck importing it into draw-things? (-v 24.04.2025 w/ hidream support)
I've never used draw-things. Are you getting an error when you try to import it, or is it acting weird when you generate images?
@_Envy_ it's not too bad, worth a try, especially if you have an ARM Macintosh near you!
No, I can't import it. I thought maybe it doesn't recognise it as HiDream, but I was mistaken; they always first publish a version just working with official and community models and usually 1-2 updates after with the possibility to import other base-models of the new "kind".
Question about a couple of nodes you use in your Dev workflows: I decided to try the SamplerLCMDuoFusion and BetaSamplingScheduler nodes that I saw in your workflows. For my initial scheduler, these work great! Thank you. In my workflow, though, I use a second scheduler after a 2x upscale to run the latent through a few more steps to get some finer detail. Normally, I just apply a 0.30 denoise to the latent for that second sampler, but I have no idea how I would do that with these two new nodes. Is it even possible to use these nodes in an Image-to-Image situation like I'm describing? If so, how would I do that? Thank you
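One general way to get a partial denoise out of a custom sigma schedule is to keep only its low-noise tail and feed that to the second sampler. The sketch below shows the idea with a stand-in linear schedule. Whether the two nodes mentioned expose what is needed for this is not something I can confirm, and the function name is made up. I believe ComfyUI's custom-sampler nodes include a sigma-splitting node (SplitSigmasDenoise) that works along these lines, but check your install.

```python
def tail_for_denoise(sigmas: list, denoise: float) -> list:
    """Keep only the low-noise end of a schedule for a partial-denoise pass."""
    steps = max(len(sigmas) - 1, 0)
    keep = round(steps * denoise)
    if keep <= 0:
        return []
    return sigmas[-(keep + 1):]

# Stand-in schedule: 10 steps from sigma 1.0 down to 0.0 (11 values).
full = [round(1.0 - i / 10, 2) for i in range(11)]
second_pass = tail_for_denoise(full, 0.30)
print(second_pass)  # [0.3, 0.2, 0.1, 0.0]
```

With a 10-step schedule and 0.30 denoise this leaves 3 steps, so to run more steps in the second pass you would build a longer schedule first and then take its tail.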
Hi there, this is nice, Are you using the SamplerCustomAdvanced ? Getting " cannot reshape tensor of 0 elements into shape [0, -1] because the unspecified dimension size -1 can be any value and is ambiguous "
question, there's another uncensored version that recommends using an uncensored Llama. https://huggingface.co/bartowski/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-GGUF/tree/main
i tried it but can't get gguf to work in your custom clip loader
Wish there was fp16 version of this uncensor too. I really do.
It's probably not going to be significantly better than FP8, but I can post it on huggingface for you if you want. Is your goal inference or training?
@_Envy_ Inference. And it would be nice if you manage to post the fp16 model on huggingface. People would appreciate it too. Thanks. And maybe it's too much to ask, but if you also manage to post an fp16 Fast variant that would be amazing, because I sometimes switch to the fast model since it generates 10 times faster. Ty for your efforts though.
@Sandoogs Okay, I'll do it soon, most likely tomorrow.
@_Envy_ Oh that would be amazing, thank you in advance.
No FP16 on HF. :( Did I miss it, or did you not upload it? I need it to build my own Q8 variant because Float8 is not supported on Apple Silicon.
I'm a newbie to this stuff but I managed to hack my way into getting some decent results. Your instructions were way over my head, as was your workflow. I basically just did a normal workflow using your model and left out all but the llama, using the new Clip Text encode. Seems to work just fine?
Down the road I hope you can get the finished version more new user friendly. But I'm ok with my results so far. Looking forward to what you have developed this into in the future.
Thank you for this!
It can do: Vaginas
It can't (yet) do: Dicks
Let me guess: gay finetune in the future.
Sooner rather than later. I'm currently in the process of collecting training data. Although the plan isn't to specifically make a gay finetune. I'm going to try to cover all the bases.
Pro Tip; "Normal Human Reproduction tends to require both a ding and a vagoon!"
Is this a new thing? Is it better than pony?
If you mean in terms of quality, yes, by a long shot. Pony is based on old Stable Diffusion models, and there is only so far you can go with fine-tuning on that. HiDream is an independent pipeline; it's the frontier open-weight model right now, but it's more computationally expensive to run. In terms of censorship, the base HiDream model is quite censored, but that's what this finetune is for. Although it's a WIP, I reckon it won't take too long before it replaces FLUX.1.dev for this kind of content (at least when you want maximum quality and prompt adherence). It's also under a permissive license, unlike FLUX.1.dev, so you can do what you like (assuming the rest of your workflow doesn't include restrictively licensed models).
Depends on what you mean by "better". Better SFW prompt adherence, definitely. Pony still knows a lot more lewd stuff at the moment, but this definitely has the potential to get better than Pony eventually.
@_Envy_ Does it work on Fooocus? I'm kind of a newbie. If so I will try it.
So for a noob using SwarmUI, do I put this in the stable-diffusion folder, the UNET folder, or... where? Thanks!
Interesting model, would like to test it out, but it's still a FAT PIG @ 15GB model + 1.5GB clip + 5GB Llama :(
I know I can use smaller clip + Llama models, but still, 15GB for the HiDream model is too big for my potato pc; it slows generation way down, to 15-20min/image :((
Can you make an imatrix Q2 GGUF version, pretty please with sugar on top?!?!?!?!
And split files (model, clips, VAE, etc.) for easier download.
Would be greatly appreciated!
Text encoders other than Llama don't seem to have much significant effect, but using Llama alone appears to have some issues.
When combining multiple costumes and people with different designs in a single prompt, using Llama alone appears to cause a phenomenon where characteristics mix between elements, similar to what happened before SDXL.
This mixing phenomenon becomes particularly noticeable when depicting people who share similar or common features (for example, when creating images of two people of the same gender), where the characteristics of the first person tend to be carried over to the second person.
Well done. This is really good work. The one thing I absolutely cannot seem to get in my images is pubic hair. Every woman is cleanly shaven—even when I prompt for "thick, dark bush of pubic hair". I understand some guys love that look, but I prefer my fake women to at least appear to be 21-years-old. 😁 Please tell me that you're adding some nice bush images to your data stack for the next run. 😂
Even the default HiDream Full can create good-looking, realistic pubic hair with some prompts.
Patiently waiting for some good dick...
Nice work too!
Hi, can you add GGUF versions like Q4_S and Q3_S so that low vram people can run it ? Take your time, do refinements as necessary for your model.
Newbie here... could you explain a bit better how to use the cached t5 encodings? Do you have files one can download? If yes, where can I find them? Can that also be done for, for instance, sd3.5? I did not understand if you are caching the results of the clip process or if you are changing the t5 file... If you are just using cached clip encodings, then can we avoid loading the t5 clip encoder into memory? Thanks in advance.
Is there an existing img2img workflow that would work well with this?
Testing one right now... Stay tuned.
I'm getting "QuadrupleClipLoaderGGUF
'NoneType' object has no attribute 'endswith'"
Any clue on how to solve it?
Don't use the Quadruple Clip node. Just use Comfy's standard Load Clip node. Make sure you are on Comfy 0.3.30 (the newest as I write this) and in the type pulldown hidream will be an option. Then the only clip model you want to load is the llama one.
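In API-format terms, that setup comes down to a single loader node along the lines of the sketch below. The file name is an assumption (use whatever your Llama text encoder file is actually called), and the node id "1" is arbitrary.

```python
import json

# Assumed file name; substitute the Llama text encoder file in your models folder.
LLAMA_FILE = "llama_3.1_8b_instruct_fp8_scaled.safetensors"

# Standard Load CLIP node with the type pulldown set to hidream.
load_llama_only = {
    "class_type": "CLIPLoader",
    "inputs": {"clip_name": LLAMA_FILE, "type": "hidream"},
}

workflow_fragment = {"1": load_llama_only}
print(json.dumps(workflow_fragment, indent=2))
```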
I have serious crash with hidream🫤
Are you able to run this like any other checkpoint on SD Forge? Always see people using Comfy but not sure if it's required
Same question - I really prefer it to Comfy!
Comfy is highly recommended for more speed, CPU mode, and a lot of flexible features that are easy to add or change.
@Sandoogs yeah but it's like the argument for using a linux distro instead of mac or windows. People don't do it because they're not comfortable on it despite it having insane flexibility. I don't like comfy UI and have been using stable for 3 years and if it's not broke, don't fix it
@turkey910 Most people do it, so it comes down to personal preference. People who like to just click one button without adjusting anything can use stable or whatever; it's an option too, just a very limited one.
@Sandoogs I'm not arguing which is better. I asked if I can run this on forge. I don't need a powerpoint on why comfy is better
@turkey910 The thing is you're just arguing and I just recommended you a nice solution. Nothing more, nothing less.
You should be able to drop this in wherever you can run the vanilla fp8 safetensors files. I haven't used Forge in ages, so I don't know if it's possible. If Forge hasn't been updated to run HiDream (which would have been in the last few weeks), it's very unlikely to work.
@_Envy_ ah ok cool thanks for response
You should provide where you got your LLama cause I am lost
use the standard HiDream LLama text encoder file.
I can't find the LLAMA clip, I can't get this to work...
https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/tree/main/split_files/text_encoders u can download it here
Please add Q2 GGUF versions of these models!!!
ty so much!!!
How are you doing full finetuning? I thought only LoRa training was available right now (I use AI-toolkit for now).
With a very large dora that I'm merging in periodically. I've added a tiny bit of custom code so I can use the flowmatch scheduler with min snr gamma, and it's working pretty well for me.
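For readers unfamiliar with min-SNR gamma, here is a rough sketch of the weighting idea. The custom code mentioned above isn't shown, so this is only one plausible reading: it uses the standard min(SNR, gamma) / SNR weight defined for noise-prediction diffusion, and assumes a linear flow-matching interpolation x_t = (1 - t) * x0 + t * noise to define the SNR. The gamma value of 5 is a commonly used default, not something stated here.

```python
def flow_snr(t: float) -> float:
    """SNR of x_t = (1 - t) * x0 + t * noise, for 0 < t < 1."""
    return ((1.0 - t) / t) ** 2

def min_snr_weight(t: float, gamma: float = 5.0) -> float:
    """Min-SNR-gamma loss weight: min(SNR, gamma) / SNR."""
    snr = flow_snr(t)
    return min(snr, gamma) / snr

# Low-noise timesteps (small t, high SNR) get down-weighted; the rest keep weight 1.
for t in (0.1, 0.5, 0.9):
    print(f"t={t:.1f}  snr={flow_snr(t):8.3f}  weight={min_snr_weight(t):.3f}")
```

The effect is to cap how much the nearly clean timesteps contribute to the loss, which is the usual motivation for this weighting.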
@_Envy_ Oh lol you're the guy from Reddit I interacted with haha.
Thanks for the update! Patiently waiting for more
I'm thinking I'll have another alpha quality release by tonight.
@_Envy_ any chance you can release the text file you posted and where to paste it into for the 2 clips? I've been using the normal tensors with a modified t5. I don't know how to edit the files you listed.
You don't need a text file. If you're encoding prompts with all three clips (I don't think Comfy has accepted my changes so everybody else is just going to have to be stuck with the extra overhead of loading t5), just encode blank prompts for clip and t5, and put the real prompt into llama only.
So, I was wrong. It's better than it was this morning, but it's not quite there. Maybe half the dicks look right, and a disturbing number of the remainder have obvious thumbnails on the end of them. I want to aim for at least 75% success, so I'm going to let it run overnight again and check tomorrow. Hopefully it'll be ready by then, knock on wood.
@_Envy_ So I put the files you have listed in the said folders and leave the prompts blank, correct? I saw you have two different names on them, so I went with the ones listed here.
@aceflier72811 Yeah, put the files in the folders and leave those prompts blank. You'll still need all the files.
Does this work in Forge UI?
I don't know. This should work wherever HiDream works.



