Collected from
https://huggingface.co/Comfy-Org/Ideogram-4
Comments (73)
Do you use LLMs to convert natural language to JSON prompt or what's your method here?
Please share your workflow. I am lost.
Yes, just give the LLM (I use ChatGPT) the template text from the default workflow along with your prompt, and ask it to fill the template in, keeping that format.
in the github repo
https://github.com/ideogram-oss/ideogram4/tree/main/docs
there is a prompting.md file with all the info on how to write the JSON prompt. Feed it to ChatGPT/Gemini/Copilot etc. and tell it to create a prompt following those directions.
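A minimal sketch of the approach described above, assuming you have already copied the contents of prompting.md (or the template text from the default workflow) into a string. The function name `build_llm_request` and the instruction wording are illustrative only and are not part of Ideogram 4 or ComfyUI; the actual JSON fields come from the guide, not from this code.

```python
def build_llm_request(guide_text: str, description: str) -> str:
    """Combine the prompting guide and a plain-language description into one LLM message."""
    if not guide_text.strip():
        raise ValueError("guide_text is empty; paste prompting.md or the workflow template")
    return (
        "Below is the guide for writing Ideogram 4 JSON prompts.\n"
        "--- GUIDE START ---\n"
        f"{guide_text.strip()}\n"
        "--- GUIDE END ---\n"
        "Following that guide exactly, write a JSON prompt for this image. "
        "Reply with the JSON only.\n"
        f"Image description: {description.strip()}"
    )


if __name__ == "__main__":
    # Placeholder guide text (hypothetical); in practice paste the real file contents.
    print(build_llm_request("(contents of prompting.md)", "a movie poster for a heist film"))
```

Paste the printed message into whichever chat model you use, then copy the JSON it returns into the prompt field of your workflow.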
There's "Ideogram 4 Prompt Builder KJ" node for ComfyUI to make it easier.
As to workflows, they are usually embedded in png images.
I just instructed one of my agents in Grok on how to create Ideogram 4 JSONs. Pretty neat, pretty simple. It can convert anything I throw at it, so I can render locally with Ideogram 4.
Download this PNG by Silmas and you'll find the workflow: https://civitai.red/images/133048377
Hi! Does it need all of the files from Huggingface like qwen3vl_8b_fp8_scaled.safetensors, qwen3vl_8b_nvfp4.safetensors?
You only need ONE text encoder. fp4 is for Blackwell (50xx) GPU only.
OK, also on Huggingface, Kijai wrote, "It is meant to be used with both (scaled+uncond), but technically does run with only the main model, you will lose quality like that though most of the time. Another thing you can do is run the uncond with the nvfp4 version to save memory, since its quality isn't critical at all."
@unbound_rider1 I hate the fact that we still don't have a unified place to read such important information. It took me 6 hours to collect VERY IMPORTANT info that is also the simplest.
We need a sort of Wikipedia or a forum. It's so exhausting. Catching up with AI is already an extreme sport, and you also have to read tons of Reddit and Discord. FOR F*CK'S SAKE
I know FP8 is distilled, but what's the difference with uncond? And do I need to update Comfy?
fp8 is not distilled, it's just quantized (= lower precision). The uncond model is like a special model just for the negative. Ideogram works without it but output will be better if you use both. And yes, you need to update Comfy, the support for the model is brand new.
I made a separate portable install for Ideogram to avoid messing up my regular install, which hasn't been updated in months.
@Silvicultor To be a bit pedantic, the unconditional one is run for CFG > 1, even if no negative prompt is used.
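Why the unconditional model only matters above CFG 1 can be seen in the standard classifier-free guidance formula. The sketch below applies that general formula to plain Python lists; it is an assumption that Ideogram 4's sampler combines the two predictions in exactly this way, and `cfg_combine` is a hypothetical helper name.

```python
def cfg_combine(cond, uncond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    if len(cond) != len(uncond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


cond = [1.0, 2.0, 3.0]     # toy prediction with the prompt
uncond = [0.5, 0.5, 0.5]   # toy prediction without the prompt

# At scale 1 the unconditional term cancels out, so running it gains nothing.
print(cfg_combine(cond, uncond, 1.0))  # [1.0, 2.0, 3.0]
# Above 1 the result is pushed away from the unconditional prediction.
print(cfg_combine(cond, uncond, 4.0))  # [2.5, 6.5, 10.5]
```

Under that formula the second model is evaluated at every step whenever the scale is above 1, whether or not a negative prompt was typed.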
How do I get a BF16 version for Mac?
There is no bf16 version. The highest precision that was released is fp8.
Change it for an IBM, it will be much easier to do AI-ing ;)
@sleepmentaffairs Yes, it seems the only way to use NVIDIA or AMD GPUs with a Mac is an eGPU in a Thunderbolt enclosure, and one probably has to run Linux since there are no Mac drivers for them.
So probably easier to just rent cloud GPUs.
@NowhereManGo it can be an option even for some IBM users ))))
Not sure if it suits you, but check this: https://huggingface.co/MLXBits/ideogram-4-mlx
Ideogram is amazing, but on my 4090, even at the turbo or default quality setting, it turns my GPU into a literal space heater. It's the middle of summer, so it's not very viable to use, but I will enjoy running it 24/7 this winter. This model is super compute hungry and images take a long time to make. At the quality setting it takes 3-5 minutes to make a single image at 2MP, something Z-Image can do at 2MP in under 1 minute. Only use this model if you require pinpoint precision, like for design work. If you're just screwing around, I'd stay with a fast model like Z-Image Turbo; it's not anywhere near as precise, but it's fine for fun.
Glad to hear that you had a change of heart 👌👍. I agree that it is a specialized model, and has a fairly steep learning curve compared to say ZiT, Flux, or Qwen.
But when you need ideogram4's layout capabilities, other models would feel clunky.
You are doing something different then, at 2MP at standard it is done in around 1-2 minutes.
@Silmas OP said "quality setting", so maybe he is using the highest quality setup which is 48 steps vs 20 steps for standard. So 3-5 minutes sounds about right for the 4090 at 48 vs 1-2 minutes for 20 steps.
@NowhereManGo I was referring to the 48-step quality setting. Here are the results for a 1456x1456 px picture at quality: 48/48 [02:10<00:00, 2.71s/it]
The image I made: https://civitai.com/images/133841220
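The numbers in this exchange can be checked with a quick calculation. This is a rough sketch that assumes the per-step time stays at the 2.71 s/it shown in the progress bar when only the step count changes, and it ignores model loading and VAE decode; the helper names are made up for illustration.

```python
def estimate_seconds(steps: int, seconds_per_step: float) -> float:
    """Sampling time only; model loading and VAE decode are not included."""
    return steps * seconds_per_step


def fmt(seconds: float) -> str:
    """Format a duration as MM:SS, rounded to the nearest second."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


rate = 2.71  # s/it, taken from the progress bar quoted above

print(fmt(estimate_seconds(48, rate)))  # quality, 48 steps  -> 02:10
print(fmt(estimate_seconds(20, rate)))  # standard, 20 steps -> 00:54
```

On that card and resolution, the 48-step setting lands at just over two minutes, so 3-5 minutes on a 4090 suggests a slower per-step rate rather than the step count alone.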
This is utter garbage, utter crap. If you're going to do this lock thing, at least do it correctly! It's causing watermarks on so many normal images; it's ridiculous. May you go to hell and have bad luck every day.
Just use the good workflow and settings, and everything will be fine.
Use kijai nodes, I had zero blocks after I started using them.
Just use my workflow and learn how to create JSON prompts. With JSON I haven't had a single blocked image so far.
The perfect example of "Blame the user, not the tool". Lol.
@kossan nothing wrong with that, since it's the truth.
@kossan Oh, this is very much the user
When you see "Image blocked by safety filter" it's a typo and should read "Error parsing json prompt"
It made the same mistake as Flux2. We live in an era of lightweight speed, and people prefer to use AI models that are suitable for mass-market computers and can run faster. Ideogram 4 needs its own Flux2Klein.
Skill issue, unfortunately. There are many ways to set Ideogram up, with different noise schedules and CFG guides. You can use just the one 9 GB model with LoRAs and get 90% of the accuracy you'd get with both.
https://huggingface.co/HauhauCS/Qwen3VL-8B-Uncensored-HauhauCS-Aggressive/tree/main
Swap with the official Qwen text encoder
Not needed, it's not the text encoder that is censored. The transformer (the diffusion model) is. But it's not a big problem if you use the right prompt format (json prompt + bbox info).
Whilst it won't entirely fix the image censor problem, it has other advantages.
@ShadowCell has it any disadvantages/drawbacks too?
@cluster1500 drawbacks will now draw cum on your back, sorry
Pro grade workflow Here: https://civitai.red/models/2696674/ideogram4-pro-grade-workflow-sfw-nsfw
Fp8: I'm definitely only going to be able to use this model in SwarmUI, using this patch.
https://github.com/ComfyNodePRs/PR-fp8-mps-metal-1661feaa
I'm using a MacBook with Apple Silicon.
Ideogram 4 has a built-in safety filter, trying to generate something with this model is a waste of time.
"Good news, everyone!" New sage attention wheels, now supporting Ideogram 4.0, just dropped!
Is it already debugged for the latest ComfyUI update?
@sleepmentaffairs works fine here at ComfyUI portable v0.24.0.
same with ComfyUI Manual ;)
Thanks! Sage attention for Ideogram 4 is a ~20% speedup for me over the default flash attention.
How do you update sage attention? Do I just start the update ComfyUI batch file? I am using ComfyUI portable, by the way.
~40% acceleration due to sage attention for me. It's amazing, thanks for the information!
Any guide on how to download + apply sage attention to portable?
@cobaltpixiv520 Always back up your Comfy install in case things go wrong, then download the correct file and install it with pip:
C:\SD\ComfyUI>.\python_embeded\python.exe -m pip install i:\inbox\sageattention[...].whl
And add " --use-sage-attention" to your startup .bat file.
Super easy! Or it might explode and break everything in a Python dependency nightmare. That's why it's important to back up before making changes to versions/Python.
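For the startup-file step, here is a small sketch that adds the flag without editing by hand. It assumes the launch line in your .bat file is the one containing "main.py"; the example launcher text is an assumed layout, so check your own file, and `add_launch_flag` is a hypothetical helper, not something shipped with ComfyUI.

```python
def add_launch_flag(bat_text: str, flag: str = "--use-sage-attention") -> str:
    """Append `flag` to every line that launches main.py, unless it is already there."""
    out = []
    for line in bat_text.splitlines():
        if "main.py" in line and flag not in line:
            line = line.rstrip() + " " + flag
        out.append(line)
    return "\n".join(out) + "\n"


# Example launcher contents (assumed layout; your .bat file may differ).
example = (
    ".\\python_embeded\\python.exe -s ComfyUI\\main.py --windows-standalone-build\n"
    "pause\n"
)
print(add_launch_flag(example))
```

Running the function twice leaves the text unchanged, so it is safe to re-apply after an update rewrites the launcher. Keep a copy of the original .bat file alongside the backup of the install.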
can I use this on a RX 9070 XT?
@nonemo13 need to update...
Amazing. Almost twice as fast!
Anybody had success in controlling the ethnicity of female subjects in Ideogram 4? Please do share your experiences.
Use a generic "5 women in a row" type of description as the main prompt and create a separate region for each woman: "an African woman", "an Australian woman", "a Korean woman", etc.
It's not as good at ethnicity as Z-Image, but it's better than a lot of models, and you can get closer with supporting terms ("an Irish woman with ginger hair" etc)
Anyone know if there's any turbo lora or distilled checkpoint for testing purposes?
Not yet. I reduce the number of steps for testing purposes (usually 12 or less).
Hello guys, for people like me who have an Ampere-series Nvidia card (3000 series), there is a good int8 quant available here:
https://huggingface.co/bertbobson/Ideogram-4-INT8-ConvRot
Just install ComfyUI-INT8-Fast and you're good to go. You can even load LoRAs in int8 too, with the INT8 Grouped lora node placed after the model. Enjoy =D.
At the moment, 1024x1024 takes only 20 sec for 15 steps on a 3090 Suprim X. I haven't tried Flash Attention with it. I've noticed absolutely zero quality loss compared to the fp8_scaled version.
PS: I'm not the creator of this quant ^^, just a 3090 enjoyer ^^
How are the results?
Is there a workflow?
@xishui8873 I use this workflow: https://civitai.red/models/2679071/ideogram-fast-and-quality-ifaq-t2i-by-artgourieff, and I've replaced the diffusion model with the INT8 quantized model, like this: https://ibb.co/4nwzZRSP
Actually it was uploaded here even before fp8: https://civitai.red/models/2674891/ideogram-4-int8
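@dayman02400741 I downloaded the INT8 from the link you posted above. When loading, the model node gets a purple border. Switching to a non-INT8 model with your workflow does produce output, but with my 8 GB VRAM + 32 GB RAM it takes ten minutes to output the first image.
@dayman02400741 I'll download an NT4 one and try it.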
There's a turbo lora for ideogram 4, and it's good enough to give decent images with only 2 steps.
https://huggingface.co/ostris/ideogram_4_turbotime_lora
For fastest results use cfg1 and remove the unconditional model, but you can still get a good speedup by reducing the step count and keeping the cfg/negative model.
It also adds a little bit of randomness and (subjectively) improves image quality, so it's worth trying just for the aesthetics.
Tried it yesterday. Some results were not bad. Unfortunately it doesn't work well with other LoRAs. It's just the first version though, so kudos to the author for the work.
P.S. It's already here: https://civitai.red/models/2711950/ideogram-4-turbotime
Works very nicely alone. With Realism Engine (the only one I've tested so far), the image gets very dark. But it's just a matter of time now.



