
Hey everyone,
A while back, I posted about Chroma, my work-in-progress, open-source foundational model. I got a ton of great feedback, and I'm excited to announce that the base model training is finally complete, and the whole family of models is now ready for you to use!
A quick refresher on the promise here: these are true base models.
I haven't done any aesthetic tuning or used post-training stuff like DPO. They are raw, powerful, and designed to be the perfect, neutral starting point for you to fine-tune. We did the heavy lifting so you don't have to.
And by heavy lifting, I mean about 105,000 H100 hours of compute. All that GPU time went into packing these models with a massive data distribution, which should make fine-tuning on top of them a breeze.
As promised, everything is fully Apache 2.0 licensed—no gatekeeping.
TL;DR:
Release branch:
Chroma1-Base: This is the core 512x512 model. It's a solid, all-around foundation for pretty much any creative project. Use this one if you're planning a longer fine-tune and only want to train at high resolution for the final epochs so it converges faster.
Chroma1-HD: This is the high-res fine-tune of the Chroma1-Base at a 1024x1024 resolution. If you're looking to do a quick fine-tune or LoRA for high-res, this is your starting point.
Research Branch:
Chroma1-Flash: A fine-tuned version of Chroma1-Base I made to explore the best way to make these flow-matching models faster. It's essentially an experimental result on training a fast model without any GAN-based training. The delta weights can be applied to any Chroma version to speed it up (just make sure to adjust the strength; see the sketch after this list).
Chroma1-Radiance [WIP]: A radically re-tuned version of Chroma1-Base that operates directly in pixel space, so it technically should not suffer from VAE compression artifacts.
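For anyone curious what "applying the delta weights" might look like in practice, here is a minimal sketch (the file names are hypothetical, and it assumes the Flash weights are available as a separate delta file; conceptually it is just base + strength * delta):

import torch
from safetensors.torch import load_file, save_file

# Hypothetical file names - point these at the checkpoints you actually downloaded.
base = load_file("chroma1-base.safetensors")
delta = load_file("chroma1-flash-delta.safetensors")
strength = 0.8  # "adjust the strength" as noted above

merged = {}
for name, weight in base.items():
    d = delta.get(name)
    # Apply the delta where a matching tensor exists, otherwise keep the base weight.
    merged[name] = (weight + strength * d.to(weight.dtype)) if d is not None else weight

save_file(merged, "chroma1-base-flash-merged.safetensors")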
Quantization options
Alternative option: FP8 Scaled Quant (Format used by ComfyUI with possible inference speed increase)
Alternative option: GGUF Quantized (You will need to install ComfyUI-GGUF custom node)
Special Thanks
A massive thank you to the supporters who make this project possible.
Anonymous donor whose incredible generosity funded the pretraining run and data collections. Your support has been transformative for open-source AI.
Fictional.ai for their fantastic support and for helping push the boundaries of open-source AI.
Support this project!
https://ko-fi.com/lodestonerock/
BTC address: bc1qahn97gm03csxeqs7f4avdwecahdj4mcp9dytnj
ETH address: 0x679C0C419E949d8f3515a255cE675A1c4D92A3d7
my discord: discord.gg/SQVcWVbqKx
Comments (225)
These checkpoints are available on the lodestones/Chroma Huggingface repo. What is the difference between the "detail-calibrated" checkpoints and the normal ones?
It's a merge of two versions: the normal one and one trained at 1024 resolution, so basically more detail at high resolution. I find it also upscales much better now with img2img, and is more coherent when rendering directly at high resolution. Definitely recommended, although a lower initial resolution is still more stable.
Keep it up! (ง •_•)ง
Hello, author. I've noticed a problem: whenever the phrase "male penis" appears in the prompt, both the men and the women in the picture get drawn with a long penis. What's going on here?
its an equal opportunities model
What the hell is male penis? Anyway futanari is the other option, if it knows.
It just so happens that I know about this: if you use ANAL, it appears extremely frequently (even adding a penis-growth tag to the negative is ineffective), whereas using "male penis" in natural language can make only the male crotch appear in the frame, without describing the composition and without the male's head. Obviously effective ways to reduce the appearance of a penis on females are to 1) add a specific description of the pussy style to the natural-language prompt when the pussy is visible, and 2) use a non-SDE sampler.
I love this model, very detailed and controllable, until you use the word "penis". Then things go nuts. You can imply it with things like "penetration" and all is fine.
This is my new experiment workflow:
For those who pursue excellent quality and speed.
https://github.com/lin-silas/workflow/blob/fd8f0759d559b381e5d39d41d8bd9d27fccaf196/chroma-experiment02.png
Higher quality:
https://github.com/lin-silas/workflow/blob/e1a230e73f20f9ad860cf03e7cc2e08cda4047e6/chroma-experiment03.png
I published my new wf for Chroma here on CivitAi: https://civitai.com/models/1582668/
If anyone would like to test it, there are different modules you can use, img2img, inpaint, detail-daemon, hires fix, upscale, facedetailer, post production and it can save the final image with metadata so you can upload the image here on Civitai with all the info.
This is the wf I've been using for all the images I've posted on here. I recommend it!
Fantastic workflow. I'm an absolute noob in ComfyUI and this helped so much. Only thing I'm not really getting right now is highres fix. In 100% of my tries it destroys the image and I have no idea why. Even on 1.25 with 0.25 denoise the outcome is broken. Maybe it's because I'm on AMD, who knows.
@Kajoken oh, you mean an AMD GPU?
@Tenofas Yes. I'm using a Radeon AMD 9070 XT right now.
@Kajoken did not know it could run the python libs for ComfyUI... but probably that is the reason.
@Tenofas Hmm, I always thought ComfyUI-Zluda would be able to handle that, especially since Forge-Zluda works fine.
Any way for 8gb vram users to use this or not worth attempting?
You can use the FP8 or GGUF version
absolutely yes! you can use quantized models, that run on 8gb without trouble. check the description, there are the links to all the models you need
I use it with 6gb vram. fp8 and very slow, but it works
GGUF Q5 working ok on my 8gb 4060
Nice thanks guys, never saw the GGUF links, got it working, a bit slow but it works indeed and not bad at all!
@beepbopbip Chroma is slower than Flux, for the moment at least... so yes, it may take a few minutes to get an image with 8GB VRAM, that's normal
Does anyone else have this problem? Regardless of my prompts, the outputs always give me anime and cartoon images. Is my prompt wrong, or can someone tell me what prompt will give a realistic image?
Put these keywords first:
Pentax 645Z Digital, 35mm photograph, film
or give more weight:
(Pentax 645Z Digital, 35mm photograph, film:1.2)
@silaslin thanks, it improved a little, but there's a new problem: now the output is grayscale..... :(
@zczcg put greyscale in the negative prompt; you can also add cartoon, anime, illustration, drawing, sketch, manga, etc., and every style you don't want
It would help if you showed us the prompt you are using, and also told us what sampler/scheduler you are using.
Make sure to use the latest checkpoints. Pre~V30 there really was a strong tendency to output anime stuff, sometimes because of just one word ("detailed" was a common culprit).
Also, various Flux LoRAs for realism steer and stabilize Chroma very well. Try something like GrainScape UltraReal to really force it, at least as a sanity check.
Sampler: res_multistep
Scheduler: Beta
Steps: 20
Positive: "A candid image taken using a DLSR camera of..."
Negative: "cartoon, anime, manga, illustration, drawing, painting, sketch, 3D, render, blur, blurry,bokeh, rule34,"
You can also just try some of the sample images, they'll possibly contain workflows you can use to find negatives and prompts and settings.
@Tenofas thanks for the reply, I have tested lots of samplers: euler, euler a, dpm XXX, beta, sgm, etc. Btw, what's the fastest good-quality sampler/scheduler? I find my 4090's speed is 1.08 it/s and one output needs 21-27s; I don't know if that's normal speed?
@ailu91 thanks, I will try it! Is UltraReal a LoRA? I found it as a checkpoint
@makiaevelio543 I tried, but it doesn't work
@zczcg
I've a workflow that balances speed and image quality. I think this workflow is suitable for you.
You can find it in other discussion threads.
IMO, res_3s & beta57 is the best combination for Chroma.
I think this negative prompt can greatly improve the quality:
(worst quality:1.2), (low quality:1.2), (normal quality:1.2), lowres, bad anatomy, bad hands, signature, watermarks, ugly, imperfect eyes, skewed eyes, unnatural face, unnatural body, error, extra limb, missing limbs, text, censored, deformed,
@silaslin Thanks, I tried it, but it still gives me anime..
@zczcg ok, let's start from a very simple realistic portrait. Use this positive prompt: "Professional close up portrait photograph of a redhead girl with green eyes and long hair looking at the camera, freckles" and this negative: "missing fingers, worst quality, low quality, error, cropped, illustration, drawing". Settings: 30 steps, 4.00 cfg, sampler: res_multistep, scheduler: beta, image size 1024x1024, seed 44. You should get a photographic-quality portrait. Let me know if it worked. Speed on your card is fine; less than 30 sec per image is good with Chroma.
@zczcg what workflow are you using?
ComfyUI workflow: https://huggingface.co/lodestones/Chroma/resolve/main/simple_workflow.json
the official workflow
@Tenofas it's ok, but the person is often too fat
@zczcg It's my latest experiment workflow; check the prompt, the person looks ok.
https://github.com/lin-silas/workflow/blob/e1a230e73f20f9ad860cf03e7cc2e08cda4047e6/chroma-experiment03.png
@zczcg Yep, a LORA:
https://civitai.com/models/1332651/grainscape-ultrareal
@zczcg Without seeing it, if you used all the presented information, it's simply because you are using positive prompt tags. You must formulate your prompt into a sentence. If it used to say for SD "woman, thong, sunny", it needs to say for chroma "A candid image of a woman wearing a thong during sunny weather."
I tried to write good prompts myself, then just installed Ollama and let an LLM write prompts for me; I can feed it my tags and it will give me nicely formulated prompts...
Saw this on huggingface, very impressed! great looking samples!
Does this model work with Flux-trained LoRAs?
With some of them. As it is not 100% the same architecture anymore, be prepared for glitches and some issues; Flux Schnell LoRAs might be better, but I haven't tried any yet. In the end it will need its own trained LoRAs to work properly, once training is completely done...
Another thing is that the weight values will be different. So see if changing the weight of the lora helps.
I tried 2 Flux Dev LoRAs and both generated a type mismatch error in the cmd window. The generation still completed but ignored the LoRA. Maybe others will work, but it looks hit and miss if the first 2 I tried failed.
tried two dev loras and one worked slightly, the other one threw weight mismatch errors, so, not the best results but found a reddit where people claim it works (https://www.reddit.com/r/StableDiffusion/comments/1kvenmw/psa_flux_loras_works_extremely_well_on_chroma/)...
@Kaleidia Found the first working LoRA, a hyper LoRA, which is very useful: it halved my generation time at 12-16 steps. 8 is too low; it works but looks grainy and low-res. Maybe useful to someone.. https://civitai.com/models/693544/hypersd-flux1-dev-bytedance-optimized-size.
I mostly create characters for TTRPGs and had a Flux LoRA for Twi'leks. It first gave me weight mismatches but then worked without problems; I double-checked with it disabled, and Chroma does not know Twi'leks out of the box...
I installed Chroma on Stable Diffusion WebUI Forge. Question: how can I rid Chroma of the "cartoonishness" of the image and make it more photorealistic, more real? Is it possible to do this through the prompt, since LoRAs for Flux don't work there, as far as I understand? At one time on SD Forge for Flux there was a sampler, [Forge] Flux Realistic (2x Slow), that gave amazingly realistic images. How do I achieve similar results with Chroma?
Start your prompt with "(photo realistic:1.5) 35mm analog grainy", then in the negative add cartoon, anime, sketch, etc. - the stuff you don't want. Make sure the prompt is very descriptive, around 100 words; describe everything for photorealism. Set Distilled CFG Scale = 1 and the other CFG Scale = 5. Sampler = Euler, Scheduler = DDIM. I am using 35 steps at a resolution of 576x832. I also turn on Hires fix at scale 1.5 with denoise 0.7 and another 35 steps, and again set the hires Distilled CFG = 1 and the other hires CFG Scale = 5 in the Hires fix section. I installed Chroma on Forge today and had the exact same cartoon output until I realised I had to prompt better and follow the GitHub CFG settings. I am getting great images in Forge with the chroma-unlocked-v35-detail-calibrated_float8_e4m3fn_scaled_stochastic model, so much better than Flux Schnell. If you are getting grainy or fuzzy images, increase txt2img steps, or send to img2img and set denoise 0.5-0.75 with 30 steps to sharpen up the image. Best of luck generating.
How did you get it installed in Forge? Forgechroma patch?
@_Tigerman_ weighting works for chroma?
@beepbopbip Yes, I patched a new, separate copy of Forge with this: https://github.com/croquelois/forgeChroma - it's a bit confusing. (I made a new copy just in case it broke my main version!!!!) You download the patch and follow instruction 0 on the GitHub page; you then have to move some files and directories supplied by the patch into your main Forge directory. I had a few errors in the cmd where I put things in the wrong place (it's not totally clear), but I recognised the file names and was able to fix it by putting the requested missing files where they were needed. I didn't clone the repo like the instructions say - it didn't work. I started by downloading the forge.zip (from here: https://github.com/lllyasviel/stable-diffusion-webui-forge), then let it set up as normal; I didn't check out the same version as the patch as it wouldn't work, then followed instructions 4-7. Make sure all this patching etc. is running in an administrator cmd window or you will get errors.
@beepbopbip Yes, I tried altering weights in the prompt and it was making changes, e.g. (keyword:1.5) made the keyword show up more in the generation. Also, at about (keyword:0.3) it seemed to remove it from the generation. I need to test more, but that's what I noticed.
@beepbopbip It's simple - download from here and install https://github.com/maybleMyers/chromaforge
@_Tigerman_ Thanks so much for the advice!
wait so should I DL forgechroma or chromaforge haha
@pocketpie You get sd-webui-forge working as normal. Then you patch it with the github files called forgeChroma that allows Chroma to work as a selectable checkpoint in forge.. I didn't test other methods, try both and post your results for others..
Better to just start the prompt with "real photography of"; you won't need any of that other stuff like cameras and negs and so on. It's basically Flux-based, so in my experience it really doesn't respond to all the classic SDXL 'realism' / 'photorealistic' tags.
make sure you don't use flux turbo detailer lora or you will get odd lines baked into the image
@ilya808rolf994 I got this one working straight out of the gate, but can't get the patch to work within stability matrix. Still I have a working copy.
Great model, on the right track; it gives superior results compared to Flux Schnell, heading towards Flux Dev quality in places. It can be weak on empty hands, but it also fixes some bad hands (finger count) in img2img when using a higher denoise, over 0.55. It got crossed arms correct the first time, though it hid one hand in a way that still looked feasible. Older SDXL Pony often merges arms and hands on crossed arms. NSFW works for boobs, unlike normal Flux; I didn't test further NSFW concepts. This is good, and I only tested the FP8 version for a few hours. When it's finished training, it will be a go-to model for many people.
Has anybody tried Pony v7?
While this is an anime-boosted Flux, Pony v7 is a realism-boosted anime model.
So the future of both models depends on how many people make fine-tunes and LoRAs.
Pony has the name recognition and better out-of-the-box porn.
I don't see much of a future for this model if it stays this slow.
I think you have a bit of a misunderstanding.
You can't compare pears with apples.
PonyV7 is based on a completely different base model (AuraFlow). I am also not sure where you are getting the information that PonyV7 is better. PonyV7 is not released yet and has yet to show uniformly good performance. Chroma is based on Flux 1 Schnell, is fully open-source, and is displaying very good performance, especially in terms of realism, though it is quite slow, that is true. It's also worth mentioning that Chroma doesn't need that many steps for great results; around 13-20 is sufficient most of the time.
The more open-source competition the better, Chroma is a great step in the right direction, so we should rather be happy and grateful with more new experimental models being released, trying out modern architectures, instead of sticking to SDXL based models.
Guess what issue Ponyv7 will have when it gets released... It will be slow.
Why not compare? The realistic porn pics here are abysmal. If you want anime, then stick to Illustrious fine-tunes. I want better realistic models than Pony 6 and Illustrious realistic fine-tunes (XL porn models, while realistic, don't have the porn adherence of the anime models).
So Pony 7 is a general model now, not strictly anime. If they release it, I expect at minimum XL-level realism with fine-tunes, with much better adherence than any porn model.
With GGUFs it's at Flux speed levels. Of course, Flux has many tools now which speed up the model.
@insertusername if it ever gets released lol
@rhylankeyson732 NoobAI/Illustrious/Pony are mostly tag-based models, whereas with Chroma you can also try prompting in natural language.
@Meower2024 yeah why are we all talking like Pony7 is going to exist in the same capacity all of these current models have? For all we know, they've built a supercomputer that can only be prompted via a generator like Civitai.
Been trying Chroma out and realistic images look pretty good. I hope you keep improving this model, but I couldn't try it on fictional because that site isn't up and running yet.
It's groundbreaking. 2x slower than Flux 1 Dev, but the results are worth the patience. Also, has anyone found the best CFG and step settings for perfect realism?
I get decent results starting my prompt with: "photograph, shot on 50mm lens, natural light, RAW photo, human-eye level, unedited, true-to-life tones, DSLR realism, soft highlights, daylight balance, subtle contrast, zero airbrushing, no enhancements, candid realism, visible pores, subtle skin imperfections, optical lens realism, " then describe the image.
scheduler: beta
sampler: euler_ancestral @ 24 steps / heun @ 12 steps
cfg: anything above 5
I also get great results merging slightly older versions with newer versions for example v27 with v34(dc)
I found that the best result is given by euler ancestral+ beta at 40 steps. cfg5-6. I use a resolution of 960x1152
I've noticed that if you try to generate realistic images, the first few generated images look fine, but after you generate six or seven more times in a row, the images start to look less realistic and more anime. I don't understand why that's happening.
This behaviour has been my AI-thorn-in-side-conspiracy for years. The thing is, you're only right if you keep a fixed seed and watch it fucking change. The CLIP space in this model is very large, but if you think about it, the model is the same size as any others -- particularly after it's distilled.
I believe that what you're experiencing is simply confirmation bias: you're hitting a great seed first. It's possible, since I don't know what your workflow is, that there is some form of memory bug. But I'd simply imagine that it's the way the data was trained and captioned.
A trick the bigASP developer learned was that if you take two great photos, caption one heavily and leave the other simple, then whenever you ask for the simple one, you increase the odds of getting a great output. That's it though, it's just odds. And if you don't know the actual CLIP space of the model you're using, you can't really know what prompts will trigger without the entire dataset and an analysis. Generally the creator may give a synopsis of the randomness rates (like bigASP did), but it's entirely the creator's prerogative.
Chroma's developer also has done some creative things to the flux architecture, which should make it react more broadly to your prompts: From the huggingface readme: "If you train [Flux] for a looong time (say, 1000 steps), the likelihood of hitting those tail regions is almost zero. The problem? When the model finally does see them, the loss spikes hard, throwing training out of whack—even with a huge batch size. The fix is simple: sample and train those tail timesteps a bit more frequently ... make[ing] the distribution thicker near 0 and 1, ensuring better coverage."
Essentially, if we were to combine both ideas (training simple prompts -> complex images, and increasing the width of concepts the model can swap to and fro), we might expect to see higher variance over 5 sample images.
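To make the quoted timestep trick concrete, here is a toy sketch of the idea (this is not Chroma's actual training code, just one way to sample flow-matching timesteps with thicker tails near 0 and 1):

import torch

def sample_timesteps(batch_size: int, tail_frac: float = 0.1, tail_width: float = 0.05) -> torch.Tensor:
    # Mostly uniform timesteps in [0, 1], but a fraction of the batch is drawn
    # from narrow bands near 0 and 1 so those "tail" noise levels get trained more often.
    t = torch.rand(batch_size)
    n_tail = int(batch_size * tail_frac)
    if n_tail > 0:
        tail = torch.rand(n_tail) * tail_width            # close to 0
        flip = torch.rand(n_tail) < 0.5
        t[:n_tail] = torch.where(flip, tail, 1.0 - tail)  # half of them flipped to be close to 1
    return t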
@makiaevelio543 BigASP as a good example? I don't think so. :)
@righteousbrextin106 It's an example of a developer who has finetuned models themselves and provided the information. BigASP absolutely had influence when it released and for the months after. It was an improvement on SDXL as it clearly showed an ability to do what Chroma does much better. This wasn't a pissing match, until you pissed on the wall.
@makiaevelio543 Yeah, it was a shitshow. You bring up his method as a good example?
Fine-tunes managed to unshit his model, that's it.
@malakaicarvell310 Both of you misinterpret my comment the same, may as well be considered spam.
@makiaevelio543, I notice it loads PixArt when handling the CLIP: "Requested to load PixArtTEModel_loaded completely 9.5367431640625e+25 9083.38671875 True
CLIP/text encoder model load device: "
It drives me nuts on load time, but it also makes me wonder if that is why it's reverting to toon-looking stuff, because the CLIP comes from PixArt. I don't know much about CLIP text encoders and so on. Idk, am I using an incorrect CLIP? Do you know if there is a Flux-oriented one, or am I way off?
@idleminds This log you see referring to "Pixart" refers to the model architecture (it's good you see loaded completely, as this log refers to chroma itself. If your settings were cranked, you may see this log say "loading partially", and it will take much longer. Sometimes necessary, like some WAN runs, for example).
From another comment: "it uses an LLM as the prompt encoder instead of CLIP which gives it much better prompt comprehension". It's simply a different diffusers architecture. The reason you'd see it more in anime and manga applications is because this architecture's goal was to improve fine details like text.
The CLIP this model uses is a universal "t5xxl" (or whichever language variant), and that's not the cause either.
The "toon reversion", in my opinion, is based generally on two problems:
1) The biggest being people's aversion to rewriting their prompts. This model interprets the sentences, and ordering and object + subject all matter to the model. When you use real text, the model will be more likely to make realistic outcomes. People are reusing old prompts and getting anime because that's what the majority of anime is tagged as in the real world.
2) The other yet smaller problem is settings, like using a sampler like "res_m" that will more default to realistic settings (but not always) and using more steps (like 40 or 50, that will almost guarantee it uses the realism if you're asking for it)
I only get black images. I posted one below; can somebody help me find the problem?
I think it's your random LoRA stack node; I'm not sure, it could be the clip too. I removed your LoRA node stack (because I didn't have it installed) and used my GGUF Chroma model (MultiGPU GGUF loader). I also used t5xxl_fp8_e4m3fn_scaled instead of the fp16 you had, because I didn't have fp16 downloaded. I did get a picture. So I don't think it's the KSampler, but the LoRA stack or the t5xxl fp16 that is making the pic black. Try removing the LoRA stack first and see if that helps. If it does, you can start looking for a solution there.
I posted a picture on my profile that I got from it, should contain the workflow.
@noyboy thanx
@noyboy That workflow is from one of the pictures I posted, so it should work. The only thing I noticed, loading both pictures, is that the black one is missing a few parameters in the metadata. The smaller ones have a different workflow that also produces empty pictures.
But maybe it's just an older Comfy version, or the custom nodes were not installed.
@TijuanaSlumlord Sorry, you mean my picture produced empty pictures or user nanunana? :)
@noyboy nanunana's pictures. Edited my previous post a bit. Haven't seen your workflow yet, but if it's using only basic Comfy nodes it might work.
@TijuanaSlumlord My workflow uses the multi-GPU GGUF loader, but I redid my workflow now so it uses the normal KSampler. Before, I used SamplerCustomAdvanced. I just tried both, and the result was the same. So I cleaned mine up a bit.
Thanks for the help, the problem is solved. The Manager didn't update ComfyUI as it should have; I used the ComfyUI update batch file and now it works. I will see if I can get a few good pics.
Is there a consensus on aspect ratio and resolution that gives the best outputs? I'm getting cropped images in landscape 3:2 if a character is involved, running 3:2 at 1536x1024. If the character is not cropped, it shows multiple versions.
The training photos are supposedly 512px prior to the "detail-calibrated" versions. I think the official workflow keeps it 1:1. I use 768x1024 (a 3:4 portrait ratio) and don't see the repeats that often, but can see body horror under 25 steps. I've read that the most successful workflows build up an image using the smaller sizes, masks, i2i into other models, upscaling, etc. People will also use a CLIP skip of 2, and I do think that reduces some of the crazier outliers.
Based on your complaints and my experience, the issue feels like raw pixel size rather than the ratio. For example, in my 768x1024 outputs, I will consistently get what appears to be focused central item (that appears to be something like 512), with filler surrounding it. However, I've fed it 3456x1152 and it understands it perfectly, like way better than I'd ever anticipate -- it practically can create VR-esque outputs. So honestly I'm unsure.
If you aren't using any negatives, I'd suggest at least trying that to form a baseline (although in my case my outputs all turn to JPEG lol). Also, you could try swapping the sampler. If you're using a Flux lora that might also fuck with it(?) based on training images. Could just have also found a concept the model struggles to properly scale.
I read the Flux user notes that said to keep the x and y resolutions divisible by 64, so I use a resolution of 832x1152 as a default. I think the outputs are better if the image can be chopped up into friendly-sized chunks to process, and that helps with speed a bit as well. The only other thing I remember is not to go for a resolution that creates an image of more than 2 megapixels. I am assuming that with Chroma being based on Flux, these rules still apply. I remember in SD1.5 that with too high an image size you get duplicates appearing; I don't know if Chroma does that, but I try to keep the image smaller for speed's sake. Regarding the multiple-character errors, did you prompt for a single/solo character at the start? I've had prompts where I say they are this style and they are in this setting, and you get multiple people appearing, because the prompt sounded like it was about several separate things.
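Building on the divisible-by-64 and ~2 megapixel rules of thumb above, here is a tiny, purely illustrative Python snippet for listing candidate landscape resolutions at a 3:2 target ratio (the limits are the guidelines mentioned in this thread, not hard requirements):

target_ratio = 3 / 2      # width : height
max_pixels = 2_000_000    # stay under roughly 2 megapixels

for h in range(512, 1601, 64):
    w = round(h * target_ratio / 64) * 64   # snap the width to the nearest multiple of 64
    if w * h <= max_pixels:
        print(f"{w}x{h}  ({w * h / 1e6:.2f} MP, ratio {w / h:.2f})")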
Did some tests with samplers and seems Euler is the only one that actually gives no doubles and follows the prompt: https://imgur.com/a/BQU2Zsp
Prompt for the tests is: "highly detailed photograph of cowboy shot, body portrait of a standing aesthetic 11, (mid-forty european trans woman with piercing blue-grey eyes and a snub nose, her shoulder length dark brown hair frames her diamond shaped face quite nicely, slightly overweight), showcasing the body from head to thighs, hip level shot, hip level camera angle, extremely detailed photo" even tho mid-forty often goes to fifty or is completely misunderstood...
another test with sampler and scheduler, https://imgur.com/a/9RYEu4k
@Kaleidia I have tried testing your prompt using the res_multistep sampler, and I am frustrated for you. :)
It seems I get about a 50% success rate, so it knows what the prompt is saying. I made some changes to increase the likelihood of getting the subject entirely in frame (I didn't have a test without them in frame after the changes):
"(cowboy shot) photograph of a standing mid-forty european trans woman. She is slightly overweight, showcasing their body from head to thighs. hip level shot. hip level camera angle. She has blue-grey eyes and a snub nose, shoulder length dark brown hair, diamond shaped face. "
One thing that seems to work differently in Chroma versus other models is ordering. By pushing the body descriptions up, I now get it more often. Another big change was removing the "highly detailed" tags, because I believe those increase the likelihood of "flux portraits". I also removed the word "portrait", and made sure to use periods at the end of thoughts (which seems to increase the likelihood of commas working grammatically instead of as tag separators). Out of the 13 tests I ran with this prompt, I still got one "double", but both were "cowboy shots". I'd say I got it. Try it out. :)
@makiaevelio543 changing the prompt in that way makes it a lot harder to make it dynamic and most of my prompts are random and dynamic in most cases... But I can incorporate the changes in some of it and try to see if that helps already...
Thank you for testing as well
@Kaleidia I believe this is why many are having "trouble" with Chroma though: they're taking their old prompts and it's not working the same. I've slowly just been redoing a whole bunch of prompts lately, and you're right, it is harder to incorporate the variety. Sometimes half the prompt will get included in the {...|...} because I can't get it to work without the reordering.
@makiaevelio543 I might be able to reuse older prompts by adding an Ollama node at the end and telling it to restructure the prompt in that way. Will try and report back here ;)
Base prompt: "(cowboy shot) photograph of bastion (overwatch), the extremely detailed photo is dynamic. showcasing the body from head to thighs."
Ollama response: "The camera lingers on Bastion, the omnic support hero from Overwatch, in a gritty cowboy-inspired photograph. The "cowboy shot" style emphasizes the character's rugged and battle-hardened appearance. From a distance of about 10 feet, the camera captures Bastion's body from head to thighs, with an incredible level of detail. Every contour, crease, and texture is rendered with precision, making it feel like you're gazing at a lifelike sculpture. Bastion stands tall, his body clad in worn leather armor adorned with intricate metal plating. His helmet, shaped like a cowboy's Stetson, sits atop his head, casting a shadow over his eyes. A wispy trail of smoke wafts from the barrel of his shotgun, which is slung over his shoulder. As you examine the photograph, you notice the subtle play of light on Bastion's armor, giving it an almost 3D quality. The dynamic composition draws your eye to every detail, from the faint scars on his chest to the worn grip of his shotgun. The overall atmosphere of the photograph is one of rugged determination, as if Bastion has just emerged from a dusty shootout with nothing but his wits and his trusty shotgun to defend him."
https://imgur.com/a/4TfpCfr resulting image. The Ollama model did not know Bastion is a tank with a huge shield, and Chroma just tried to follow the prompt... (all done on v37 detailed)
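For anyone who wants to script the same restructuring step outside of a ComfyUI Ollama node, here is a rough sketch using the ollama Python package (the model name and instruction text are only examples of the approach described above, not a recommended setup):

import ollama

def restructure_prompt(tag_prompt: str, model: str = "llama3") -> str:
    # Ask a local LLM to turn a tag-style prompt into the natural-language
    # sentence style that Chroma tends to follow better.
    instruction = (
        "Rewrite the following image-generation tags as one descriptive paragraph "
        "of plain English sentences, keeping every detail: "
    )
    response = ollama.chat(model=model, messages=[
        {"role": "user", "content": instruction + tag_prompt},
    ])
    return response["message"]["content"]

print(restructure_prompt("(cowboy shot) photograph of bastion (overwatch), dynamic, head to thighs"))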
Cowboy shots are a problem, might try with "medium shot"
@Kaleidia it's an interesting term, and I was hoping grouping it in the brackets would trigger it together, but it would make sense when asking about overwatch that cowboy would be a big topic (McCree)
fantastic model at this early stage. the community should pounce on it... see pic below
My new experiment workflow:
res_2m & bong_tangent vs res_3s & beta57
which do you like?
https://github.com/lin-silas/workflow/blob/755bb05ea7aafefc3ce560e6e2488266dc8c13b2/chroma-experiment04.png
vs
https://github.com/lin-silas/workflow/blob/755bb05ea7aafefc3ce560e6e2488266dc8c13b2/chroma-experiment03.png
Never thought I'd see the day where a Flux model can keep up with the random bullshit I'm doing.
Does anyone know if the FP8 models work for the AMD Forge WebUI or ComfyUI-Zluda?
Give it try and let us all know if it does.
@Preshous It does not. I loaded the model and it resulted in my webui crashing when I tried to generate
@ArchAngelAries I don't know anything about AMD Forge WebUI, but I couldn't get the Chroma patch to work in my version of Forge; however, the standalone Chroma version from GitHub works like a charm. As for ComfyUI-Zluda, I guess that's exclusive to AMD? There's this version of ComfyUI, https://github.com/comfyanonymous/ComfyUI, which is supposedly AMD compatible.
Forge has just been patched for Chroma support (15 hours ago), so it should work now.
I don't see any reason for it to not work on AMD too, but i haven't tried it yet.
(Also, there are some experimental builds of PyTorch + ROCm for windows and the official release is near, those forks are probably going to become obsolete)
@AIArtsChannel Oh for real?! Omg omg omg! I so can't wait! If you see anything please send me a link.
@ArchAngelAries Well, there's this one for example (note: it's an UNOFFICIAL repository)
You'll still need the official HIP SDK installed, but then those pytorch wheels can just be installed normally using pip
https://github.com/scottt/rocm-TheRock/releases
@AIArtsChannel Sorry, I don't know what to do with this info. I have the HIP SDK & Zluda, Do I install the torch wheels and then I can basically run any local AI program meant for NVIDIA on my AMD 7900XT? Any guidance is super helpful. Also, thank you for sharing all this! ❤️
Can anybody upload some realistic porn (various poses, with prompts)?
I want to know what the hype around this model is about.
There are already a bunch of NSFW images made with Chroma here on Civitai. Not all of them have a workflow or prompt, but some do.
Edit: my bad, I saw some last week? Maybe they're taking down realistic NSFW images now. It was two women and one man.
@noyboy On this page, what I saw is no better than any XL porn (rather worse), but of course it needs a different prompting approach, for sure.
Look at my pics below: Chroma produces fantastic porn pics, but all the ones I posted are censored.
@nanunana Good pics. It seems to react to booru tags. Later there will be amateur-style LoRAs, I think.
Amazing🚀🚀, can't wait for future versions. I'm getting very interesting results. Does anybody know how to reduce plastic skin effects? Using Euler Beta, with cfg 3.5-4.5
How many steps? I also use Euler beta with cfg around 4, but I get no plastic skin...
I found that the best result is given by euler ancestral + beta at 40 steps, cfg 5-6
I tend to do similar setups as I would for Flux, similar natural language prompts, etc., but I have a robust negative prompt, a cfg between 4 and 8, and between 22 and 50 steps. For sampler, ddim or euler simple
Step count is important. You could use the same prompt and same seed at 30 or 40 steps and have drastically different results.
I find limiting or completely removing any mention of "detail" or "extremely detailed" unless it's contextually important (like "detailed material on dress") will reduce "Flux"ness.
I've personally been using the res_multistep sampler with beta scheduler at 40 steps and 6 CFG, with a clip-skip of 2.
From my experience the Euler samplers can find realistic results, but I have the most luck when trying for "art". It really depends what you're going for. I find the "res" samplers will look more like an 'iPhone' than Euler samplers will with the same prompts.
I see a lot of prompts with "aesthetic" followed by a number like 10 or 11. What do these mean and what do they do? Is it the Chroma version of Pony's score_up tags? Also, do Flux LoRAs still work, or has it deviated too much? I've seen several Chroma LoRAs, so I wasn't sure.
@Mobbun mentioned this in another thread
@silaslin do you have a link to thread where @Mobbun talked about it
"There are quality tags in the model. You can use them with aesthetic #. It goes from aesthetic 11 to aesthetic 0. aesthetic 10-0 are based on e621 scores per month, with aesthetic 10 based on the highest in each month, and aesthetic 0 being the lowest. aesthetic 11 is a quality tag that was curated by the model maker so expect it to work a little differently."
by @Mobbun
@silaslin thank you, do you have a link so I can learn more?
Flux LORAs work well mostly. I did see a couple that fail at any weight. If you use Forge, you'll need to modify some code, otherwise it will refuse to load them. Comment out these lines:
https://github.com/lllyasviel/stable-diffusion-webui-forge/blob/main/extensions-builtin/sd_forge_lora/networks.py#L25-L27
@ailu91 I removed the # in the section, but I'm not sure it's working properly. Have I done it right?
@stygianwizard42 If these lines have no #, they are active. Which is bad.
You know it works if you see something like this in the console: "Loading (path to LORA here) for KModel with unmatched keys..." with a huge list of keys afterwards.
So, it should look like this at least:
# if len(lora_unmatch) > 100:
#     print(f'[LORA] LoRA version mismatch for {model_flag}: {filename}')
#     return model, clip
Does anyone know the fp8-scaled quantization script used by comfy? That way I can handle it myself instead of waiting for someone else to post it.
Sorry, just saw that the model author has already posted the script on huggingface
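For reference, the core idea behind a scaled FP8 quant is small; this is only a conceptual sketch (not the author's actual script) of per-tensor absmax scaling into float8_e4m3fn with PyTorch:

import torch

def fp8_scale_quant(weight: torch.Tensor):
    # Scale the tensor so its largest magnitude fits the float8_e4m3fn range (~448),
    # cast to FP8, and keep the scale so inference can multiply it back in.
    f8_max = torch.finfo(torch.float8_e4m3fn).max
    scale = weight.abs().max().clamp(min=1e-12) / f8_max
    quantized = (weight / scale).clamp(-f8_max, f8_max).to(torch.float8_e4m3fn)
    return quantized, scale

# Dequantize at load/inference time: weight_fp16 = quantized.to(torch.float16) * scale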
The furry artist tags seem oddly spotty. There are some that definitely work and there are some big names don't seem to work at all. Artist tags worked really well in Lodestone's earlier model fluffyrock, so I'm curious if they haven't been as much of a focus for this model or if it'll get better with more training (or if specific artists have been excluded on purpose.)
afaik the artist tags might be added in over time or in one of the late steps to ensure everything works nicely, have patience... the model is still in development and far from done, give it a few more weeks
Very noisy output and no NaN errors in the console. What could be wrong? Please help, thank you!
Try one of the latest GGUF versions for quick loading times, the <lora:Chroma2schnell_lora_rank_64-bf16:1> or one of the hyper, low steps ones from silveroxide's collection, Euler/Beta, 10-16 steps. CFG scale 2-4 (bit slower than 1 but yields better results). Resolution-wise I'm getting good results with 832x1216 or 1216x832 for landscape. GL.
What the hell is this color palette? Does it know realism at all, or just artsy fartsy shit?
What the hell is this prompting of yours? Do you know prompting at all, or just posting random shit?
@EliteLensCraft Dont be such a fanboy. Just look at the horrible yellow tint pics.
@chadennashon Proper prompting and post-processing + lut nodes do the job.
Share some images.
@chadennashon s-s-s-s-s-skill issue... ue... ue... ue...
This checkpoint is so fantastic!!! Everything in my perverted brain was produced with very good prompt following. That's the first time since SD1.4 that this has happened... :-)
Is there a comfy workflow? Or it just works with simple checkpoint loader?
ChromaSimpleWorkflow in the hf repo:
Fuckin hell! Is this for gays only? I still didn't see any hetero sex pose.
Posted some before, done in much earlier versions. It wasn't bad. It's better now. Sometimes it needs a prompt with extremely specific small details, but in general you get what you wanted.
Simply describe what you want in real language. Not like you've done in other image models. More like WAN.
"From the perspective of an average adult man having missionary sex with an Indian woman as she lays back lifting her legs in the air. The flickering from the television in the suburban living room lights the room."
It doesn't really know what "missionary" sex is. If you try and force it, it'll start using religious symbols. Instead, just describe the people in the scene (she is laying back). Sometimes you may need to specifically call out the other person, or other details, like "She is looking at the viewer.". Then just continue to add modifiers and descriptions.
Like at the beginning you could say "A raw photograph..." or "A renaissance painting...".
At the end you could describe what they're wearing, or continue adding things to the scene.
Also, I think the reason we're seeing much more "gay" content is because this model does a considerably better job of listening to the prompt. Turns out people like all kinds of things?
@makiaevelio543 : I agree, I did many tests in POV and I must say it works pretty well.
In addition, just some side notes: for POV I usually start with "Photograph, in POV." (or "from first person view", or "from POV perspective"), and I write my prompt on multiple lines.
In 100% of my tests the viewer is a male, even if the male genitals are not involved in the scene (e.g. if using hands), but that needs to be formally confirmed.
Also, I don't use anymore "the viewer is...", I simply write "I am...". It's shorter, and couldn't find any example where the former worked better.
@bugltd I use the "i am" written perspective swap, you just gotta be consistent. This model is an awesome way to learn how to use the LLM-clip, cause hidream was slower and more censored. Chroma still can't do letters 100%, but way better than sdxl.
I have consistently changed the viewer character as well. Just say "from the perspective of a <blank>", and then use other descriptors like "the <blank>'s <descriptor> hands" or etc. As long as you continue to describe the other features consistently, the characters will also change.
Detail is much improved in these newer version as is speed. Thanks for your efforts in training this model.
Seeing awesome projects like this makes me wanna get the best hardware possible! Looks amazing can't wait to try it!
I don't know what kind of black magic you did, but this model performs way better than it has any right to considering its size. My new favorite, and not just for NSFW, but as a general-purpose model.
But I'm wondering what detail-calibrated means... What's the difference between detail-calibrated version and non detail-calibrated?
Author's Answer:
https://huggingface.co/lodestones/Chroma/discussions/48
@silaslin thanks!
Need help!!! Why does the CLIP text encoder need to run again every time I change the prompt? It takes over 10 seconds.
Every model does this, yes? Maybe you're used to having the full text encoder in memory when you're using whatever model it is that you're used to. Personally, I don't find such things a problem because I load them from RAID 0 from 2 SSD M.2 drives.
@MagicalErotica So it's a normal situation? Because only Flux and Chroma have this problem; before, when I used SDXL and NoobAI, it took <1 second.
@zczcg Yeah, SDXL is small enough to be kept in memory and the text encoder comes with the model. i.e. 6GB is with the text encoder, whereas Flux Dev. is like 23 gb without the encoder. I have a 12GB video card, and rendering one image on Comfyui using the full Flux Dev is 180 seconds or so, but a batch of 4 is 360.
I've also used the chroma v40 (uploaded yesterday at huggingface). Times are double that of flux.
I hope they make a Tea Cache for it soon.
@MagicalErotica Thanks for replying! Anyway, I find WaveSpeed and Nunchaku can't speed things up or work with Chroma.... I don't know if my time is normal: if I generate one image at 16 steps, cfg 4.5, I need 16 seconds on a 4090.
@zczcg Lucky you. I do 50 steps though.
@MagicalErotica the newer versions do not need so many steps anymore; you are burning your images, imo. I use 20 steps, but to get more detail out of it I use a res_2s sampler to effectively go to 40 steps. Also look into sage attention and torch compile, those are time savers. On a 3060 12GB, using high-res images, 5 min per image is fine for me; with smaller images I get similar times to yours.
@Kaleidia I use torch compile, but sage attention reduces the speed for me; it only makes about a 4-second difference, and I still feel it's too slow to generate one image.
@MagicalErotica Just curious, do we need so many steps? Also, I find the characters' clothes often come out broken.
@zczcg One other thing to try is to use only T5 for the text encoder and set the type setting to chroma or flux depending on what you use. That gives you more space in RAM/VRAM.
The number of steps is personal preference so far, afaik; I like to use about 20 and was fine with that before using the res_2s sampler (which basically doubles the steps in a bit less time). The newer versions of Chroma (around v30 or so) do not really need 50 steps anymore, imo.
@Kaleidia Thanks for the suggestion. Also, I've tried to keep the clothes intact, but I can't work it out..
@zczcg about the clothes
Could you share your sampler, scheduler and prompt?
@silaslin I have tried different samplers. I find euler a or the res ones are the best, with beta, beta 1, beta57..
and my prompt:
Hyperrealistic shaded vintage 1940s-1960s glamour illustrations, with near-photographic finish, in the style of Alberto Vargas, Haddon Sundblom. detailed depicting front view of the full body of photo-realistic pale face 20 years old girl peeking out of a white translucent curtain, the subject is depicted with flawless skin, a perfect hourglass figure, and immaculate hair and makeup.
But the clothes often come out broken..... making the girl look like a beggar....xd
Is it possible to use this with a 4 step lora?
Here are some Chroma hyper and turbo lora's that work at low step. I've not tested them but you can research or test em out. https://huggingface.co/silveroxides/Chroma-LoRA-Experiments/tree/main
https://huggingface.co/silveroxides/Chroma-LoRA-Experiments/blob/main/Hyper-CHROMA-8steps-lora-minimal.safetensors
Works with 8 steps at a weight of 0.15 - 0.2
https://huggingface.co/silveroxides/Chroma-LoRA-Experiments/blob/main/Hyper-Chroma-low-step-LoRA.safetensors
Works with 12 steps at weight 0.6 - 1.0
Thank you @_Tigerman_ @EliteLensCraft!
I have to test all the samplers and schedulers. Does anyone have a simple workflow for that? I have a node for an XY sampler but I don't know how to use it.
https://imgur.com/a/5GhMz6G There are nodes in the ComfyLab pack ( https://github.com/bugltd/ComfyLab-Pack ) that should give you a way to test all samplers/schedulers. I use ClownShark's res_2s with sigmoid_offset (both custom samplers/schedulers); sigmoid_offset was recommended by Lodestone, and the res_2s sampler is just nice and lets me use fewer steps...
@Kaleidia I thank you for answering, but I need a sample workflow. I have enough nodes to make an XY plot but don't know how... as I said before.
@nanunana the ComfyLab pack has sample workflows... just look at those. Also, if you have a package installed that provides that functionality, you can go into the Comfy menu under "Workflows" > "Browse templates"; there might be a simple workflow example for your nodes there.
@Kaleidia I have no sample workflows; you are not being much of a help to me. I also could not find the res_2s sampler although I downloaded ClownShark's pack; there is only res_multistep. Nor can I find the sigmoid_offset scheduler... :-(
@nanunana sigmoid_offset is installed by a separate pack, sorry, I did not mention that. Search for it in the Manager. I have no idea which pack you are using for the XY plot/grid, so I provided a link to a pack with examples; I can also find such examples in Comfy once I have installed a custom pack. If the author of the pack you are using did not provide examples, do not blame me or others. The example browsing in Comfy is part of the regular updates, so check that you have the latest version installed. I am just trying to help; ignore me if you think otherwise...
@nanunana As for the res sampler, it is called "exponential/res_2s" in the dropdown of the ClownShark sampler; in other sampler-select nodes it is called "res_2s". Hope that helps.
@Kaleidia I update ComfyUI daily; there are no sample workflows in it. I think you have installed a pack that includes them, but anyway, I now have even more samplers and need an XY workflow more urgently than before...
@nanunana Reading comprehension is a necessity to work with open source software. https://github.com/bugltd/ComfyLab-Pack
Here, I'll post what they said again: https://github.com/bugltd/ComfyLab-Pack
Here maybe a third time: https://github.com/bugltd/ComfyLab-Pack
@makiaevelio543 and what kind of help is that supposed to be? I come from A1111 and the examples are for SDXL, I lack the experience in ComfyUI to adapt this to Chroma. Your smart-ass remarks help me less. Just writing one thing three times, at what point is your brain hacked out?
@makiaevelio543 In A1111 the X/Y/Z plot is included, and I like that feature very much for comparing checkpoints, cfg, seeds, samplers. This does not seem to be so easy in ComfyUI, otherwise there would be example workflows whose link could simply be passed on.
@nanunana I mean you stated multiple times you wanted workflows, that github page, which contains the nodes that were referenced, also contains a bunch of wiki pages. https://github.com/bugltd/ComfyLab-Pack/tree/main/wiki/tutorials
How do you think anyone learned anything? I'm just saying that when someone gives you a link containing the answer to your problem, and then you continue to demand for the things that are in the link that was already provided for you, how should anyone respond to you?
@Kaleidia + @makiaevelio543 : I'm so honored to see my extension recommended, thank you so much! May I ask how you discovered it? Via the search function in the Manager?
If you have any issue / feedback / suggestion, I'll be happy to help.
@nanunana : I'm the creator of the ComfyLab-Pack, if you provide me your workflow I may be able to help.
In a nutshell, to adapt your workflow to this use case:
1. Add 2 "List from Multiline" nodes and fill them with the list of samplers and schedulers (1 per line), as they are displayed in the dropdown list
2. Pipe the output of both nodes into the "XY Plot: Queue" node
3. Pipe the green outputs of "XY Plot: Queue" into the "sampler" and "scheduler" widgets of your KSampler (if you use the simple approach), and pipe the purple output into the "XY Plot: Render" node
4. The "XY Plot: Render" has 2 outputs: 1 for the current image, and 1 for the grid(s). You may want to either display them, or more probably save the grids.
To customize the size and appearance of the grids, you'll find more details in the first 2 XY Plot tutorials, or in the node reference: https://github.com/bugltd/ComfyLab-Pack/tree/main/wiki/node%20reference/xy%20plot
@bugltd I found it through the comfy manager and used them to find good settings on newer models like HiDream or Chroma here. I like the sampler select and scheduler select nodes as I do not need to type out the samplers by hand ;) Did some grids back in a1111 days (two years or so ago) and wanted to do similar stuff in comfy, good nodes for that :)
@bugltd thank you for the link, but I'm completely out of my depth with it; as I said, I come from A1111 and am not a programmer. I think it's a pity that with so much concentrated knowledge here, nobody can provide me with a workflow of 4 nodes (which is probably a minute's work for you guys).
@Kaleidia : thank you for your feedback. I must say your messages and those from @makiaevelio543 give me the energy to keep spending my time on my extension, thank you again.
Another approach you may use is by using the "Output Config" node, which allows to save a bunch of data in a JSON / YAML file, and generates outputs accordingly. Ofc, you can write down some lists (of checkpoints, samplers, ...) just once, and reuse this config file in multiple workflows.
Unfortunately I recently discovered that this node is broken, following an unexpected update in the Comfy Javascript API. But that should be corrected by the end of this week I think.
@nanunana : no problem, just give me a few moments and I'll provide you a workflow, adapted from the standard Chroma one: https://huggingface.co/lodestones/Chroma/blob/main/ChromaSimpleWorkflow20250507.json
@nanunana : https://gist.github.com/bugltd/a04bb8fbb6b3d02d75683fc3539fec2b
Just save the JSON file and drag-and-drop it into your ComfyUI interface,
Ofc, adjust the models, prompt and parameters to your env and liking.
For the samplers / schedulers, you'll have to temporarily disconnect the outputs from "XY Plot: Queue" to see the values in the sampler / scheduler widgets. Or, as @Kaleidia indicated, you can replace the 2 "List: from Multiline" nodes with the "List: Samplers" and "List: Schedulers" respectively, that allow you to just select without having to write down the sampler / scheduler names manually.
Hope this helps, have fun using this incredible checkpoint that is getting better and better at each version (about 2 per week)!
PS: you can get a newer Chroma version in the HuggingFace repo (currently v41): https://huggingface.co/lodestones/Chroma/tree/main
@bugltd thank you for your effort, but the workflow is of no use to me because I don't know how to select the samplers and schedulers. I'll continue to do it manually; I've come a long way now. Thanks anyway.
I'm annoyed now. The right nodes are in ComfyUI, but how do I continue? In the preview, various fields are displayed in the node, but when I call up the node, I only have 3 fields for the sampler. How do I get more fields in the node? https://ibb.co/qvrjLQ5
@nanunana : the workflow I provided is based on the ComfyLab-Pack extension, so I won't be able to provide much help about a different extension, you should probably contact its author instead if you absolutely want to use it.
But, as I see it, the node you use is designed to have max 3 samplers indeed, it was built like this.
As I stated above, if you use the ComfyLab-Pack extension: "you can replace the 2 "List: from Multiline" nodes with the "List: Samplers" and "List: Schedulers" respectively, that allow you to just select (the ones you want)".
These 2 nodes allow an unlimited number of samplers / schedulers, as I used a quite uncommon trick in my code (haven't seen other extensions do it tbh).
@bugltd So, I've worked through your workflow. It's still very unclear to me, but it works (the test has probably been running for 3 days now). Thank you very much for that.
Thank you for your work! The model is awesome! Keep going! 👀
Can anyone tell me what is the difference between the "normal" version and the "detail calibrated" one?
This looks very promising. I am hoping it works well with Forge/a1111, and with Flux LoRas without changing the LoRa appearance.
No Flux Loras
Forge finally received a commit for Chroma support last week (we had to use an external patch until then), works well enough
@nanunana Yes Flux loras, but needs a fix, even then a couple don't work at all
https://github.com/croquelois/forgeChroma/issues/4
Regardless of UI, Flux LoRAs might not work with Chroma: some do, others do not. Try the ones you have and see; there is no list of working ones as this is all a work in progress... We are at v40 out of 50, and the last few iterations are said to take longer than the current 4 days between releases, so we might be there in two months' time. Then people can make lists and tutorials, as things are not likely to change much for a while. At the moment each new release makes some things better, others could break, and workflows might need to be changed... have patience...
It doesn't really need many LoRas. I'm sure people will train some character and style ones after it's complete.
@Kaleidia thank you for the info. I keep downloading the most current versions from HF and am successfully using them with Forge. The character LoRAs that I have created with FluxGym don't seem to have any effect. I am not savvy enough to try more complicated methods of creating LoRAs. (I did edit the models.yaml file in FluxGym to add chroma-v41 to see if it would work, but was unsuccessful.) But I do love the progress with Chroma! It's all I am currently using now. I will try to upload some images to Civitai soon.
Even though it hasn't gone into high-resolution training yet, it's been usable for normal image generation since v39. Hands and feet still need work, but what's impressive is that when generating photographic styles, it often gets very professional lighting, which is hard to get from other models.
I've noticed that too. I'm a photographer myself, and with about a quarter of the pictures I think: oh, nice lighting, I would have done the same.
I posted a picture...
@suede2031691 : afaik, hires training has already begun, and probably earlier than expected (I read it was planned for v48-v50).
Hence the "detail-calibrated" versions: https://huggingface.co/lodestones/Chroma/discussions/48
If you take a look at the Live AIM Training Logs, you'll see that a "large" task appeared a few weeks ago, in addition to the "base" and "fast" ones; I guess it's linked to that: https://training.lodestone-rock.com/
@suede2031691 , just in case here are a few humble findings (but you may already know them):
You probably know the right camera lens / settings better than me. As I know basically nothing about photography, apart from some lenses (Bolex H16, Lumix GH3, ..), I use a different approach to let the model decide and surprise me:
1. Adding "The atmosphere should be..." (romantic, vivid, dreamy, tender, happy, ...): at the end can greatly improve things. It will not only impact lighting / camera, but also composition and possibly the subject(s)'s expression and pose. You can use "mood" instead of "atmosphere", seems equivalent.
2. In addition, I sometimes write after that: " Select camera and lighting settings to emphasize this atmosphere". Not a 100% guarantee, but works pretty well with some of my test prompts, and gives a good variety of renderings.
3. For photos with actions, "This photo is dynamic" can sometimes improve the immersion. But I avoid the keyword "dynamic", as I found it tends to bring more illustrations than photos.
He plans to train the last few epochs at a higher res to get smaller details like hands and feet down, though apparently each epoch at high res will take about a week.
Any guide to use with diffusers? I am struggling with scheduler setting and cfg..
Euler beta and cfg4 work like a charm for me.
I've been looking for an XY workflow this whole time to test all the combinations, but I can't seem to find one. I have nodes for it but don't know how to put them together...
I had the best results with Euler, 26 steps and cfg 4 for the "normal" model. res_multistep is also good. And I use the Sigmoid Offset Scheduler custom node in ComfyUI for Chroma. For the few-step model I use euler, 12 steps with cfg 2-3.
I get my best results with Euler/Beta CFG 2.0 16-20 steps (acceptable at 12) and the Chroma2schnell LoRA
I'm currently doing tests with samplers and schedulers, which I can do thanks to @bugltd. What I notice above all is that you can create a lot of fundamentally different images with one seed. I'm currently trying to make a list of which combinations create similar images, so that you have alternatives for a given image type. If there is interest, I can post it here. But that will take some time, because the whole thing can change completely depending on the cfg and steps.
That would be nice, yes.
First small findings: the karras and exponential schedulers don't work with any sampler. The res family, as @Kaleidia already said, is very good but doubles the creation time.
@nanunana : I imagine you already tested sigmoid_offset that @Kaleidia recommended. In case you want to explore even more schedulers, you can also test the AYS series (Align Your Steps). I get mixed results, but sometimes they bring a nice level of details / lighting / ...
To get them, just install this extension: https://github.com/pamparamm/ComfyUI-ppm
No need to change your workflow, they will be added to the scheduler list. Also provides beta_1_1, which can be interesting maybe.
I also have beta_57 in my list, but I can't recall how I got it
@bugltd beta57 is installed with ClownShark's nodes, together with bong_tangent. Both are very nice.
