
Hey everyone,
A while back, I posted about Chroma, my work-in-progress, open-source foundational model. I got a ton of great feedback, and I'm excited to announce that the base model training is finally complete, and the whole family of models is now ready for you to use!
A quick refresher on the promise here: these are true base models.
I haven't done any aesthetic tuning or used post-training stuff like DPO. They are raw, powerful, and designed to be the perfect, neutral starting point for you to fine-tune. We did the heavy lifting so you don't have to.
And by heavy lifting, I mean about 105,000 H100 hours of compute. All that GPU time went into packing these models with a massive data distribution, which should make fine-tuning on top of them a breeze.
As promised, everything is fully Apache 2.0 licensed—no gatekeeping.
TL;DR:
Release branch:
Chroma1-Base: This is the core 512x512 model. It's a solid, all-around foundation for pretty much any creative project. You might want to use this one if you're planning a longer fine-tune, training at high resolution only during the final epochs so it converges faster.
Chroma1-HD: This is the high-res fine-tune of the Chroma1-Base at a 1024x1024 resolution. If you're looking to do a quick fine-tune or LoRA for high-res, this is your starting point.
Research branch:
Chroma1-Flash: A fine-tuned version of the Chroma1-Base I made to find the best way to make these flow matching models faster. This is technically an experimental result to figure out how to train a fast model without utilizing any GAN-based training. The delta weights can be applied to any Chroma version to make it faster (just make sure to adjust the strength).
Chroma1-Radiance [WIP]: A radically retuned version of Chroma1-Base that operates directly in pixel space, so it technically should not suffer from VAE compression artifacts.
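For the Flash delta weights mentioned above, the merge itself is just a scaled add onto a base checkpoint. A minimal sketch of that idea (the state dicts and key names here are stand-ins, not the real checkpoint layout; actual weights would be loaded with safetensors or torch):

```python
# Sketch of applying delta weights with an adjustable strength,
# as described for Chroma1-Flash: merged = base + strength * delta.
# The dicts below are toy stand-ins for real state dicts.
def apply_delta(base, delta, strength=1.0):
    """Return base + strength * delta for every key present in both."""
    merged = dict(base)
    for key, d in delta.items():
        if key in merged:
            merged[key] = merged[key] + strength * d
    return merged

# Hypothetical tensors, represented as plain floats for illustration.
base = {"blocks.0.weight": 1.0, "blocks.1.weight": -0.5}
delta = {"blocks.0.weight": 0.2, "blocks.1.weight": 0.4}

# "Adjust the strength" from the note above maps to the multiplier here.
merged = apply_delta(base, delta, strength=0.5)
```

With strength 0.5 the first weight becomes 1.0 + 0.5 * 0.2 = 1.1; lowering the strength moves the merge back toward the base model.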
Quantization options
Alternative option: FP8 Scaled Quant (Format used by ComfyUI with possible inference speed increase)
Alternative option: GGUF Quantized (You will need to install ComfyUI-GGUF custom node)
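For context on what an "FP8 scaled" quant does: each tensor is divided by a per-tensor scale so its values fit within FP8 e4m3's dynamic range (max magnitude around 448), and the scale is multiplied back at inference time. A toy sketch of that idea, not ComfyUI's actual implementation (which also rounds to the representable e4m3 values):

```python
# Rough illustration of per-tensor scaling behind "FP8 scaled" quants.
# Real FP8 storage also rounds each value to the nearest representable
# e4m3 number; this sketch only models the scaling and clamping.
E4M3_MAX = 448.0  # largest finite magnitude in float8_e4m3fn

def compute_scale(values):
    """Per-tensor scale so the largest magnitude maps to E4M3_MAX."""
    amax = max(abs(v) for v in values)
    return amax / E4M3_MAX if amax > 0 else 1.0

def quantize(values, scale):
    # Divide by the scale and clamp into the FP8 range.
    return [max(-E4M3_MAX, min(E4M3_MAX, v / scale)) for v in values]

def dequantize(qvalues, scale):
    # Multiply the scale back at inference time.
    return [q * scale for q in qvalues]

weights = [0.03, -1.7, 896.0, 0.0]   # toy "tensor"
scale = compute_scale(weights)        # 896 / 448 = 2.0
restored = dequantize(quantize(weights, scale), scale)
```

Without the scale, the 896.0 outlier would be clipped to 448 and lost; with it, all values round-trip (up to FP8 rounding, which this sketch omits).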
Special Thanks
A massive thank you to the supporters who make this project possible.
Anonymous donor whose incredible generosity funded the pretraining run and data collections. Your support has been transformative for open-source AI.
Fictional.ai for their fantastic support and for helping push the boundaries of open-source AI.
Support this project!
https://ko-fi.com/lodestonerock/
BTC address: bc1qahn97gm03csxeqs7f4avdwecahdj4mcp9dytnj
ETH address: 0x679C0C419E949d8f3515a255cE675A1c4D92A3d7
my discord: discord.gg/SQVcWVbqKx
Comments (78)
This may be one of the best models we've come across, but for goodness' sake, insert some pictures of how it handles various things, along with prompts. I'd like to see how it does landscapes, whether it knows real people, and what styles it covers: it does anime, but does it recognize photo settings and lighting? Is it NSFW? Brag about what it does!
As it is based on Flux, does it support Flux controlnet and lora trained on Flux?
The LoRAs work, but the results tend to be a bit different. ControlNets don't work for the time being.
@A_Green_Orange Thanks for the reply! I guess it's like using an SDXL/Pony LoRA on Illustrious, that kind of thing.
@LovelaceA It feels that way, yes.
Wow! Thank you for this model! It's great, I really appreciate it.
So, what does it do that Flux cannot? Any side-by-side comparisons?
Thank you
It follows instructions way better than Flux. In my experience, it is the best model at following instructions.
It fixes the Flux chin / plastic Flux look.
It is smaller than Flux, so it can run on more hardware.
It has some knowledge of NSFW stuff (not perfect, but way better than Flux).
It runs slower than Flux, since you have to use a CFG greater than 1.
Once community tooling starts to build around it, it might become the next SDXL.
Is there a good workflow for this model? The available ones on Civitai lack the ChromaDiffusionLoader node, even after installing FluxMod.
ComfyUI recently added native support for Chroma; you should be able to simply use Load Diffusion Model (or other Comfy core nodes) and a Load CLIP node with the type set to "chroma".
Edit: you should be able to copy the workflow from this image: https://civitai.com/images/73901839 (requires the latest ComfyUI; there is a PC Schedule Prompt node in it which you can ignore or just delete).
@DraconicDragon Thanks!
I just learned about this model, and the handful of images I have generated have been impressive in terms of comprehension compared to sdxl/flux/others.
This model IS really what we've all wanted for quite a long time. Simply fabulous.
I can't do Inpaint for this model and VAE. I get the error: "VAEDecode Given groups=1, weight of size [512, 16, 3, 3], expected input[1, 4, 128, 90] to have 16 channels, but got 4 channels instead"
Well done, I think you did it! Finally a Flux model that can do all kinds of stuff. It's not perfect yet, but very promising. If you keep working on it, I'll probably switch from realistic Illustrious/Pony to this. If you can keep adding anime/Danbooru concepts while keeping the realistic style when requested, it will become the definitive model.
What sampler, scheduler and steps do you recommend for your model?
Deis + the OptimalStepsScheduler comfyui node or kl_optimal if you use a ksampler.
From the 3-4 hours I played with it, I got the feeling that the model converges somewhere between 25 and 35 steps.
I have a very basic workflow on the last image I posted if you wanna steal it :)
@A_Green_Orange Thanks! That workflow is nice.
I'm experimenting around, meanwhile with V29.
Sampler: dpmpp_2m
Scheduler: Beta
Steps: 27-30
60+ steps are even better, but sometimes produce a different picture.
CFG 7.5 works well.
Chroma gets bad artifacts / "Flux lines" if you use a really long prompt. Does anyone know why this happens or how to stop it? I tested it in this post: https://civitai.com/posts/16428655
I will qualify this by saying it's a bad suggestion, but if you're using Comfy, you could "simply" add another sampler after the Flux sampler using a different diffusion model, apply noise, and have it quickly remove it.
HOWEVER, if you look at this thread:
https://github.com/lllyasviel/stable-diffusion-webui-forge/issues/1712
They seem to believe this can be exacerbated by a bad setting in LoRA training that becomes even more obvious at large resolutions. There was a suggestion that skipping certain LoRA blocks could save you when applying already-broken LoRAs, but if you're seeing it baked into the model, it's likely someone already baked in a broken LoRA.
I have not seen the lines in my Chroma tests. However, if Chroma was tuned on Flux-generated images at any point then, depending on your parameters, you could perhaps stumble into this bad data.
It's based off Flux. Flux has striping issues whenever you significantly exceed the resolution it was trained on: ~1400 in either direction for base Flux. I don't know what this was trained on, but the 1504 resolution in your workflow is too high. Try dropping it back to 1344 max in either direction. Upscale with Ultimate SD Upscaler using tiles within that max resolution and you'll generally do well. I tried the prompts from your rabbit images at 1344 max and they came out fine with no striping for me. I've also noticed other Flux LoRAs that were trained only at 512 because of limited GPU availability, where even rendering at 1024 would give striping.
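A small helper (the function name is my own) for the cap described above: scale a target resolution down so neither side exceeds a maximum, keeping the aspect ratio and snapping each side to a multiple of 16, which latent models generally expect:

```python
# Clamp a requested resolution to a per-side maximum (1344 suggested
# above for Flux-based models), preserving aspect ratio and snapping
# each dimension down to a multiple of 16.
def clamp_resolution(width, height, max_side=1344, multiple=16):
    scale = min(1.0, max_side / max(width, height))
    w = round(width * scale) // multiple * multiple
    h = round(height * scale) // multiple * multiple
    return w, h

clamp_resolution(1504, 1504)  # -> (1344, 1344)
clamp_resolution(1024, 768)   # unchanged: already within the cap
```

A 1504x1504 request would be pulled back to 1344x1344, while anything already under the cap passes through untouched.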
@floopers966 But the issue doesn't happen when the text prompt is short and the resolution is high; that's what interested me. I'll have to test whether a long prompt at a lower resolution has the same effect. Flux is usually fine up to 2 million pixels; I have generated thousands of images at that resolution and don't get the lines unless I'm using a badly trained LoRA.
@J1B I have noticed something similar and will investigate further. Perhaps you can ask in the lodestones_rock's server on discord
I'm having this trouble too after training a LoRA. When I apply the LoRA I get a lot of striping
@RedDeltas That is a known issue with LoRAs and Flux models. Did you train at 512x512 resolution by any chance? That's my theory on what sometimes causes it, although ChatGPT thought it was caused by overtraining certain blocks.
Skipping blocks 1-3 (or more) can help: https://www.reddit.com/r/StableDiffusion/comments/1g7b76e/a_possible_fix_for_the_gridlines_effect_when/
But it might be a bit different with Chroma.
@RedDeltas Do you have the T5 setting applied from the github thread?
@J1B Interesting, I trained at 1024x1024. I'll experiment with the block skipping - thank you ☺️
@makiaevelio543 Thanks, I'll take a look
@makiaevelio543 I am using t5xxl_fp8 with type Stable_Diffusion. I will try updating ComfyUI to see if there is a new Chroma text type, but people are saying that doesn't work: https://github.com/lodestone-rock/ComfyUI_FluxMod/issues/34
Congrats! Keep going and you will have unseated the other models. I've messed around with everything from 1.5 to Illustrious; this is the best model I've ever seen.
Q8 GGUF version here is 6GB smaller:
https://huggingface.co/silveroxides/Chroma-GGUF/tree/main/chroma-unlocked-v27
Links to the "official" quants:
https://huggingface.co/silveroxides/Chroma-GGUF
https://huggingface.co/Clybius/Chroma-GGUF
If they are outdated, ping me on our Discord server.
Are these going to give the same quality, say if I use the Q8 GGUF?
Hi! This thing looks awesome! But does Forge support it? Thanks in advance!
This works - https://github.com/croquelois/forgeChroma
@tapczan that's awesome!
They also have a very important issue open there that unblocks LoRA support:
https://github.com/croquelois/forgeChroma/issues/4
And even Flux LoRAs will work.
Chroma with LORA is absolutely fantastic
very nice model, just needs a little help with hands and faces, controlnet will be great with this i think
It's a promising model for sure; however, there are still a lot of anatomical issues. Prompting for 20-year-old women generates anatomy that looks 2 to 3 times that age, with giant elongated labia; skin tone doesn't match the rest of the body; hands and feet are usually distorted; and body weight differs between legs and torso.
OK, so the problem is it's too slow. Any speed-up solutions?
@totes Any recommended settings for those Loras?
e.g. I tested by trial and error:
https://huggingface.co/silveroxides/Chroma-LoRA-Experiments/blob/main/Hyper-Chroma-low-step-LoRA.safetensors
Right now weight 1, steps ~ 12 seems to do the trick.
But it's kinda hard, as results vary with the use of negatives and prompt length.
Training has changed a bit with v29, so it takes fewer steps to get coherent results.
@totes Sorry, I've been using A1111 the last few years and am just now switching to Comfy; where would I load any of these LoRAs in this workflow?
You can use version 29.5 (check the Hugging Face link).
It should be able to generate coherent images at lower step counts now.
@welcometoea You'll want to separate the diffusion node and connect it behind a Load LoRA node, then connect the front of the Load LoRA node to the diffusion node, and do the same for the CLIP. I suggest downloading rgthree-comfy (https://github.com/rgthree/rgthree-comfy) and using the Power Lora Loader, but one of the standard loaders should work fine. As @Lodestone said, though, 29.5 uses fewer steps, so one of the fast LoRAs isn't necessary, but there are a few to tinker with. Also, sorry for the delay.
Hi Lodestone
Great model! Using the new V29: what's the deal with LoRA models on this? It's a Flux model from what I can see, but my Flux LoRAs look nothing like they're supposed to. Is there something I'm missing, or do we need a different type of LoRA model? Thanks.
As far as I understand, this model is "in development", meaning it's not set in stone yet. Even if some LoRAs work a little, until they are trained on the final model it's of limited use: one working on v29 might stop working on v30, and so on. Anyone correct me if I'm wrong.
It is not a Flux model. It is its own model whose design is BASED on the Schnell design, but it appears @Lodestone has their own dataset, and I'm guessing they made other changes. That's my understanding, but I'm not positive; hopefully @Lodestone can jump in here and let us know.
The fact that one person can manage to do this is amazing to me, and I want to learn to do it too. Crowdfunded AI model training is such a cool thing, and that goes double because the model is fucking good. Let us know, @Lodestone!!!
the model is heavily modified
it's initialized from Flux Schnell, but that's about it
this model is basically not a Flux model anymore
though it still maintains partial compatibility with existing Flux LoRAs
Excellent model, I'm afraid to imagine what will be in the final version :)
With these settings my generation speed doubled:
### ComfyUI = 0.3.33
- pytorch version: 2.7.0+cu128
- use-flash-attention
### Load Diffusion Model:
- unet_name = chroma-unlocked-v29_float8_e4m3fn_scaled_stochastic.safetensors
- weight_dtype = fp8_e4m3fn_fast
### Load CLIP:
- clip_name = t5xxl_fp8_e4m3fn_scaled.safetensors
- type = chroma
- device = cpu
Add a Bitcoin address, and a Lightning address too for small instant donations! WalletOfSatoshi is good for an easy start, but there are other custodial and non-custodial wallets as well.
Awesome, truly uncensored flux finally and next gen stuff
This model seems like it's getting worse, actually. Anatomy like SDXL: bad hands, bad feet. It's not at Flux level. Yet.
The prompt comprehension is excellent though.
Chroma is still in the "pretraining" phase: the learning rate is kept high to learn new concepts, at the expense of details. Once it has learned all the concepts, lodestone will lower the learning rate for the fine-tuning phase, which should improve the details, IIUC.
@n319k Where did you find this information? I want to learn all about this.
@staythepath Check out the Hugging Face repo:
https://huggingface.co/lodestones/Chroma
The new parts are uploaded as they finish! V29 is working well.
it's still in the "pretraining" phase
right now it's training at 512 res, so the detail won't be that good
the detail fine-tuning is reserved for the end of the epochs (around v48-50)
the reason is that training at such high resolution is really costly (up to 4x more compute), so it's not really worth doing high-res training on every epoch on a tight compute budget.
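The "up to 4x" figure lines up with simple pixel counts; a rough illustration (real cost can be even higher, since attention scales worse than linearly with token count):

```python
# 1024x1024 images contain 4x the pixels of 512x512 ones, so each
# training step processes roughly 4x the data. Treat this as a
# lower-bound illustration: attention layers scale worse than
# linearly in the number of tokens.
def pixel_ratio(hi_res, lo_res):
    return (hi_res ** 2) / (lo_res ** 2)

ratio = pixel_ratio(1024, 512)  # -> 4.0
```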
Hi! I'm absolutely in love with this model. Exactly what I needed for a long time! Do you plan to publish new versions on CivitAI?
Can i use it on GTX 1070?
Hoping to buzz this one as soon as possible so anyone can use it. 4 more days to wait.
Yes, V27 Q4KS, V29.5 Q5.1, and V30 Q5.0 are working for me in Forge with "ForgeChroma" mod. It's slow but it gets there.
With GGUF you should be able to.
@nunyabizness1 How long does it take to generate an image?
@A_Green_Orange which version do you prefer?
Looking forward to v30's test pass and early release.
Just in case: it was published a few hours ago on Hugging Face.
And from my early tests, it seems to bring nice improvements, even though v29 was already pretty good.
@OliviaRossi Using that myself. From what I gather, starting small at 512x512 ratios and upscaling with a one-pass upscaler is a speed demon at roughly Flux Dev level. At the very least, better than bare-bones Schnell.
Okay. Now the question becomes how to make the FP8 model with the scaling factor used by ComfyUI.
Anyone managed to get Teacache working with this model?
It has been requested but not implemented yet; see:
https://github.com/welltop-cn/ComfyUI-TeaCache/issues/110
