
Hey everyone,
A while back, I posted about Chroma, my work-in-progress, open-source foundational model. I got a ton of great feedback, and I'm excited to announce that the base model training is finally complete, and the whole family of models is now ready for you to use!
A quick refresher on the promise here: these are true base models.
I haven't done any aesthetic tuning or used post-training stuff like DPO. They are raw, powerful, and designed to be the perfect, neutral starting point for you to fine-tune. We did the heavy lifting so you don't have to.
And by heavy lifting, I mean about 105,000 H100 hours of compute. All that GPU time went into packing these models with a massive data distribution, which should make fine-tuning on top of them a breeze.
As promised, everything is fully Apache 2.0 licensed—no gatekeeping.
TL;DR:
Release branch:
Chroma1-Base: This is the core 512x512 model. It's a solid, all-around foundation for pretty much any creative project. You might want to use this one if you’re planning to fine-tune it for longer and then only train high res at the end of the epochs to make it converge faster.
Chroma1-HD: This is the high-res fine-tune of the Chroma1-Base at a 1024x1024 resolution. If you're looking to do a quick fine-tune or LoRA for high-res, this is your starting point.
Research Branch:
Chroma1-Flash: A fine-tuned version of the Chroma1-Base I made to find the best way to make these flow matching models faster. This is technically an experimental result to figure out how to train a fast model without utilizing any GAN-based training. The delta weights can be applied to any Chroma version to make it faster (just make sure to adjust the strength).
Chroma1-Radiance [WIP]: A radically tuned version of Chroma1-Base in which the model now operates in pixel space, so in principle it should not suffer from VAE compression artifacts.
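For readers wondering what "applying delta weights with a strength" could look like, here is a minimal sketch. It assumes the delta is simply a per-tensor difference that gets added to a base checkpoint after being scaled by a strength factor; that reading, the function name apply_delta, and the toy tensors (plain Python lists) are illustrative assumptions, not the actual Chroma1-Flash tooling.

```python
def apply_delta(base, delta, strength):
    """Return base + strength * delta for every tensor named in delta.

    base and delta map tensor names to flat lists of floats (toy stand-ins
    for real checkpoint tensors). Tensors missing from delta are copied as-is.
    """
    merged = {}
    for name, weights in base.items():
        if name in delta:
            merged[name] = [w + strength * d for w, d in zip(weights, delta[name])]
        else:
            merged[name] = list(weights)
    return merged


# Toy example: strength 0.5 applies half of the delta.
base = {"layer.a": [1.0, 2.0], "layer.b": [0.5]}
delta = {"layer.a": [0.5, -1.0]}
print(apply_delta(base, delta, 0.5))
# {'layer.a': [1.25, 1.5], 'layer.b': [0.5]}
```

A strength of 0 leaves the base untouched and 1 applies the full delta, which is why the strength usually needs adjusting per base version.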
Quantization options
Alternative option: FP8 Scaled Quant (Format used by ComfyUI with possible inference speed increase)
Alternative option: GGUF Quantized (You will need to install ComfyUI-GGUF custom node)
Special Thanks
A massive thank you to the supporters who make this project possible.
Anonymous donor whose incredible generosity funded the pretraining run and data collections. Your support has been transformative for open-source AI.
Fictional.ai for their fantastic support and for helping push the boundaries of open-source AI.
Support this project!
https://ko-fi.com/lodestonerock/
BTC address: bc1qahn97gm03csxeqs7f4avdwecahdj4mcp9dytnj
ETH address: 0x679C0C419E949d8f3515a255cE675A1c4D92A3d7
my discord: discord.gg/SQVcWVbqKx
Comments (130)
So what's new in v30?
Incoherence? My workflow that's been working great with 20 through 29, is giving messed up features and hands with 30. Edit: maybe I was too harsh. Did some same seed tests with 29 and it didn't do so well with those aspects on those seeds either.
They are doing 50 epochs of their dataset. Each version is one epoch.
They changed the batch size used during training between v27 and v30, which should result in fewer steps being needed.
The model also had more training time, thus should be better overall.
@A_Green_Orange yeah I switched to the lower 30 steps down from 50 and am getting decent results. Was working on guy riding a large animal pics and a lot of them even with their official new workflow were having the guy's head sticking out of the animal's torso.
@floopers966 Still using 27, waiting for the epoch that is actually a step forward...
Guys, try using the dpm_2_ancestral sampler. I'm getting good results at 40 steps. I went into more detail in a different comment, but I think ancestral samplers are the way to go with this model in particular.
@CrownVic07 thank you, but what about the scheduler?
@alternative_Universe beta seems to work the best for me so far
@CrownVic07 any tips for the deformities?
@CrownVic07 way too slow
@nikolatesla20145 Try using the OptimalStepsScheduler node to speed things up. I was able to get good results in under 1 minute with dpm_2_ancestral at 20 steps and about 2 minutes total if I wanted to upscale further. I posted some of my images with their workflows, but they're stuck in moderation limbo. I didn't realize how seriously they're taking the nsfw censorship stuff.
I still haven't got a controlnet to work with Chroma. Any ideas?
Well, Chroma is probably too architecturally different to be compatible with anything that exists currently. You will need to wait for someone to make ControlNets specifically for it, which is a ways off.
They need to be trained for chroma.
I got it to work with the shakkerlabs controlnets. You need to use the dualclip loader and also load clip L.
Great work! Hope to see a Hi-Dream Chroma
any reason why you prefer hi-dream as a base compared to flux?
hi dream is huge and the improvement seems marginal from flux in my opinion
@Lodestone I really like the aesthetics and prompt following in comparison to flux. Thankfully, I have hardware to facilitate the dev and fast Hi-Dream models
@Lodestone I prefer the aesthetics and prompt following of hidream in comparison to flux. Thankfully, the size of the model is not an issue for my use case :)
@foggyghost0 model size is a big problem with HiDream; you will likely not see any training on that.
People who spam pictures like KKAANN does here, of that type are exactly why CivitAI is where CivitAI is. Thanks KKAN, you and your ilk are to thank for everything everyone stresses about.
Bro chill
I like Chroma, albeit a bit slow on the 5060-Ti, even with a stable OC. What is a good setting that's a fine balance between "quality" and speed in sampler/scheduler settings? CFG/Steps if also available as a suggestion?
As of v29.5 it should be possible to use 20-30 steps for decent results. I used to have to go for 50-60 earlier.
Currently the training is more about concepts than high resolution detail, so at times 512x512 yields a more coherent image.
I usually fall back to DEIS/DDIM, but sometimes Euler with different schedulers yields interesting results, Normal especially. CFG seems a bit scared of changes, 4-5 is OK, falls apart under 3.5. Maybe it's only in my Forge setup, but the CFG range for good quality is tiny.
What kind of speeds are you getting on an RTX 5060 Ti, and are you using things like TeaCache and SageAttention?
Awesome work, dude!
As a PSA to everyone, I've found the dpm_2_ancestral sampler (paired with beta scheduler) does wonders for the latest version. I have a theory that ancestral samplers are optimal for this model since they never truly converge. A sampler that converges might accidentally converge on a mistake. Never converging gives the model infinite opportunities to fix its mistakes. This is especially important for a model that utilizes cfg and negative prompting. The whole point of a negative prompt is that you are telling it what mistakes to watch out for so it can fix them.
I've also found that this model has an easier time with image composition at lower resolutions. So my initial gen will be at something like 1024x1024 and 40 steps. Then I'll upscale to 1440x1440 and give it another 20 steps at 0.45 or 0.55 denoise.
@CrownVic07 Yeah it took me a while to settle on a new workflow for v30 but it does seem to be working out now with that sampler. Example: https://civitai.com/images/77352331
@floopers966 Yeah, man. It's actually getting more and more impressive the more I play with that sampler/scheduler combo. Try using standard Flux loras too. Surprisingly, they seem to work better with Chroma than they do with Flux. I suspect this is because the whole "Flux guidance scale" thing doesn't play nicely with loras. I'll post some of my gens in a minute so you can get an idea.
@floopers966 Once Chroma training scripts are implemented, then we'll really be cooking with fire.
dpm_2_ancestral sampler is two times slower because it does 4x inference per step compared to euler that does 2x, so there is no magic
@Randmeist If you compare it to dpm_2, which takes the same amount of time, the ancestral version is clearly superior.
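To put rough numbers on the speed argument above, here is a small sketch that counts model forward passes. It assumes Euler calls the model once per step, dpm_2-style samplers call it twice, and CFG doubles every call (one conditioned and one unconditioned pass). The function name model_evaluations is hypothetical, and real implementations may use a cheaper final step, so treat the totals as approximate.

```python
def model_evaluations(steps, calls_per_step, uses_cfg=True):
    """Approximate number of model forward passes for one image.

    calls_per_step: 1 for Euler, 2 for second-order samplers such as
    dpm_2 / dpm_2_ancestral (an assumption stated in the text above).
    uses_cfg: CFG runs a conditioned and an unconditioned pass per call.
    """
    passes_per_call = 2 if uses_cfg else 1
    return steps * calls_per_step * passes_per_call


print(model_evaluations(20, 1))  # Euler with CFG: 40
print(model_evaluations(20, 2))  # dpm_2_ancestral with CFG: 80
```

This matches the "2x vs 4x per step" figures in the comment: dpm_2_ancestral at 20 steps costs about as much as Euler at 40.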
Recommend this with Flan T5 XXL TE too instead of the standard one!
@MeMakeStuff Could you post a download link?
what cfg scale do you use?
@OrangeJuiceAlien 4-4.5 has been working well for me. I'll post some of my gens with the workflow attached in a little bit.
It also takes even more forever to render.
Is there any way to get Chroma working in Forge webUI? Things like inpainting are more of a hassle in ComfyUI.
@tapczan and a reminder for users to go over this issue to enable proper LORA support:
https://github.com/croquelois/forgeChroma/issues/4
step-by-step instructions for the dummies would be useful. I'm looking at this and don't know what to do with it
"you may want to force Forge to be on the same version as mine: git checkout 0ced1d0cd000a536ebd21dc2c8e8636c9104568d
inside your forge root directory, apply the patch: git apply forge.patch (it will modify backend/loader.py, backend/condition.py and backend/text_processing/t5_engine.py)
add huggingface/Chroma directory inside backend/huggingface
add diffusion_engine/chroma.py file inside backend/diffusion_engine
add nn/chroma.py file inside backend/nn"
@Oppkllll If you installed Git to begin with, to run these commands, then you just need to open a command prompt in the directory where webui_user.bat is located.
Download the files and place forge.patch in that directory. Run the command. If all goes well you will not see any errors pop up, but there's no direct confirmation either.
Place the other files as instructed.
There may be updates for that patch in the future, so do not delete forge.patch. You will need to run 'git apply -R forge.patch', then replace the patch file with the new one and then 'git apply forge.patch' again. Otherwise I've seen it fail to update.
Hopefully this will all be integrated natively in forge in the near future.
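The update procedure described above (revert the old patch, swap in the new file, apply it again) could be scripted roughly as follows. This is only a sketch: the function name update_forge_patch, its parameters, and the idea of scripting the process are assumptions and not part of forgeChroma. Only the two git apply commands are taken from the comment.

```python
import shutil
import subprocess
from pathlib import Path


def update_forge_patch(forge_root, new_patch, run=subprocess.run):
    """Revert the old forge.patch, replace it, then apply the new one.

    forge_root: the Forge directory containing the currently applied forge.patch.
    new_patch:  path to the newly downloaded patch file.
    run:        injectable runner (defaults to subprocess.run) for dry runs.
    """
    forge_root = Path(forge_root)
    patch_file = forge_root / "forge.patch"

    # 1. Undo the currently applied patch while the old file is still in place.
    run(["git", "apply", "-R", "forge.patch"], cwd=forge_root, check=True)

    # 2. Replace the old patch file with the new one.
    shutil.copyfile(new_patch, patch_file)

    # 3. Apply the new patch.
    run(["git", "apply", "forge.patch"], cwd=forge_root, check=True)
```

With check=True, a failed git apply raises an error instead of passing silently, which helps given that a successful apply prints nothing.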
I'm unable to train on this in Kohya, is there something in particular I need to do?
When SDXL appeared I was disappointed. When Flux appeared I was indifferent. But when THIS model appeared I finally installed Comfy. And I hate Comfy. But I see that this model really has potential. Really quality UNCENSORED potential, if you know what I mean...
If you stop to think about it, it's even dangerous, but there's no denying how incredible it is. I can only imagine that in the future, when the next versions come out, it will be as incredible and SCARY creative as it already is.
hidden gem since sd 1.5 models
This better than Midjourney!
Poor quality, none of the examples even run
try dpm_2_ancestral sampler and beta scheduler.
@nsleptsov18954 Copied examples (like the cat eye one) and they have so much bizarre stuff in them that comfy gags on. No patience to figure that out. 2-3 different pointers to the model and odd *.pth stuff
Consider adding danbooru/e621 artist names to the prompt words in the training set to obtain more diverse styles? I tried this model and it's great, but it's too one-dimensional, and prompts for artist names or specific styles don't work for it (when generating anime images). Adding danbooru/e621 artist names can create more diverse content or even create entirely new styles, like NovelAI's anime model/Illustrious/NoobAI.
As far as I know, artist names are in the dataset, but the model hasn't learned them yet. How many epochs it'll take is anyone's guess.
I think something I've realized from transferring prompts is how redundant and often contradictory my past prompts would be. Often times going through an old prompt and simply cutting a lot of the style cruft has made a much better attempt.
There are quality tags in the model. You can use them with aesthetic #. It goes from aesthetic 11 to aesthetic 0. aesthetic 10-0 are based on e621 scores per month, with aesthetic 10 based on the highest in each month, and aesthetic 0 being the lowest. aesthetic 11 is a quality tag that was curated by the model maker so expect it to work a little differently.
But where did you see this? I can't find that in his examples.
@alternative_Universe from the model maker on discord
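As a small illustration of the quality tags described above, here is a sketch of a helper that prepends an "aesthetic N" tag to a prompt. The 0 to 11 range comes from the comment; the helper name with_aesthetic_tag and the choice to put the tag at the front (as in the "aesthetic 10, 85mm lens, ..." prompt elsewhere in this thread) are assumptions.

```python
def with_aesthetic_tag(prompt, score):
    """Prepend an 'aesthetic N' quality tag, where N is an integer from 0 to 11."""
    if not isinstance(score, int) or not 0 <= score <= 11:
        raise ValueError("aesthetic score must be an integer from 0 to 11")
    return f"aesthetic {score}, {prompt}"


print(with_aesthetic_tag("85mm lens, high quality photography", 10))
# aesthetic 10, 85mm lens, high quality photography
```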
Trying to use with ForgeUI and after following the directions, it does generate images but they are completely blurry.
Anyone know the fix?
Try using different samplers. Euler Simple/Beta at 20-25 steps should be good sanity checks. Keep distilled CFG disabled, and CFG above 2.5.
Unexpected blurriness with Flux Dev and Schnell was either due to an incorrect setting for "Diffusion in Low Bits" or a CFG/DistilledCFG that was too high. Maybe it's the same here, since it's based on Flux Schnell.
This model is gonna be real nice, when it matures, gets funny things like SPO adapted to it, someone makes use of the IterComp reward models to help it out, silly things like that.
Oh, and of course, having TDD ( https://github.com/RedAIGC/Target-Driven-Distillation ) trained for it, as it already kinda sorta works on the existing model.
That SVDQuant thingymabob also sounds nice.
but wait, do I really expect the age of vibe coders who can't do jack to accomplish any of this?
I guess optimism is a curse sometimes.
Note to people new to the model:
It is still in the earlier phase of training (512x512) and will eventually be scaled up to higher resolutions, piggybacking on the generalization learned from extensive low-resolution training.
Love this model, love the variety that is present in it. I am using the GGUF for this. Just one question: is there any way to speed it up a bit? Right now it takes about 90s per image. Can we use sage attention and teacache or something to speed it up? Has anyone tried? Or does flux turbo work here?
Try using a turbo/hyper/lowstep lora - or a combination of those - see:
https://huggingface.co/silveroxides/Chroma-LoRA-Experiments/tree/main
Or use fp8-scaled version:
https://huggingface.co/Clybius/Chroma-fp8-scaled
@EliteLensCraft I am using a 3090, so fp8 acceleration is not available (that is only for 4000+ cards). I will try the turbo LoRAs, thanks.
These are my findings: you can use hyper chroma 8 steps with a weight of 0.14 ONLY, and this will allow it to generate very fast.
This model right here is the future. The prompt comprehension and understanding of complex scene composition is already amazing.
I can’t wait for the fully trained model. Keep it up!
I've noticed an issue with realistic feral anthro that it tends to make the characters have overly large heads instead of human sized heads. If I add anthro or furry to the prompt it makes them cartoony looking instead of realistic.
https://civitai.com/images/78853742 this is pretty impressive
Not really a quirk specifically with Chroma (since it's also present in flux originally), but the model has difficulty with the backs of pointy ears (elf, goblin, etc). It always flips them forward because it doesn't seem to have enough training on pointy ears from behind. I just noticed that's still the case in v32.
Update: After testing it at 512x512, it can /kind of/ do ears correctly sometimes. Seems like it's 50/50 for if it gets flipped at that resolution. (StreamofStars mentioned the training was still at 512x512)
"it's 50/50 for if it gets flipped at that resolution"
I get this a lot, regardless of ears or not. Feet and knees do it a lot for me though. I'm hoping this is a training issue. It clearly knows both exist, it just needs the time to know which to choose. I have prompted some of it away, but my negatives can grow a bit too much, I feel.
Is it possible to run this model on SB Forge? I can't get anything to work.
"AssertionError: You do not have CLIP state dict!
You do not have CLIP state dict!"
First of all apply this patch and additional files to Forge to support Chroma:
https://github.com/croquelois/forgeChroma
Also go over their closed issues; there are some instructions to enable LORA support, otherwise Forge will just not load Flux LORA with Chroma.
Chroma doesn't need to load CLIP at all, only t5xxl (take a look at flan too https://huggingface.co/easygoing0114/flan-t5-xxl-fused) and the flux VAE
@ailu91 thanks for the reply! So it's not going to work out. It's all too complicated for me.
I'm in the same boat. I'm willing to learn something new - could someone point me at specific directions for installation? I'm not sure where to start on Google.
@charlieaudino @ilya808rolf994 all this procedure requires is to install Git, in order to run the patch command, and to move some files manually.
If you installed Forge through "git clone"- you already have it and just need to follow the instructions.
If you installed the Forge zip package, install git https://git-scm.com/downloads/win.
-Open a command prompt window in the folder that contains webui-user.bat (you can write cmd in the address bar and it will open set up for that same folder properly), and follow step 1 from forgeChroma (skip step 0, it's not truly required, maybe only in some niche cases. Updating Forge is good enough).
If the patch gets applied you will most likely just see an empty line, this is usually good. If you get some errors- not good. There's just no direct confirmation when the patch is applied successfully, so don't be surprised.
-Steps 2-4 require manually placing some files. Downloading the files from forgeChroma can be done manually from their git web page, or by pointing command prompt to some other folder, and running the command "git clone https://github.com/croquelois/forgeChroma" which will clone all the files from there to your PC (then you can move them as instructed).
Using Chroma- Set Forge to Flux, or All mode
-Use the regular CFG at about 4-6, not distilled CFG (set this to 0).
-load t5xxl and the Flux VAE (no need for clip L or G), as described here:
https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1050
Also refer to instructions on Huggingface https://huggingface.co/lodestones/Chroma
Low resolution is recommended initially, it's not doing too well above 768 but hires fix helps. Use Euler Simple as a start, but other samplers also work (DEIS-DDIM for example)
godspeed
@ailu91 well, strictly speaking, you don't need to install git: you can just manually change the required lines in the .py files by copying them from forge.patch. That's how I managed to get it working.
@aleksds1 while definitely possible, it gets tedious after a while. That patch has gone through several versions. As do many other releases on Git. Git handles the changes automatically, you'll get real tired real fast if you were to change bits of code manually for every update, a very big waste of time.
@ailu91 thank you for the time taken to make this post. I tried to install patch and get "error: can't open patch 'forge.patch': No such file or directory"
@charlieaudino I solved all the problems here - you can download everything here https://github.com/maybleMyers/chromaforge
Chroma Tips.
(Last edit: D28/M07/Y25, epoch v47).
Realism prompt 1:
A close up photograph taken on a Kodak Portra 400 55mm analog camera. Captured by a Sony camera, 85mm lens, 50mm lens. high quality photography. candid amateur photography. film photography, film. realistic. clear details. film grain. cosplay.
Realism prompt 2:
Captured by a Sony camera, 85mm lens, 50mm lens, high quality photography, candid amateur photography, film photography, film, realistic, clear details, cosplay.
Realism prompt 3:
A professional RAW and ultra detailed DSLR photograph, captured by a Sony camera, 85mm lens, 50mm lens, high quality photography, candid amateur photography, film photography, film, realistic, clear details, cosplay, IMG_20170314.DCIM.
Realism prompt 4:
Captured by a Sony Alpha a7S, Leica M5, Canon EOS R5, Kodak Portra 400, 85mm lens, 50mm lens, high quality photography, candid amateur photography, film photography, film, realistic, clear details, cosplay.
Realism prompt 5:
A real life candid snapshot, taken with a iPhone camera, amateur photo, raw image, captured by a Sony Alpha a7S, Leica M5, Canon EOS R5, Kodak Portra 400, 85mm lens, 50mm lens, high quality photography, candid amateur photography, film photography, film, realistic, photorealistic, clear details, cosplay.
Realism prompt 6:
The overall photographic style is sharp and clear, rugged and ultra detailed, emphasising meticulous detail in the hair and skin, clothing textures, and facial contours, with professional lighting accentuating her form. Captured by a Sony camera, 85mm lens, 50mm lens, high quality photography, candid amateur photography, film photography, film, realistic, clear details, cosplay, IMG_28736532.DCIM. A professional provocative raw (unedited) and ultra detailed UHD24 DSLR photograph captures.
2.5d prompt:
aesthetic 10, 85mm lens, 50mm lens, high quality photography, realistic, cosplay.
Realism negative prompt:
sketch, drawing, illustration, painting, art, cartoon, anime, 2d, 2.5d, unreal engine, render, CGI, fake, low resolution, low quality, low detail, pixelated, image noise, bokeh, blur, blurry, blurry background.
Using a higher CFG results in better anatomy, coherency, prompt adherence, background detail and a lot less failed generations. Experiment with a CFG from 4 to 6.
Some tags / language can heavily bias the model towards digital art, so be careful.
Note that what works well with current epochs may not work well with future epochs.
When you say current epoch, are you talking about v0.32?
@alternative_Universe Yes
On tags: I'd recommend only using a few of the specified tags, and trying to work them into real sentences.
On sampling and scheduling:
A nonexhaustive list of the combinations that sample fastest and most realistically compared to other samplers. Other combinations can as much as double sampling time. Used at CFG 3-5 at 25 steps.
Listed from approximately more real to more creative:
- res-multistep sampling + beta scheduler (more realistic)
- heun sampling + beta scheduler
- uni-pc + simple/beta scheduling
- euler + normal/simple/beta scheduling
- ddim + ddim-uniform (more creative)
I used the combination of euler + simple for most of my beginning prompts, which were good, but often overly creative or artistic. They required more prompting to get around the issue. The other combinations have been less of a hassle, but can take longer if the wrong scheduler is used. Don't use the "SDE" labeled schedulers, they'll just be noise(y). The CFG or ancestral variants could be tried, and may be more stable under higher CFG and steps, respectively.
This model is neat.
@makiaevelio543 I would definitely try and move away from using so many negatives. They actually do more harm than good depending on the desired output and completely destroy certain styles that the model is otherwise very capable of. They can also hurt prompt adherence. I did try running some gens with the negs you posted and the style I asked for got completely obliterated and prompt adherence went down, whereas with a simple "blurry, bad quality, 3D, render", I got exactly what I wanted. Just my two cents.
@DomDomTomTom I agree, that long negative used to work well on previous epochs, but on the latest epochs it's become a liability. I've updated to a shorter negative.
How do I get a specific character? I saw an image of Leon and Ashley from RE4, but when I tried to get Kratos or Doom it doesn't get the character. Do they use some sort of LoRA? Do Flux LoRAs even work on Chroma?
Chroma does know some characters, seems to draw them better when their appearance is described and not just a name mentioned. LORA made for Flux also works (for Forge there's a need to remove a bit of code that doesn't like Chroma's differences to Flux and blocks LORA because of that. it's not intentional, just a number that was too small)
Feels like using SD1 again for the first time... and I absolutely mean that as a compliment. The sheer width of the model forces some specificity in prompting, but that amount of control it offers feels like I get to learn all over again -- with a much more powerful base. Can't wait for further epochs. Sometime around 40 might be insane.
Hey amazing work on this. I am using V33, FP8 Quant what is the recommended CFG and scheduler for photo realism? Right now I am going with CFG 5, Beta and 25 steps things are great but I feel it can be dialed in a bit more just not sure what exactly to tweak. Thanks for this!!!
@silaslin Dependencies are not installed; an import error is displayed in the manager when trying to install.
res_multistep has worked the best for me, it's default
That's what he means by "refer". I don't know what "bong tangent" is, but I've been using what's called "beta". Cfg 4 is good too.
The workflow is doing some interesting things with locale, it's starting with a mask and going for 10 steps, then 30 on the whole thing.
From another comment:
Listed from approximately more real to more creative:
- res-multistep sampling + beta scheduler (more realistic)
- heun sampling + beta scheduler (heun just seems to take long, but is most accurate)
- uni-pc + simple/beta scheduling
- euler + normal/simple/beta scheduling
- ddim + ddim-uniform (more creative)
I used the combination of euler + simple for most of my beginning prompts, which were good, but often overly creative or artistic. They required more prompting to get around the issue. The other combinations have been less of a hassle, but can take longer if the wrong scheduler is used. Don't use the "SDE" labeled schedulers, they'll just be noise(y). The CFG or ancestral variants could be tried, and may be more stable under higher CFG and steps, respectively. However, sometimes they just make black images :)
Also a fun prompt trick: "C:\Users\John\Images\IMG_{3-5$$$$1|2|3|4|5|6|7|8|9|0}.HEIC"
@silaslin also thanks for this reference, really cool github project. I had no idea the RESM samplers would work on WAN, too. If I try a few and it's as much better as the example, I'm going to be (un)happy
@makiaevelio543 Hey amazing info thank you very much, cool about the WAN tidbit too!
It seems he has initiated the high-resolution [1024] training run, as previously planned to start from epoch 48. Hopefully, the final model will be ready soon.
it's a test run and i will keep uploading the 1024 version as long as there's compute for it
Chroma is what I always dreamed Flux would be, but never was.
This is my experiment workflow
Extremely fast and ...
https://github.com/lin-silas/workflow/blob/c82132c6b29d1b75d577223421f5be16b0e6f1dd/chroma-experiment.png
Including these components:
Hyper-Chroma-Turbo-Alpha-16steps-lora.safetensors
RES4LYF
sageattention
RES4LYF breaks my Chroma workflows, even after I uninstall it:
Failed to validate prompt for output 145:
* BasicScheduler 22:
- Return type mismatch between linked nodes: scheduler, received_type(['normal', 'karras', 'exponential', 'sgm_uniform', 'simple', 'ddim_uniform', 'beta', 'linear_quadratic', 'kl_optimal', 'bong_tangent']) mismatch input_type(['normal', 'karras', 'exponential', 'sgm_uniform', 'simple', 'ddim_uniform', 'beta', 'linear_quadratic', 'kl_optimal', 'bong_tangent', 'beta57'])
Output will be ignored
Failed to validate prompt for output 19:
How did you get the TorchCompileModels node to work?
Throws me "Failed to compile model. Verify that this is a Flux, SD3.5, HiDream, WAN, or Aura model!"
@EliteLensCraft
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
you can install these to enhance speed:
pip3 install sageattention xformers
@silaslin One may need to change that torch version depending on the CUDA version installed on their computer, although I think Comfy itself makes sure to install some version. https://pytorch.org/
No promises this works on Windows, too. It seems to, sometimes, then will break randomly.
@silaslin Thanks for the help, workflow now running - however I get either fast but blurred results with torch.compile or slow but higher quality generations. What's the magic behind it I haven't discovered yet?
@EliteLensCraft
I think res_2m & beta57 is a good combination. Try this
https://github.com/lin-silas/workflow/blob/2c729c76a377919f4380e64e96119847e4da35be/chroma-experiment01.png
Amazing, man! Thank you very much :heart: For me it works very nicely, except that I needed to remove this node: Model Patch Torch Settings (bad for me, very slow when this node is enabled).
I suspect this generator is going to be talked about a lot when it's officially released. Extremely good results. Way better results than I had with stock Flux.Dev
It's been a long time since a model left me speechless, it's seriously superior to the ordinary stuff, but anyway here's a list of styles that works very well so far (testing on v34):
3D, 2.5D, western cartoon, comics, pixel art, 3d pixel art, oil painting, surrealism, lo-fi aesthetic, concept art, line art, vector art, digital painting, illustration style, sketch style, CGI render, Rick and Morty (series) style, The Simpsons, Avatar: The Last Airbender, Family Guy, Dragon Ball, Naruto, Jujutsu Kaisen, The Witcher 3
If I drop an image into JoyCaption and then copy the output into Chroma, I often get a gen that is a recognizable match to the original. It's possible there is some memorization, but I also think Chroma might just be that good.
There are still many anatomically strange parts. Can these be resolved by specifying prompts? I look forward to seeing what happens in the future.
The relationship between this Civitai checkpoint and the Hugging Face Chroma project is unclear. Specifically, is the owner of the Civitai page a contributor to the Hugging Face project? Also, the Hugging Face project uses a different name, "... unlocked", for the checkpoint.
In any case, the Civitai releases appear out of date compared with the Hugging Face releases. If the Civitai project is intended to officially or unofficially track the same Hugging Face project, then please synchronize the checkpoint names, and synchronize the release versions.
Lodestones, the model's creator, has stated in the Huggingface discussions that the "unlocked" term simply refers to the fact that the model is uncensored. As far as I can tell, the models are the same between Huggingface and Civit despite the difference in naming convention. Though they understandably seem to let Civit lag behind due to their constant release cycle, which appears to be one version roughly every four days. I imagine it's really just a matter of them cutting down on extra work, for which I really can't blame them. Ultimately, the Huggingface repository seems to be the primary source, so I'd keep an eye on that rather than on Civit, at least for the time being!
He officially stated that the Civitai version lags behind the Hugging Face version because uploading files to Civitai is problematic, characterized by slow and frequently failed uploads.
@QH96 I didn't know there was an official statement, but that makes sense. Civit still has a lot of jank, so I can't throw shade at Lodestones for not wanting to slog through all the craziness just to post their latest model.
Love the model so far. I tried training a LoRA for it using AIToolkit. Tried multiple times with different parameters, but adding a LoRA introduces terrible banding to the images (using COMFY). Anyone else had this or have a solution? I have googled the topic but found very little to help.
I tried a simple character LoRA too and got the same banding. I am using v34 so I wonder if that impacts the training script Ostris has. If I use an old FLUX LoRA trained on the same subject, I get a "near likeness" without banding.
So far I made a Flux Schnell LoRA and used it with Chroma; let's say the result was 95% accurate.
Does Chroma work with some regional prompt workflows??
Practically anything you can do with Flux, so the answer is yes. Just good luck with any form of characters.
Very nice, but compared to flux it is much slower. ~2s per it vs 1.4 it per sec! Is that normal because of the negative prompt thingy?
one of the reason is the negative prompt, but also because it's still under training... so it's some sort of "raw" model, so it's slower... once the training is over, it will be faster.
It's because the publicly shared version of Flux is guidance distilled, while Chroma is still in training and its shared versions are undistilled.
CFG models do two passes per each step, one with positive and one with negative (or empty) prompts. The result is: U + (C - U) * CFG, where U (unconditioned) is the neg/empty pass, C (conditioned) is positive pass, and CFG is literally the CFG parameter. Guidance distilled models only do one pass per step and this formula is not used, instead the guidance number is encoded and passed to the model itself. That's why they're faster but they don't support negative prompting and are less flexible.
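The formula in the comment above, U + (C - U) * CFG, can be written out as a few lines of Python. This is a minimal sketch on plain lists of floats standing in for the model's two predictions; the function name cfg_combine and the toy numbers are illustrative and not taken from any particular inference library.

```python
def cfg_combine(uncond, cond, cfg_scale):
    """Classifier-free guidance: U + (C - U) * CFG, applied element-wise.

    uncond: prediction from the negative/empty-prompt pass (U)
    cond:   prediction from the positive-prompt pass (C)
    """
    return [u + (c - u) * cfg_scale for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]
cond = [1.0, 3.0]
print(cfg_combine(uncond, cond, 1.0))  # [1.0, 3.0]  (scale 1 returns C unchanged)
print(cfg_combine(uncond, cond, 4.0))  # [4.0, 9.0]  (pushed further away from U)
```

Because both U and C require their own forward pass, each sampling step costs two model calls, which is the main reason an undistilled CFG model runs slower than a guidance-distilled one.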
I've found the same problem, but I can frequently fix some of the weird deformities by incrementing the cfg and the steps (craziest I've had to do was cfg 8.0 and 62 steps. I feel like that shouldn't work, but it does lol).
Incredibly responsive model! Although I could not render some things like a turned-over chair or a dripping mascara, overall responsiveness and an absence of censorship make it a very useful tool.
>Although I could not render some things like a turned-over chair or a dripping mascara,
What did she do now!?
