PixelWave FLUX.1-schnell 04 - Apache 2.0!
Safetensor Files: 💾BF16 💾FP8 💾bnb FP4
GGUF Files: 💾Q8_0 🤗Q6_K 💾Q4_K_M
Model also available at: RunDiffusion and Runware.ai
PixelWave FLUX.1 schnell version 04 is an aesthetic fine tune of FLUX.1-schnell. The training images were hand picked to ensure the model has a bias to eye catching images, with beautiful colors, textures and lighting.
Trained on the original schnell model, so Apache 2.0 license!
No special requirements to run. Supports FLUX LoRAs
Euler Normal, 8 steps.
You can use more steps to improve finer details, but the output doesn't change much after 8 steps.
Shout out to RunDiffusion
Huge thank you to RunDiffusion (co-creators of Juggernaut) for sponsoring the compute that made training this model possible! Figuring out how to train schnell without de-distilling the model required a lot of experimenting, and being able to utilize RunDiffusion's cloud compute made it a lot easier.
For those needing API access for this model, we're partnering with Runware.ai
I have made the FLUX.1-dev 04 version exclusive to RunDiffusion and Runware for the time being. When I release version 05 in future, I plan to release the dev 04 open weights.
Grateful for their support in getting this model out there, please check them out!
Training
Training was done with kohya_ss/sd-scripts. You can find my fork of Kohya here , which also contains changes to the sd-scripts submodule, make sure you clone both.
Use the fine tuning tab. I found the best results with the pagedlion8bit optimizer which also could run on my 4090 GPU 24GB. I found other optimizers struggle to learn anything.
I have frozen the time_in, vector_in and mod/modulation parameters. This stops the 'de-distillation'.
I avoid training single blocks over 15. You can set which blocks to train in the FLUX section.
LR 5e-6 trains fast, but you have to stop after a few thousand steps as it starts to corrupt blocks and slow down learning.
You can then block merge with an earlier checkpoint, replacing the corrupt blocks, and then continue training further.
Signs of corrupt blocks: paper texture over most images, loss of background details.
Contact
For business or commercial inquiries please reach out to us at [email protected]. Licensing flux fine tunes. Customer training projects. Commercial AI development. The team can do it all!
PixelWave Flux.1-dev 03 fine tuned!
Safetensor Files: 💾BF16 💾FP8 💾NF4
GGUF Files: 💾Q8_0 🤗Q6_K 💾Q4_K_M
The 'diffusers' files are actually the Q8_0 and Q4_K_M GGUF versions. GGUF files also available on huggingface.
I fine tuned version 03 from base FLUX.1-dev for over 5 weeks on my 4090. It is able to do different art styles, photography, and anime. Trick I discovered to help with LoRAs.
I used dpmpp 2m sgm uniform 30 steps for the showcase images. If you want a neater/cleaner output, try increasing the guidance. Also mentioning a style can help, so the model doesn't have to guess.
I also recommend try adding the upscale latent by node, and scale the latent by 1.5, e.g. generating an image that is 1536x1536 instead of 1024x1024.
PixelWave Flux.1-schnell 03
GGUF Files: go to huggingface
I used dpmpp 2m sgm uniform 8 steps for the showcase images.
You can start with 4 steps, but there are less errors with anatomy if you run with more steps.
PixelWave Flux.1-dev 02
GGUF Files: 💾Q8_0 🤗Q6_K 💾Q4_K_M
Version 02 has greatly improved black and dark images, and more reliable outputs with fewer issues with hands.
I recommend using dpmpp_2s_ancestral, beta, 14 steps. Or euler, simple, 20 steps.
PixelWave 11 SDXL. A general purpose fine tuned model. Great for art and photo styles.
I use 20 steps, DPM++ SDE, CFG 4 to 6 or 40 steps, 2M SDE Karras
Accelerated Version - 5+ Steps, DPM++ SDE Karras, 2.5 CFG
PAG Recommended⚡Recommend 1.5 Scale, with CFG 3. Link to workflow
⭐Link to prompting guide.⭐ You don't need to use 'quality' terms such as 4K, 8K, masterpiece, high def, high quality, etc. Unless you want it, I recommend not using words such as 'vibrant, intense, bright, high contrast, neon, dramatic' for photographic styles if you a wanting a more natural look. This can cause images to look 'overcooked', but it's just the CLIP following your prompt. 🙂 If you do want vibrant, neon photos PixelWave will provide!
The focus for version 10 was to train the CLIP models, which improves the reliability, ensures you can produce a wide variety of styles, and better at following prompts.
Thanks to my friends who helped test: masslevel, blink, socalguitarist, klinter, wizard whitebeard.
Guide: Upscaling Prompts with LM Studio and Mikey Nodes
Guide: Add more details to your image using the skip step method
No need for the refiner model.
This model is not a mix of other models.
I also created Mikey Nodes which contains a lot of useful nodes. You can install it through comfy manager.
Description
Fine tuned for 5 weeks on my 4090.
FAQ
Comments (73)
v0.3 GGUF HF link leads to page 404
Oops! Forgot to make it public. Should be good now. Thanks for letting me know.
@humblemikey I always glad to let everybody know about someone's fail...
[ JOKING ] [ SARCASM ]
What tooling did you use to fine-tune?
I tried to use OneTrainer to train a LoRa, but it gave me an authorization error - it tried to get smth from huggingface
I used kohya ss, but before it supported fine tuning. I had to modify the code to get it working. But I believe it supports fine tuning now if you clone the sd3 flux branch.
@humblemikey THX
@humblemikey It does, and on a 4090 its around 6 it for 1024 px.
@vigilence Yea.. that's around 3-4000$ i do not have.. I'll have to suffice with my 4060ti with 16gb vram. Might be able to train a lora on this, if I can get these tokenizer issues sorted out in onetrainer.. I tried kohya a few months back and it just smacked me into a brick wall with OOM crap, sides, I like onetrainer's simplicity. Kohya is downright intimidating if I'm being honest.
Yea, tryin to figure out onetrainer myself. I got past the authentication errors by copying the folder from the official flux model, and just tossing the nf4 model in there (in place of the full-size model, which would take millionaire hardware to train), although now I'm at an impasse with the tokenizer. It expects a tokenizer for this thing, which I do not have, since that didn't come with it.. ChatGPT tells me I could train a tokenizer for it, but that just sounds like a bit much headache..
Hi, The model with lora generates with artifacts. Have you tested it with any lora?
Just tested with Mac the Alien and Mary Poppins LoRAs. Had to turn the strength down to 0.8 to reduce the grey fuzzy look.
@humblemikey Ok, thanks. I tried I still have artifacts, well ok, for lore will be other models.
@humblemikey I tried this but it doesn't do much on my setup. I guess I'm doing something wrong
Looks awesome! Great work Mikey!
Your PixelWave Flux.1-dev 02 model is currently the best Flux checkpoint I have tested, I can't wait to see what you have cooked with v03
This model lost a lot of detail compaed to stock flux, its like it was trained on jpg movie stills and i hope it wasnt, maybe training res was 512 but i trained on low res and lora still came out ok, maybe its overtrained , the degradation is just too much for me, like about 70% of flux stock sharpness, thats no good.maybe merge with stock would help it.
It was trained on 1024 res images. I'm not seeing the loss of sharpness you are claiming in the images I have generated and posted, and what other people have been sharing.
@humblemikey compare same prompt with stock dev and this model , there is detail drop,stock flux is pretty crispy but this one kinda jpegy
@niccc could you provide examples?
talk about entitlement , too much for you don't use it . truly entitled , the dude spent 5 weeks on his local gpu to give something , and you come here like you paid for his bills ,rig and effort . my god ...
I've also noticed rather serious quality degradation if using default Flux-dev settings (Euler, 20 steps, Normal scheduler) and result does look like an overcompressed JPEG. interestingly enough generation is also faster than vanilla Flux-dev.
however, switching to SGM Uniform scheduler brings quality back to normal on all samplers I've tried.
not sure about op, i'm using latest SwarmUI and Q8 model variant.
@randomkcoala3 thanks for sharing your findings.
@caki dood just cause i pointed out that photo style is not as sharp as stock does not mean the model is trash, are you crazy ? Is it just black or white with you.
@randomkcoala3ill try it, i used more steps and quality improve but still its not as crisp as stock flux
dev03 Q8 gguf is corrupted. : (
Did you unzip it first? I just tested it on my end and it worked fine.
@humblemikey Did you download it from civitai? Because we're all downloading it and trying to unzip it and getting errors when unzipping it.
@psspsspsspssspss you do realise you responded to the author lol... geez this site and its users I swear.
It looks like it corrupted the file when I uploaded it. When I extract the file I zipped, I get no errors. Going to upload again now. You can also grab it from https://huggingface.co/mikeyandfriends/PixelWave_FLUX.1-dev_03/blob/main/pixelwave_flux1_dev_Q8_0_03.gguf
Finished re-uploading the file. Tested downloading and extracted with no errors.
@humblemikey Thank you, now it works fine. : )
@axicec The irony is strong in this one...
When on tensor?
7zip complains about Q8 .zip while extracting, are you sure it's not corrupted?
It looks like it corrupted the file when I uploaded it. When I extract the file I zipped, I get no errors. Going to upload again now. You can also grab it from https://huggingface.co/mikeyandfriends/PixelWave_FLUX.1-dev_03/blob/main/pixelwave_flux1_dev_Q8_0_03.gguf
Finished re-uploading the file. Tested downloading and extracted with no errors.
@humblemikey thanks!
The Q8 zip is corrupt
"unexpected end of input" error in 7z
Can you upload non-zipped?
It looks like it corrupted the file when I uploaded it. When I extract the file I zipped, I get no errors. Going to upload again now. You can also grab it from https://huggingface.co/mikeyandfriends/PixelWave_FLUX.1-dev_03/blob/main/pixelwave_flux1_dev_Q8_0_03.gguf
They don't let you upload a non zipped gguf file.
Finished re-uploading the file. Tested downloading and extracted with no errors.
Somehow LoRas don't really work with this model (v3). They output a grey mess. Anyone knows why? Dialing the weight down doesn't work
Same here, I can't get it to work with LoRAs for whatever reason, so it's useless to me.
Any chance we could get a GGUF version?
Not sure if they had been there before, but the author added GGUF links in the description
Is it possible for us to further finetune your model with our own concepts? And any possible benefit to training LORAs on your model vs vanilla flux?
Like others I get overcooked, blurry images whenever I use a lora. Sampler doesn't seem to matter. Anyone have a fix for this or is this just a matter of loras not working with this at all since it's a fine tune or something?
same problem with Loras, otherwise it's amazing when using it stand alone
Same here, I can't use it with LoRAs either, I thought it was just me, but it's basically useless without LoRAs.
Great fnetune thats better than base model, well done for unlocking flux. i'm glad it can do various artistic styles that flux cant, however i just really hoped it recognized artists and photographers. movies etc lke sdxl does. this what gives aesthetic edge to any model and truly unlocks artistic expression and experimentation.
Also loras dont work with this model so at the end its sort of a 2steps forward, 1 step backward thing cause now i have to use base flux and lora if i wanna use a certain concept artist/photographer style. mahn we were so close, i just dnt like all this jumping back and worth workflows. Ultimately, seems like sticking to sdxl and its billion finetunes and stuff is still the more efficient way to go.
don't forget an update to our lil buddy schnell :)
nice models
Mind sharing the captioning / training details? This is by far the best Flux finetune I've ever seen.
I would love to know the training config/method as well. I have been trying on and off with my 4090 but struggling to get good results. I saw limiting it to one concept/batch of images at a time so was going to look into that aspect. Would love more information though on the settings being used (and which platform).
How do I get this to work in SwarmUI? It just errors out when trying to load the model. Using the FP8 safetensors file
I would also like to know how to load it in SwarmUI, I would like to try this model out.
over 5 weeks on my 4090 ! That is some effort. Thank you for creating this model
Tried it in Forge - but no matter what art style I give in the prompt, it only creates IMPRESSIONIST images. Any solution?
I had the model downloaded in Stbility Matrix and it has been placed inthe UUNET folder, instead of the normale StableDiffusion folder. Is this deliberate?
Which exact model did you download? I'm about to try the Fp8 safetensors version in Forge, so it would be good to know if that's the issue, or if you're using one of the GGUF's. I'll let you know how I get on.
@wideload I am using the FP8 safetensor in Forge and it is working superbly. With HiRes Fix I have found Latent (antialiasing) as the upscaler is giving the most pleasing results.
@charidot113 I agree. Just tried the same and the results were spot on. Tested photo, watercolour and it's just finishing a pencil sketch. All three exactly as I would expect.
@wideload
I used the one which is 11GB large ... gives me a lot of incoherent results - sometimes it works better if I increase the CFG (which you are not supposed to with FLUX, are you?), sometimes it starts to burn it with CFG > 3 already ... And why it struggles with text and hands ... ?? I am quite curious what your experiences will be. Hope you're going to share ... At the moment I am still reluctant to trade the benefits from this FLUX art model which on my 4090 still takes 45 seconds for an image against SDXL which is more than 4 times faster ...
to me specifically adding 'photo' or a 'movie still' etc. helped
@sunflower96733 I'm running it in Forge and it's producing fantastic results. I put it in the standard checkpoint folder, have you tried moving it from the UNET folder? I'm also using the FP8 safetensors model from the description which is 11.1Gb. It's worked with Euler Simple, DPM++ 2M SGM Uniform and the Forge Realistic samplers so far. CFG was 1 and Distilled CFG was 3.5 with 20-30 steps. Resolution was often 896x1152 but also higher. Hands have been perfect every time for me and the only issue was using loras of people. For them I had to drop the lora value down to 0.9 or it appeared blurry and pixelated. Multiple styles have been tried, including the three mentioned above and they all worked very well.
I don't use or really know much about Stability Matrix so I don't know if that's an issue. If you have access to Forge or ComfyUI (outside of SM) it might be worth trying them as the model obviously isn't at fault.
@wideload
I already got the confirmation that other people also have problems when using it in Forge. Even if I copy the same parameters set from Civitai the results something are very similar, and sometimes not.
What VAE/Text Encoders did you use? And is their sequence of any influence?
@sunflower96733 I use ae.safetensors, t5xxl_fp16.safetensors and ViT-L-14-TEXT-detail-improved-hiT-GmP-TE-only-HF.safetensors. Those are my standards and I just tried them in two different sequences for you and it changed nothing.
The only other setting I haven't mentioned is Diffusion In Low Bits is set to Automatic (fp16 lora). Beyond that, I'm drawing a blank on what it could be.
@wideload
Are those encoders here really make a difference?
https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14/tree/main
And which one did you chose and why?
I am currently using the t5xxl_fp16.safetensors, or the fp8 version - but with my motives this doesn't seem to make a big difference ...
By the way: does emphasizing with this syntax (hat:1.5) also work in FLUX? It didn't seem to do much when I tested it with this model here ...
Did you also check text generation? If I leave CFG to 1 text is practically unusable. Only begins to make sense with CFG 3 or above, also massively depending. But even if the text is almost right, it still struggles with names. Can't spell 'Cathy' - makes at least 'Cathah' out of it.
@sunflower96733 I don't know how much of a difference it makes. I chose it because a video I saw recommended it (by Olivio Sarikas I think) and it hasn't done anything bad so I stuck with it.
The one I have appears to be about half way down that list, it has the exact same name as I posted and it's 323mb. It goes in Forge's model/text_encoder folder.
I also tried the fp8 version of t5xxl and it made no difference.
Changing the value with that syntax sometimes works and sometimes not IMO. Or it's effect is weaker than the SD models.
@sunflower96733 I only did a couple of tests with text and just like the base Flux it isn't perfect every time but it still works well most of the time. I usually find if the first generation doesn't work, the second does but I'll do some more testing on it when I get chance.
Are you still only getting 'impressionist' style images or have things improved since your original post?
@wideload
The issue with the model not respecting well known art styles as good as expected I could improve with setting a higher CFG value. But this can only go so far.
@sunflower96733 I don't know then. I'm not negating your experiences but I haven't experienced a single one of the issues you have and I haven't changed the CFG once. It's working fantastically for me, with the exception of some character loras I tried.
Maybe it's the particular art styles you are trying to recreate, it could be anything at this point if your settings are all correct.
@wideload
Yes could depend on my art style and content of the prompt. The interesting thing is, that this model here doesn't have the exact same biases against NSFW e.g. as the original has. I will make an in-depth test with some styles and artist names - but not being able or recommended to up the CFG scale is of course a severe limitation of FLUX, since there is no other parameter to compensate for this. If I want my image to be 'cubist' of course there are various levels of cubism. Many users seem to stick too much with 'realism' and don't venture deep enough into abstract fields to even notice this...
And what about the (shoes:1.5) way of upping things? Did you test this?
@sunflower96733 You should be able to change the Distilled CFG in Forge (default is 3.5). I believe that's what you change for Flux but I've never needed to do it. Don't take this as fact though.
Yes, NSFW is much better with this model. It also does fangs out of the box without lora. I tried a vampire seeing as it's spooky month. Haven't tried blood and gore yet.
I had mentioned changing values but I guess it was easily overlooked as it was at the bottom of the comment. Changing the syntax values sometimes appears to work and sometimes doesn't. It may be that it's effect is weaker than with the SD models. I'm not too sure to be honest but I tend to prompt for exactly what I want with Flux as it seems to work better.
I'm one of those realism people. Sometimes I'll do sketches, watercolours and a few others but never something like cubism. It's not my taste. Of course you could train a lora yourself and merge it with Pixelwave, I think that's possible.
@wideload
I increasingly get the impression, that for my take on AI art, FLUX has more negatives that outweigh the positives. Realism is totally not my thing - to my opinion, we should leave this to the photographers as long as possible. And everything FLUX is said to do better than SDXL is not reliable (prompt adherence, text rendering, good hands), can be easily fixed in SDXL with outpainting or regional prompting - or has massive downsides that come along with it (only half baked NSFW, if any). And the rendering takes MUCH to long, even on my gaming laptop 4090 with 16GB VRAM.
Details
Files
pixelwave_flux1Dev03.safetensors
Mirrors
Available On (7 platforms)
Same model published on other platforms. May have additional downloads or version variants.



















