Soo.. what do we have here? A lora that turns dev models into 4-step. How is it different from https://civarchive.com/models/678829?modelVersionId=759853 ?
I extracted only the single_blocks instead of both. It should mess with guidance/text a little bit less as those should be untouched. Might give you a little bit less of the schnell effect. Play with the strength. Let me know. Works great with the AYS scheduler.
Description
This version is float32. If your card doesn't support BF16, it's better to convert it to FP16 than to BF16; there's less precision loss that way.
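To see why: FP16 keeps 10 mantissa bits while BF16 keeps only 7, so for weights that fit comfortably inside FP16's range (as typical LoRA weights do), the FP16 round-trip loses less. A quick sketch with numpy (BF16 is simulated by truncating the low mantissa bits, since numpy has no native bfloat16):

```python
import numpy as np

def to_bf16(x: np.ndarray) -> np.ndarray:
    # Simulate bfloat16 by zeroing the low 16 bits of float32
    # (truncation rather than round-to-nearest; close enough for a comparison).
    bits = x.astype(np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFF0000)).view(np.float32)

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, 10_000).astype(np.float32)  # typical weight magnitudes

err_fp16 = np.abs(w - w.astype(np.float16).astype(np.float32)).mean()
err_bf16 = np.abs(w - to_bf16(w)).mean()
print(err_fp16 < err_bf16)  # True: FP16 loses less precision for these values
```

BF16 only wins when values fall outside FP16's range (above ~65504 or extremely small), which normal LoRA weights don't.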
Comments (39)
Great LoRA to get the speed up without the usual decrease in detail and increase in glossiness Schnell models suffer from!
Specialized my Flux workflow on this because of the good results.
How did you use Flux with the AYS scheduler, btw? The only settings in the Comfy node are for SD1, SDXL, and SVD, none of which work with Flux, of course. Did you manually enter the sigmas?
This node pack has it: https://github.com/pamparamm/ComfyUI-ppm among other things. They link to the original implementation too.
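For what it's worth, if you do want to enter sigmas by hand, the AYS authors suggest resampling a known schedule to a different step count via log-linear interpolation. A rough sketch (the sigma values below are illustrative placeholders, not the published AYS table):

```python
import numpy as np

def loglinear_interp(sigmas, num_steps):
    # Resample a sigma schedule to num_steps points, interpolating in
    # log-space as the Align Your Steps paper recommends.
    xs = np.linspace(0.0, 1.0, len(sigmas))
    new_xs = np.linspace(0.0, 1.0, num_steps)
    return np.exp(np.interp(new_xs, xs, np.log(np.asarray(sigmas))))

# Illustrative 10-step schedule, resampled down to 4 steps.
schedule = [14.61, 6.32, 3.77, 2.18, 1.34, 0.86, 0.55, 0.38, 0.23, 0.11]
short = loglinear_interp(schedule, 4)
print(short)  # 4 sigmas; endpoints match the original schedule
```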
@Gore_Man Going to check it out. Thank you!
AYS is available in Forge, but I have never managed to get any good images out of that scheduler. Not with this lora either
Hmmm from testing it actually works wonders with the merged version of both Schnell and Dev, 8 steps and quality is up there with Dev FP8, if anyone wants to try.
Should I use the dev 8-step LoRA together with this? Would that be better?
Works great! this needs more attention, I was lucky I stumbled upon this from a comment on Reddit
Hi this works really well - it even improves some Loras! Probably because it rectifies some of the blocks.
I'd love to play around with this. How did you extract the lora?
Kohya's scripts, with modded code to extract only the layers I want. Watch out: it has a problem with a UNet re-saved from Comfy. Probably the state dict keys aren't named correctly, since the LoRA doesn't load, and you can miss that without the verbose parameter.
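The filtering itself is just a key-name check on the state dict. A toy sketch of the idea (the key names below are illustrative of kohya-style Flux naming, not copied from a real file):

```python
# Toy state dict; real kohya-style Flux LoRA keys look roughly like these.
state_dict = {
    "lora_unet_double_blocks_0_img_attn_qkv.lora_down.weight": "tensor...",
    "lora_unet_single_blocks_0_linear1.lora_down.weight": "tensor...",
    "lora_unet_single_blocks_0_linear1.lora_up.weight": "tensor...",
}

# Keep only the single_blocks entries, dropping double_blocks and anything else.
filtered = {k: v for k, v in state_dict.items() if "single_blocks" in k}
print(sorted(filtered))  # only the two single_blocks keys survive
```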
Can FP32 work with Forge?
Both of these have a .sft extension in the filename, and neither even shows up in the LoRA list. Do I need to rename the file extension to .safetensors or something? Using Forge, btw.
worth a try. i'm using comfy.
This is amazing; I can't go back to 20 steps. Works perfectly with other LoRAs.
Guys, it really works well; highly recommended!
Why does the generated image have noticeable squares/artifacts? Is there a way to prevent that?
add more steps or use a different scheduler/sampler
If you're using Forge and the files you placed in the LoRA folder aren't showing up, changing the file extension from ".sft" to ".safetensors" should make them appear.
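If there are many files, a small script saves the manual renaming. A sketch with pathlib (the path below is a placeholder; point `lora_dir` at your own Forge LoRA folder):

```python
from pathlib import Path

lora_dir = Path("models/Lora")  # placeholder; use your Forge LoRA folder
for f in lora_dir.glob("*.sft"):
    # Same bytes, new extension, so Forge's .safetensors scan picks it up.
    f.rename(f.with_suffix(".safetensors"))
```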
I love the lora, I use it all the time.
Did you see the newly released Turbo LoRA? Do you think it's possible to extract only the single blocks from that LoRA as well? I wonder if it's possible to reduce the size of the LoRA significantly to reduce VRAM usage?
I downloaded it but haven't looked at what's in it yet. I was interested in trying it on OpenFLUX with the distillation removed.
OK, so I shaved the other stuff out of Turbo... unfortunately it only got down to 300 MB. I was hoping for bigger reductions. I guess I'll upload it because it's better than nothing, and images seem less shiny.
@Gore_Man Thank you
Anything less than 10 steps is a mess. It doesn't produce good enough images at all for me. I have tried different schedulers, but I seem to land on >10 steps with euler/beta. I have no idea how you manage to get good images out of 4 steps!? 🤔
try a different scheduler/sampler combo. 4 steps can be good, 8 steps is better for sure. not all sampler/schedulers are great on either schnell or dev on their own. not a big fan of euler and flux in general.
@Gore_Man What sampler/scheduler do you recommend for 10 steps? I'm getting grid artifacts on my images.
@waveh ddim, deis, lcm, then I use it with sgm_uniform, beta, ays. Some work better than others together. Like ddim/beta.
I've settled on running the UNET model "flux1-dev.sft" with the LoRA "Flux-sch-singleblocks-f32.safetensors" for 5 steps, then upscaling with Ultimate SD Upscale for another 5 steps with the same model. Makes wonderful images of great quality! Consecutive generations with the same prompt take ~13 sec for the first pass and a total of ~90 sec including the upscale on my 3090.
@kallamamran What sampler and scheduler?
This is very nice. Is anybody using it with ComfyUI? I get noise in the luminance of the image; I suspect it's some interference between the CLIP (the text encoding adding a sinusoidal pattern) and the latent. I don't see this problem in the posted images, but I can't find any of them that use a ComfyUI workflow.
The non-edited images have a workflow in the metadata.
Thanks for making this! I've been using Hyper and switched to this for anime stuff after reading about it on Reddit. It's nice to get back a little of the schnell style that I was missing in dev. I'm happy with 12 steps so far (but I'm picky). The only thing that's been bumming me out about these (Hyper/Turbo/yours) is banding/artifacts on gradients.
this is a gift to mankind.
This doesn't work for me: dev model, using 4 LoRAs. Does it matter where the dev2schnell LoRA goes, at the end or first after the load checkpoint? Guidance 3. What's wrong? I'm testing with a checkpoint called project0real1sm v3 dev.
It seems to work! But at least 4 steps are needed and there are still artifacts; just pick a proper sampler combination for whatever LoRA you use. I'm wondering how this LoRA lets dev create images in 4 steps. What are the disadvantages? Is it schnell now, or better? Better at what? I can't find out by trying because I'm on CPU without a GPU; it takes 30 min per image here.
I really like this LoRA, it has greatly improved my efficiency, and I've been using it almost since I first started with FLUX.
When I was testing my LoRAs, I found that, perhaps because only single_blocks were extracted, it did not significantly affect LoRAs for things like faces and body shapes. However, for LoRAs with a real-world style, scenery, or detailed elements, artifacts appear very easily. This is because these types of LoRAs are sometimes more influenced by double_blocks training (though when using only the base model and prompts without a LoRA, artifacts are not as prevalent). I have seen that other people's solution is to use some photorealistic LoRAs that enhance details and reduce artifacts.
Can you still use Flux Dev loras with this?
Would this work on GGUF models?