This checkpoint was trained on 3.3m images of normal to hyper sized anime characters. It focuses mainly on breasts/ass/belly/thighs, but now handles more general tag topics as well. The dataset is about 50/50 anime and furry images as of v8. See the changelog article below for more version details and future plans.
Note: This will be my final SD1.x model. I wanted to see what the hyperfusion dataset was really capable of on SD1.5, so I let it train on 2x3090s for 10 months to squeeze every bit of concept knowledge out of it. This is the best concept model I've trained so far, but it still has the usual SD1.x jankiness. I probably kept the Text Encoder LR too high for too long (0.5x -> 0.3x).
Big shoutout to stuffer.ai for letting me host my model on their site to gather feedback. It was critical for resolving issues with the model early on, and a great way to see what needed improvement over time.
V9 is a v_pred model, so you will need to use the YAML file in A1111, or the v_pred node in Comfy, along with cfg_rescale=0.6-0.8 in both. A1111 will also need the CFG Rescale extension installed.
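For reference, here is an abridged sketch of what that YAML typically contains for an SD1.x v-prediction checkpoint (saved as <checkpoint_name>.yaml next to the model file). The key line is parameterization: "v"; the real file also carries the full UNet/VAE/CLIP definitions, so prefer the YAML shipped with the download:

  # abridged sketch, assuming the standard SD1.x inference config as a base
  model:
    target: ldm.models.diffusion.ddpm.LatentDiffusion
    params:
      parameterization: "v"   # tells the sampler to use v-prediction
      linear_start: 0.00085
      linear_end: 0.0120
      timesteps: 1000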
I posted one old example ComfyUI workflow here: https://civarchive.com/images/64978187
Other links:
The OG hyperfusion LoRAs can be found here https://civarchive.com/models/16928
There is also a backup HuggingFace link for these models.
I've uploaded the 1.4 million custom tags used in hyperfusion here, for integrating into your own datasets.
Changelog Article Link
Recommendations for v9_vpred finetune (a worked example follows this list):
sampler: Anything that is not a Karras sampler. Don't use Karras! Training with --zero_terminal_snr makes those samplers problematic. You will also need to use the "uniform" scheduler in A1111, or the "simple"/"normal" schedulers in Comfy.
negative: I tested each of these tags separately to make sure they had a positive effect:
worst quality, low rating, signature, artist name, artist logo, logo, unfinished, jpeg artifacts, artwork \(traditional\), sketch, horror, mutant, flat color, simple shading
positive: "best quality, high rating" for the base style I trained into this model; more details in the Training Data docs
cfg: 7-9
cfg_rescale: 0.6-0.8. cfg_rescale is required for this v_pred model; lower values tend to have less body horror, but darker images.
resolution: 768-1024 (closer to 896 for less body horror)
clip skip: 2
zero_terminal_snr: Enabled
styling: You will want to choose a style first; the default style is pretty meh. Try the new artist tags included in v8+; all tags can be found in tags.csv by searching for "(artist)". See the example images for art styles.
Lora/TI: LoRAs trained on other models will not work with this model; even LoRAs trained on other v_pred models are not guaranteed to work here.
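Putting the recommendations above together, a hedged starting point (the artist tag and subject tags are placeholders; pull real artist tags from tags.csv):

  positive: best quality, high rating, <artist name> \(artist\), 1girl, <your subject tags>
  negative: worst quality, low rating, signature, artist name, artist logo, logo, unfinished, jpeg artifacts, artwork \(traditional\), sketch, horror, mutant, flat color, simple shading
  settings: sampler DPM++ 2M (uniform/simple scheduler), CFG 8, cfg_rescale 0.7, 896x896, clip skip 2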
Recommendations for v8 LoRA (a usage sketch follows this list):
sampler: Anything that is not a Karras sampler. Don't use Karras! Training with --zero_terminal_snr makes those samplers problematic.
Lora/TI: If you are using LoRAs/TIs trained on NovelAI-based models, they might do more harm than good. Try without them first.
negative: low rating, lowres, text, signature, watermark, username, blurry, transparent background, ugly, sketch, unfinished, artwork \(traditional\), multiple views, flat color, simple shading, rough sketch
cfg: 8 (it needs less than LoRA hyperfusion)
resolution: 768-1024 (closer to 768 for less body horror)
clip skip: 2
styling: Try the new artist tags included in v8; all tags can be found in tags.csv by searching for "(artist)"
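As a hedged usage sketch in standard A1111 prompt syntax (the LoRA filename, weight, and subject tags are placeholders, not a tested recipe):

  <lora:hyperfusion_v8:0.8>, best quality, (bbw:1.2), <your subject tags>
  negative: low rating, lowres, text, signature, watermark, username, blurry, transparent background, ugly, sketch, unfinished, artwork \(traditional\), multiple views, flat color, simple shading, rough sketch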
Tag Info (you definitely want to read the tag docs; see the Training Data section)
Because hyperfusion is a conglomeration of multiple tagging schemes, I've included a tag guide in the Training Data download section. It describes the way the tags work (similar to Danbooru tags), which tags the model knows best, and all my custom labeled tags.
For the most part you can use a majority of tags from Danbooru, Gelbooru, r-34, and e621 related to breasts/ass/belly/thighs/nipples/vore/body_shape.
The best method I have found for tag exploration is going to one of the booru sites above, copying the tags from any image you like, and using them as a base; there are simply too many tags trained into this model to test them all.
Tips
Because of the size and variety of this dataset, tags tend to behave differently than in most NovelAI-based models. Keep in mind that prompts from other models might need to be tweaked.
If you are not getting the results you expect from a tag, find other similar tags and include those as well. This model tends to spread its knowledge of a tag across related tags, so including more will increase your chances of getting what you want.
Putting "3d" in the negative does a good job of making the image more anime-like if it starts veering too far into a rendered-model look.
Ass-related tags have a strong preference for back shots; try a low-strength ControlNet pose to correct this, or try one or more of "ass focus, from behind, looking back" in the negatives. The new "ass visible from front" tag can help too.
...more tips in tag docs
Extra
This model took me months of failures and plenty of lessons learned (hence v7)! I would eventually like to train a few more image classifiers to improve certain tags, but those are all future dreams for now.
As usual, I have no intention of monetizing any of my models. Enjoy the thickness!
-Tagging-
The key to tagging a large dataset is to automate it all. I started with wd-tagger (or a similar Danbooru tagger) to append some common tags on top of the original tags. Eventually I added an e621 tagger too, but I generally only tag with a limited set of tags rather than the entire tag list (some tags are not accurate enough). Then I trained a handful of image classifiers for breast size, breast shape, innie/outie navel, directionality, motion lines, and about 20 others, and let those tag for me. They not only improve on existing tags, but add completely new concepts to the dataset. Finally, I converted similar tags into one single tag as described in the tag docs (I've stopped doing this now; with 3m images it really doesn't matter as much).
Basically, any time I find it's hard to prompt for a specific thing, I throw together a new classifier. So far the only ones that don't work well are those that try to classify small details in the image, like signatures.
Starting in v9, I will be including ~10% captions alongside the tags. These captions are generated with CogVLM.
I used this to train my image classifiers
https://github.com/huggingface/transformers/tree/main/examples/pytorch/image-classification
Ideally, I should train a multi-class-per-image classifier like the Danbooru tagger, but for now these single-class-per-image classifiers work well enough.
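For the curious, training one of these single-class classifiers with that example script looks roughly like the following; the dataset layout, base model, and hyperparameters are my illustrative assumptions, not the exact setup used here:

  # hedged sketch: fine-tune a ViT as a one-class-per-image classifier
  # (e.g. breast size); train/val folders hold one subdirectory per class
  python run_image_classification.py \
    --model_name_or_path google/vit-base-patch16-224-in21k \
    --train_dir data/breast_size/train \
    --validation_dir data/breast_size/val \
    --output_dir out/breast_size_classifier \
    --remove_unused_columns False \
    --do_train --do_eval \
    --learning_rate 2e-5 \
    --num_train_epochs 5 \
    --per_device_train_batch_size 32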
-Software/Hardware-
The training was all done on a 3090 on Ubuntu. The software used is Kohya's trainer (sd-scripts), since it currently has the most options to choose from.
Description
Increased image count to 1.4 million.
Included artist tags for better styling choices, search for "(artist)" in the tags csv.
This version was trained on SD 1.5, so there is no NovelAI influence in this checkpoint unlike previous versions.
More image classifiers trained, and existing classifiers improved (list of classified tags under the Training Data section).
Training Notes (a command-line sketch follows this list):
~1401k images
LR 3e-6
TE_LR 2e-6
batch 8
GA 32
default Adam optimizer
scheduler: linear
base model SD1.5
No custom VAE, and none needed for inference unless you prefer one
flip aug
clip skip 2
375 token length
bucketing at 768 max 1024
bucket resolution steps 32 for more buckets
tag drop chance 0.1
tag shuffling
--min_snr_gamma 4
--ip_noise_gamma 0.02 (lower than v7)
--zero_terminal_snr
custom code to drop out 75% of tags 5% of the time to hopefully improve short tag length results
about 70 days training time (pray for my GPU)
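As a rough sketch, here is how the notes above map onto sd-scripts flags; the paths and base-model filename are placeholders, and the 375-token length, the 75%/5% tag dropout, and the separate text encoder LR came from custom code or version-specific options, so they are not shown:

  # hedged reconstruction of the run, not the exact command used
  python fine_tune.py \
    --pretrained_model_name_or_path sd-v1-5.safetensors \
    --in_json meta.json --train_data_dir train_images \
    --output_dir out/hyperfusion \
    --learning_rate 3e-6 \
    --optimizer_type AdamW --lr_scheduler linear \
    --train_batch_size 8 --gradient_accumulation_steps 32 \
    --flip_aug --clip_skip 2 --shuffle_caption \
    --caption_tag_dropout_rate 0.1 \
    --enable_bucket --resolution 768,768 --max_bucket_reso 1024 \
    --bucket_reso_steps 32 \
    --min_snr_gamma 4 --ip_noise_gamma 0.02 --zero_terminal_snr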
Comments
I'm interested in the LoRA. I don't know how long that takes to make (70 days for the model, what the fuck?), but if it's too long and I'm the only one interested then it's fine if you don't do it. I'm using the LoRA from the old model and it works fine, works great even, but I'm interested to see what a LoRA based on the new, almost 3x bigger model is capable of doing.
You're definitely not the only one interested. I and plenty of other people like to use the Hyperfusion LoRA with realistic models to push their boundaries and see how compatible each model is with the various concepts within Hyperfusion. So I'd also be interested in a LoRA of the latest Hyperfusion model, and I'm guessing OP already has plans for that.
I can upload a v8 LoRA to HuggingFace if you really want to play with it. I just didn't want to put it here, since so many people would download it without reading about the compatibility differences in v8.
I've uploaded the v8 LoRA on HuggingFace here if you want to try it out. I'll update the model description with that link too.
@Kodama Let me know what you think of v8. I'm still undecided if it's better than v7. The tags behave quite differently, and it took longer to learn certain concepts without NovelAI as the base. But at least it's easier to apply styles now.
@throwawayjm Sorry, I was away from my PC for a few days and couldn't check til now. I'll try it out soon
@throwawayjm Seems like your model suffers from the same problem as the base SD1.5 model, in that there's just so much stuff in it that you have to be incredibly specific to get something good out of it.
Back when I started playing with Stable Diffusion (about 3 months ago), I first tried the base SD1.5 but gave up after like half an hour; I figured if I wanted to make porn it was probably better to download a model that was tailored for that instead of one that was conceived to make literally any image. I had come across your model back when I was just reading about SD, before I started using it. Its theme is one of my biggest fetishes, so I figured why not and downloaded it and oh boy, was that a good decision. img2img especially was quite impressive and very easy to use for someone who basically didn't know what he was doing. So easy, in fact, that it felt a bit like cheating, so I decided to stop using it and stick to txt2img so I could properly learn all the intricacies of SD (over 3 months later I can't say I've been successful, but at least I got better at prompting, lol).
After that it still took me several thousand attempts to figure out how to work properly (or "properly", I still can't say I actually know what I'm doing) with v7, but at least I was able to get some "ok" pictures almost from the start with just txt2img without too much difficulty. On the other hand, even after making like 50k images on v7, I never quite managed to figure out how to make something that felt truly high quality, though I did learn how to make incredibly specific stuff, including making women with bodies that look basically the complete opposite of what Hyperfusion was conceived for (and I'd like to thank you for making that possible in your model).
So I gave up and tried a different model, AniVerse (v1.5, the latest at the time) and, well, completely effortless gorgeous waifu. It even does that if you generate with a completely empty prompt (positive and negative), lol, beautiful background included. I guess it's not surprising that it would do that, seeing as how it was conceived to be "easy and effortless to use" or whatever was written in the description back when I first downloaded it, but I'm guessing what that actually means is that the model is made almost entirely out of tons of beautiful waifu pics, along with some gorgeous muscular FUCKING MENS for some reason, and with such poor vocabulary that no matter how clueless you are about prompting, you get something nice (empty prompt? no prob, here, have a waifu). Still, I went on with AniVerse and tried to make stuff similar to what I was making on Hyperfusion, but it never worked. I remember once I prompted for "shortstack" and what I got was a tall stack of macaroons along with a bunch of tall buildings in the background, what.
Eventually I remembered your model also had a LoRA version, so I got it and added it to AniVerse and there, a miracle was born. Holy shit, man, your v7 LoRA is actually amazing. Something simple that I never got to come out quite right on base AniVerse was a regular-looking morbidly obese woman, y'know, with all the fat folds and all that. Also, getting women with unrealistically huge bottoms with just txt2img is completely impossible. But once I added your LoRA, all I had to put in the prompt was (bbw:1.5) and BAM, instant fat, giant baby woman.
Some days ago I saw you had finished Hyperfusion v8, so I tried it, and from what I can tell, the problems I mentioned above got worse. Some fairly simple prompts I had used in the past to make decent enough pictures now make some weird-ass stuff, and even very complex prompts (150 positive, 300 negative tokens), while still able to make the overall image they were intended to, just give a worse-looking result. I probably need to review my entire prompting strategy for v8, seeing as the base model is entirely different, but I'm not sure I want to make the effort. On the other hand, I'm guessing all the expanded vocab and pics that made the model harder to use and/or worse must have made the LoRA even better, and I'm super excited to try it, which I'll do shortly. By the way, I also noticed a similar drop in quality on AniVerse v2. Seems like the guy who makes it expanded the pool of images and possibly the vocabulary to try and sort out some problems he was having in earlier versions. According to him it worked, but it seems to have introduced other problems, such as the aforementioned drop in quality and, in my experience, getting more black-and-white images randomly.
Sorry for the wall of text; I just thought I should share my story since I used both your model and its LoRA for so long (I also used the hyper bottom heavy LoRA, but I didn't like that one so much), and it might give you some insight on what to do in the future. Also, thank you so much for your efforts, you are truly the GOAT for people with this kind of fetish.
TLDR: NAI-based LoRAs are great because of flexibility, but v-prediction-based models are the future, even if that means breaking compatibility.
@firecat6666 Yea, the LoRA has always been pretty great because you are carrying the concepts of hyperfusion over to the style of the base model you want; that was the main reason for making the LoRA in the past. However, with v7 I basically reached the point where I couldn't cram any more knowledge into my trained models (epsilon training and the default noise scheduler can only take you so far), so I need to experiment with some new training techniques.
So v8 was an experiment in training on SD1.5, as you already know, which ended up making the LoRA difficult to use on non-NovelAI models. But the overall ability to learn concepts was pretty much the same as v7, unfortunately. So the plan is to train v9 with v-prediction, which has proven to be really good at learning concepts, at the cost of the model/LoRA only being usable with other v-prediction models. I'm willing to take that hit for the sake of experimentation, and my v-prediction test model is blowing all previous models out of the water on concept knowledge so far. It includes all the new training options like zero_terminal_snr, soft_min_snr, perturbed noise, etc. Unfortunately the full run will take a long time, similar to v8 :(
As far as the finetune goes, yea, the style on that is pretty bland unless you happen to be really familiar with prompting hyperfusion, but it's flexible enough now with the inclusion of artist tags that you can get pretty much any style you want with the right tags. You do have to work for it, though, unlike the other cookie-cutter models that are nice looking out of the gate.
Please extract the LoRA for this. The checkpoint model doesn't look very good on its own imo. I almost exclusively use the LoRA alongside another checkpoint model for really good results, so I would love to see a LoRA.
Check the changelog; the link for the v8 LoRA is in there, as well as an explanation as to why (which you should also read).
@throwawayjm Oh I didn't realize. Thanks
Do you use extensions like ADetailer to add more quality to your images? I don't seem to be getting good quality images even using your prompts/generation settings.
Nope, I try to keep the generations as raw as possible. Resolution and style tags seem to have the most effect on quality, but maybe you have something configured slightly differently than I do?
Also, if you don't use any style/artist tags, then the quality will look very bland for sure. The default style is boring.
@throwawayjm I do copy everything from your images, even the ETA and ENSD settings; then I change some things in the prompt here and there to my liking and remove the seed so it doesn't match your images, while still trying to get image quality similar to yours. As for using style/artist tags, yes, I use those too. But here's the thing: I guess it's all about luck, because sometimes I get good quality images, but then I generate another image with the same settings as before and the results are a lot different and worse in terms of quality.
I've been thinking, would it be a good idea to create some kind of negative embedding designed to stop some of the more common errors? Or would it just be better to add more data and epochs to straighten stuff out? I imagine it could help the LoRA version at least.
Depends on what you mean by errors, I guess. I've been focused on trying to introduce new tags to improve quality/style instead of training helper embeddings. Lots of existing embeddings on Civitai should work.
@throwawayjm Well, there are several goofs that are unique to this checkpoint and LoRA, but yes, I think you are right, it's better to keep improving on what is already here.
How many artist tags does this model have and which booru should I use to see the artist tags?
I haven't counted them, but if an image had an artist tag, it was included.
As for sites, it's evenly distributed between r34/gelbooru/e621/sankaku, so any of those really.
Gotta say, the model alone isn't really as finetuned in quality as I thought it would be, but merging it with other models at 0.2 does wonders.
Yea, the default quality is pretty low until you get a feel for the model. You have to lean on quality/artist tags to bend it to the style you want. I'm trying to add more quality tags for the next one to make it easier, but who knows what effect it will have until it's done.
I like to think of it as the SD 1.5 of hyper sized porn.
I ended up making a good mix that retains all the size tags while injecting a bit of higher quality into generations. I'll upload it soon.
Would it be possible for this model to be made into a LoRA so that it can be used with other checkpoints?
There's a link to the LoRA in the details
@throwawayjm the 1m image version as a lora
@ATJonzie Link to it is in the changelog below
Botero would have loved this
When might we be able to get our hands on V9 epoch 11?
see my comment here
3 months later, "now" lol
What's the best sampling method and schedule type to use? I know not to use Karras, but my images are still coming out not quite right.
I usually use DPM++ 2M or 2S a. If you're using A1111, make sure to set the Schedule Type to Uniform, not Automatic.
thanks for the help
How do I get different styles generated? It seems like everything I've generated doesn't look anime enough, or looks like a weird hybrid of anime and 3D.
Use the tags file in "Training Data" to figure out what art styles/artists are available. This version of the model was pretty undertrained, so it won't be as effective as my next one, which has been training for twice as long.
I wonder if you'll ever make a v9, possibly with SDXL? Maybe Illustrious or NoobAI even? I'd love to see where you go with this...
NoobAI_vpred, probably. Just waiting on it to finalize first. But it will just be a DoRA on 600k images, not the full 3m dataset; a full finetune would take too long to train.