Humans - CivArchive (CivitAI Archive)

Humans - v1.0

This model is designed to produce photorealistic images of normal people. Most SD models can only produce beautiful people. This is not that. You will get acne, moles, ratty hair, crooked teeth, wrinkles, and, well, ordinary people.

The short version:

There are thousands of trigger words, which can be found at https://gist.github.com/jaretburkett/cf8c224243834172fc13f72aaf49811d , or for a list sorted by frequency, see https://gist.github.com/jaretburkett/41370fdf69b791d2b406f3fa538d4b32 . The big one to know is the word “face”. A significant portion of the dataset has faces, and they were all tagged with face. Use it to get close-up faces; without it you will get shots from farther away, usually portraits. The model does well with simple prompts as well as with much more complex prompts than normal SD models can handle. It will generate massive variation in people from seed to seed, even with the same prompt. It was trained on bucket sizes of [328, 512, 640, 768, 896] at various aspect ratios, and should be able to produce images at those sizes without any hi-res fixes.
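As a rough illustration of what training on resolution buckets means, the sketch below picks, for a given image, the bucket whose aspect ratio is closest. Only the edge lengths come from the description above. Pairing every edge length with every other to form (width, height) buckets, and the `nearest_bucket` helper itself, are assumptions made for illustration, not the author's training code.

```python
from itertools import product
from typing import Tuple

# Edge lengths quoted in the model description.
BUCKET_SIZES = [328, 512, 640, 768, 896]

# Assumption: every (width, height) pairing of those edge lengths is a bucket.
BUCKETS = list(product(BUCKET_SIZES, repeat=2))


def nearest_bucket(width: int, height: int) -> Tuple[int, int]:
    """Return the bucket whose aspect ratio is closest to the image's.

    Ties on aspect ratio are broken in favour of the bucket whose area is
    closest to the image's own area.
    """
    ratio = width / height
    area = width * height
    return min(
        BUCKETS,
        key=lambda b: (abs(b[0] / b[1] - ratio), abs(b[0] * b[1] - area)),
    )
```

Under these assumptions, asking for one of the bucket shapes directly (for example 512 wide by 768 tall) maps back to that same bucket, which is why generating at those sizes should not need an upscaling pass.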

The long version:

The Dataset: I have been building this dataset for around a decade. It has around 100k (and growing) carefully curated, balanced, and labeled images, with the goal of removing the bias in generative AI models. It was built and added to over the years for various products I have created, and I figured it would be good to throw Stable Diffusion at it. The dataset is designed to have mostly normal people, though there are some beautiful people in it as well. I always tried to keep it as balanced with the general population as I could, which is hopefully apparent from the images this model can generate. There are a lot of faces in the dataset, and they are labeled with the keyword “face” to help trigger or not trigger a close-up of a face. Around half of the dataset is faces only; I am working on balancing this with more portraits, headshots, and full body shots for version 2.

Labeling: Labeling was done partially by hand over the years, but mostly by BLIP2 more recently. I created a custom keyword list for photos of people that I use for the tagging library in addition to the standard BLIP2 captions. You can find this keyword list here https://gist.github.com/jaretburkett/cf8c224243834172fc13f72aaf49811d . It was mostly made with the help of GPT-4, and I plan to manually prune and improve it for version 2. I also plan to release my tagging code soon, but those familiar with custom interrogators can probably put this to use, if you want. The main purpose of the labeling process is to be thorough in describing people. Most SD models will do little more than old, young, man, woman, hair color and maybe race. I wanted to be able to do specifics with nose shapes, cheekbone depth, complexion, national origin, eye shape, hair styles, and very nuanced specifics, and so far I am very pleased with the results. The model now knows subtle details of the human face. This should aid in creating embeddings (textual inversions), as the model will know how to create these unique features of a face; they just need to be triggered by the embedding.
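For readers unfamiliar with custom interrogators, the general idea is to score each keyword's embedding against the image's embedding and keep the best matches. The sketch below shows that ranking step with toy three-dimensional vectors standing in for real model embeddings. The `top_keywords` helper and all of the numbers are made up for illustration; this is not the author's unreleased tagging code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_keywords(image_vec, keyword_vecs, k):
    """Return the k keywords whose embeddings best match the image embedding."""
    ranked = sorted(
        keyword_vecs,
        key=lambda word: cosine(image_vec, keyword_vecs[word]),
        reverse=True,
    )
    return ranked[:k]


# Toy stand-ins for real embeddings (assumption: invented numbers).
keyword_vecs = {
    "face": [1.0, 0.0, 0.0],
    "crooked teeth": [0.8, 0.6, 0.0],
    "full body": [0.0, 0.0, 1.0],
}
image_vec = [0.9, 0.1, 0.1]
best = top_keywords(image_vec, keyword_vecs, k=2)
```

With a real interrogator the vectors would come from an image/text encoder and the keyword list would be the one linked above.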

What is Next: This is version 1, and really an alpha version. I am still working on it and hope that version 2 will be mind-blowing. I am already training it and improving the dataset. Currently, this one is not perfect with some details. Eyes can get wonky, and so can teeth, more than intended at least. It will take some time to train this out, and I plan to do just that, as well as add more variety in the image types of normal people.

Your Current LoRAs and embeddings: Yeah... Your LoRAs of beautiful people, trained on models that can only create beautiful people, are not going to work the same way here. You will likely get a picture of their backwoods cousin instead of the intended subject, which is fun to play with. Give it a shot.

Description

Initial release


Comments (42)

eurotaku · Jun 28, 2023

great project, very curious how v2 will turn out.

hardys · Jun 28, 2023 · 1 reaction

What base was used for training (SD 1.5 base?). The model's knowledge is a bit lacking at this stage, though the results are fresh. Hands and feet are missing or really limited; it does portraits only. For a v1 (alpha), it is quite a promising start. Please keep going.

ostris (Author) · Jun 28, 2023

Thank you. Yeah, I am working on some more full body shots and am going to add framing keywords to better regulate the shots for the next version.

The base model is a custom mix of some popular public models, and many of my own short run models for various traits and some custom seed lock training to do things like increase details. I have not released that yet, but may as it continues to improve.

If you are asking to extract a LoRA or LyCORIS from it, I plan to release both extracts sometime today.

psspsspsspssspss · Jul 5, 2023 · 1 reaction

@Ostris I don't want to be a nag, but you said you were releasing lora/lycoris today ... 7 days ago :D

ostris (Author) · Jul 5, 2023

@psspsspsspssspss lol I tried. Just had another request for it as well. I did a ton of extractions at various sizes and with different algorithms, and the perfectionist in me just wasn't happy with the results. With LyCORIS extractions on the DreamBooths of people I do, I see near one-for-one results. This one has just been trained on so many concepts that it loses a lot. But I will get something out today for it. It will likely be a LyCORIS, as those results were significantly better (as usual). I may do a LoRA as well, but probably won't link it to this model, as it really just adds skin detail, which is fine.

ostris (Author) · Jul 6, 2023

@psspsspsspssspss Sorry it took a bit. LyCORIS is now available -> https://civitai.com/models/103848

moesah · Jun 28, 2023 · 2 reactions

I feel like this would be great for generating regularization images

sevenof9247 · Jun 28, 2023

a hint, i used it to train a face but it is not so good at the moment

ostris (Author) · Jun 28, 2023

I would love to know what all you tried; I have not had the time to test various training methods with it yet. My guess would be that it would perform far worse for "known" faces than other models, especially for textual inversion. Most models can get pretty close for most celebrities already, so it doesn't take much to tweak that knowledge slightly. However, training TIs for unknown people is a different story. Someone's fat balding uncle with missing teeth is extremely difficult to train on most models, as they don't know how to generate people like him. My guess is that training a TI in that case would be easier with this model, though that is purely a guess. I plan to test soon.

sevenof9247 · Jun 29, 2023

Alright, we are all running in darkness here ;)

I have tried several models, such as icantbelive, epicrealism and cyberrealism, and trained with all of them as described in my training guide (article) here. So far your model always produces blurred images, but only during learning and with the finished LoRA. As a normal model everything is okay ...

Gairm · Jun 29, 2023

@sevenof9247 this is good to know, as I do some face training on different models as well. Also, you didn't link whatever article you tried to link, would love to give that a read through :)

sevenof9247 · Jun 29, 2023

@Gairm https://civitai.com/user/sevenof9247/articles, give me more hints if you have ;)

_Qing_ · Jun 28, 2023

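Wow! This looks really impressive; it must have been a lot of work. I hope the next version manages to fix the problem of too many close-up headshots, and I hope the author can add people of all types and from all regions.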

piratekitty · Jun 28, 2023 · 3 reactions

I'm a bit sad that when I clicked this model, it wasn't actually a model based on Quark from DS9.

psspsspsspssspss · Jul 1, 2023 · 1 reaction

Well, that model would be called "hu-mons" ...

CyclopsGER · Jun 30, 2023 · 2 reactions

This model is fantastic!

7727 · Jul 2, 2023 · 6 reactions

This dataset is amazinggggg. I've been working on a model for a long time, merged it with this one and holy s**t.

Really hope you upload more!

1928587 · Jul 5, 2023 · 4 reactions

A nice and brave alternative to all those crappy illustration models and re-digested merges. Thank you.

biggerthanbig · Jul 5, 2023 · 5 reactions

Awesome, it is great to see some regular people instead of 'photoshop' beauty. I'm a noob AI-wise, but would it be possible to make a LoRA/LyCORIS out of this to add on top of existing models? To pretty much filter out the 'beauty' of those checkpoints?

ostris (Author) · Jul 5, 2023

Yes and no. I have extracted LoRAs and LyCORISs from it trying to do this, and they do some neat stuff on other models, but not that desired effect. It mostly turns them into portraits and adds more skin texture, which is fine, but it does not "normalize" the people much. The issue is that extractions work great on small concepts and people when trained on 40 images for a few steps. This has been trained for hundreds of thousands of steps on a dataset of around 100k images. That is a lot more data to extract, and much of it gets lost.

That being said, I'll try to get one pushed out today. It will just take a little more testing.
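One way to see why so much gets lost: extractions of this kind typically approximate the change to each d x k weight matrix with two low-rank factors (d x r and r x k), which store r(d + k) numbers instead of d·k. The sketch below just does that arithmetic. The layer size and rank are illustrative assumptions, not the settings used for this model's extraction.

```python
def lora_fraction(d: int, k: int, rank: int) -> float:
    """Fraction of a d x k weight matrix's entries that a rank-`rank`
    factorisation (d x rank plus rank x k) actually stores."""
    full = d * k
    low_rank = rank * (d + k)
    return low_rank / full


# Hypothetical example: a 768 x 768 projection layer extracted at rank 32.
fraction = lora_fraction(768, 768, 32)
print(f"{fraction:.1%} of the full matrix")
```

A small fraction is plenty for a narrow concept learned from a few dozen images, but a fine-tune over a large, varied dataset changes the weights in many more directions than a low rank can hold.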

biggerthanbig · Jul 5, 2023

@Ostris Thanks for the explanation. Makes sense, seeing how a LoRA is only a fraction of the size of the model. I might be thinking way too simply about this, something along the lines of Zovya's age and gender slider embeddings, just for regular vs. supermodel people. I only dabble with SD, so I'll leave the technical stuff to the wonderful content creators like you. Looking forward to trying out the extraction. ;)

ostris (Author) · Jul 6, 2023 · 1 reaction

Just published a LyCORIS of this -> https://civitai.com/models/103848

7727 · Jul 6, 2023 · 1 reaction

Okay so I've been searching everywhere and can't find anything and hoped you might be able to provide insight!
How did you generate those trigger lists? Is there a way to extract that data from other models using the safetensor file?

Asking because your token counts list is a phenomenal resource.

ostris (Author) · Jul 6, 2023

Thank you! Extracting existing triggers would be hard, and probably not even feasible. You would have to brute-force dump random combinations of phrases into a text encoder and see what is "lighting up" on the output. Even then, it wouldn't be accurate. I know there are actual word lists you can get with the entire tokenizer vocabulary for the base models (2.1, 1.5, etc.), but those are single words. If you dig around the diffusers model repos on Hugging Face, you can find some.

For this list, some of it I added manually, but most of it was generated by fighting with ChatGPT (GPT-4). I have so many chats going on there and take some from all of them, but a good amount of the tokens came from this single conversation -> https://chat.openai.com/share/fe68ed48-d54d-4bc1-9495-ee088b953998 . It will at least give you some idea of the "prompts" I am using to get it to make them. I even have a request for a bash command in there to sort and remove duplicates. From there, you feed this into an interrogator like https://github.com/pharmapsychotic/clip-interrogator . I do a caption plus the top 32 tokens from the list per image, and then drop out a ton of them during training for variety.
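The last step described above (caption plus top 32 tokens, with many of them dropped during training) could look roughly like the following. This is a minimal sketch, not the author's code: the `build_caption` helper, the comma-joined format, and the `keep_prob` default are all hypothetical, since the comment does not say how many tokens were dropped or how.

```python
import random


def build_caption(caption, ranked_tokens, top_n=32, keep_prob=0.5, rng=None):
    """Join a caption with a random subset of its top-ranked tokens.

    keep_prob is a hypothetical value; the thread does not give the real
    dropout rate.
    """
    rng = rng or random.Random()
    # Remove duplicate tokens while keeping the ranking order, then keep
    # only the top_n best-ranked ones.
    unique = list(dict.fromkeys(ranked_tokens))[:top_n]
    # Keep each token independently with probability keep_prob.
    kept = [tok for tok in unique if rng.random() < keep_prob]
    return ", ".join([caption] + kept)
```

Calling this anew each time an image is sampled gives the same image a slightly different caption on every pass, which is the "variety" the reply refers to.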

wiiz · Jul 6, 2023

Hey, you mentioned you have around 100k images, but the steps are 300k which means 100k steps per epoch.

What was your batch size and steps per image, and how long did the training take? I am only asking since I'm looking to train on a similar number of images, but for animals, and there aren't many people who've fine-tuned on 10k+ images, so this info is very valuable (not so much the batch size).

ostris (Author) · Jul 6, 2023 · 1 reaction

Oh, I manually put that in, was in a hurry so did stupid math. I was doing a batch size of 6 most of the time, so I suppose I am actually around epoch 18, though I was doing weird stuff with the datasets where I would deactivate certain ones for a while if I noticed it was developing a bias, so I don't know the real numbers, but the step count was relatively accurate for this tuning.

Training time on this one is also weird because 1. I go days without training on it as I am also training a lot of other things that often took priority. 2. I was working on my training code and had some bad runs from some experiments I was doing with it. 3. My main gpu was bad but only intermittently crashed so I had a lot of interruptions and lost training.

That being said, to get to this level of training, I would estimate around 72 hours total on a 3090 Ti. If you are doing animals, you could probably get by with much higher learning rates and fewer bucket sizes than I was using. I was going low and slow because I was doing human faces.

What is your project for? With a large dataset, I assume you are not just teaching it to make dogs. Are you trying to be able to replicate people's pets and want a model with a good enough understanding to do it?
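The "around epoch 18" figure in the reply above follows from the numbers quoted in the thread: roughly 300k steps at a batch size of 6 over roughly 100k images. A minimal sketch of that arithmetic, with the `epochs_seen` helper being an illustrative name:

```python
def epochs_seen(steps: int, batch_size: int, dataset_size: int) -> float:
    """Approximate number of full passes over the dataset."""
    return steps * batch_size / dataset_size


# Figures quoted in the thread: ~300k steps, batch size 6, ~100k images.
print(epochs_seen(300_000, 6, 100_000))  # 18.0
```

This ignores the periods where parts of the dataset were deactivated, so it is only an upper-level estimate, as the reply itself notes.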

sqarkleming228 · Jul 31, 2023

@Ostris Hi, the model you trained is fantastic. I am doing a similar job, fine-tuning a specific style model on 100k~200k images, but I am not satisfied with the faces generated by my model, even though I have tried many fine-tuning parameters. Can you share the learning rate and learning rate scheduler you used? Best wishes to you.

ktiseos_nyx · Jul 7, 2023

NOW this i can stand behind <3

magnusnielsen · Jul 7, 2023

Amazing. A model like this was already needed, so that we don't have to put the word ugly in the prompt or the word beautiful in the negative prompt. I have been using it and I can see great results in close-ups, but not in photographs where the face is not in the foreground; even with After Detailer (ADetailer) I could not fix the face and hands. It also sometimes generates images that look like poorly photoshopped photomontages. I hope you continue to improve this model because it really is great. I congratulate you and thank you for the work behind it.

crnclashers868 · Jul 7, 2023 · 2 reactions

hey,
i have 160k images from unsplash and pexels, they are very high quality, please share your discord so we can discuss it further
mine: qwerty_qwer

antonio_inverness · Jul 7, 2023 · 2 reactions

This is SO GOOD! I love this model. While it's true that my current "supermodel"-trained embeddings don't perform the same way in this model, they nevertheless provide a beautiful alternative view of the same characters. For me, I'd say they render the character the way an actor would look between film roles, when they've put on a little weight and can't be bothered with makeup. They're recognizable, but look more "normal".

achttag477 · Jul 12, 2023

Unconventionally beautiful, at last. Thank you

DanseMacabreJester · Jul 24, 2023 · 1 reaction

Many thanks for the model!! I hate having to use tricks to get ordinary people's faces that aren't Barbies and Kens.

Art3mis_Civitai · Aug 1, 2023 · 1 reaction

This looks great, but seems incompatible with prompts that create specific camera looks.

I either get completely malformed faces or eyes and teeth that look like something straight out of a horror movie.

I know it's an early version so I can understand that, but I was wondering if you have any tips on prompts to prevent this?

vivalabadila · Aug 15, 2023

My results look nothing like the examples either, even with the same prompts and settings.

HitManLee · Aug 6, 2023 · 2 reactions

And you, are my true hero, how valuable and meaningful all this work of yours is, the community will be stronger. Looking forward to your even greater work in SDXL.

polisonico · Sep 8, 2023

any plans to make a SDXL version?

ostris (Author) · Sep 8, 2023

I have been training an SDXL version off and on since the day 1.0 came out. I am not happy with it yet. SDXL is so far from being able to make realistic-looking photos as it is that fine-tuning it to make something like this is just going to take a lot of work. The current progress I have is better than any realistic models I have seen released for it so far, but still nowhere near the realism of the average 1.5 model. It will get there eventually. I have another GPU arriving in a few days that should allow me to keep one focused on fine-tuning XL while I use my other for my research.

Janet · Feb 3, 2024

@ostris any progress on this? You could train over a merge of some of the newer models, juggernaut8 / hellowWorld4, Think Diffusion. I imagine training a humans over those would be amazing.

magnusnielsen · Sep 28, 2023 · 7 reactions

Hello, Ostris. Humans is one of the models that I use the most to generate face portraits or to inpaint faces. I have read that you are planning to release a second version; is there any news about it? And I agree that SDXL still doesn't achieve the realism that SD 1.5 is capable of. Thank you!

Janet · Feb 3, 2024 · 2 reactions

Still one of the best, even with the newer SDXL models.

Would sooo love if you trained SDXL on the same data (or if you can't, share your dataset and I'll train it for you)!!

Janet · Feb 19, 2024

Cascade is out now!!! Please please please train using this exact same dataset on cascade!! If you're not able to, I could do it for you. This was an incredible model!!

Checkpoint

SD 1.5

by ostris
