This is a ControlNet model! This model requires ControlNet!
Model Details
This model aims to allow users to modify what a subject is wearing in a given image while keeping the subject, background and pose consistent.
I've produced good results in txt2img, img2img and inpainting, both with images generated with Stable Diffusion and with pictures I've taken myself.
Installation
Place the .safetensors file into ControlNet's 'models' directory. To use the model, select the 'outfitToOutfit' model under ControlNet Model with 'none' selected under Preprocessor.
Tips for use
Images with a clearly defined subject tend to work better.
This model tends to work best at lower resolutions (close to 512px).
If you run into trouble at higher resolutions, try running a first pass at a lower resolution, then upscale with img2img (or txt2img w/ Hires.fix) at a lower denoising strength, while continuing to use your original input image as the input to this ControlNet model.
I recommend starting with CFG 2 or 3 when using ControlNet weight 1
Higher CFG values when combined with high ControlNet weight can lead to burnt looking images.
Experiment with ControlNet Control Weights 0.4, 0.45, 0.5, 0.6, 0.8 and 1.
A lower weight allows for more changes, while a higher weight tries to keep the output similar to the input.
Anything below 0.5 seems to rely more on the Stable Diffusion model, whereas anything from 0.5 up seems to weight the ControlNet model more heavily.
When using img2img or inpainting, I recommend starting with a denoising strength of 1.
Experiment with a denoising strength of 0.75 as well.
When inpainting, I recommend trying "latent nothing" under Masked content
Consider lowering the model's weight when generating higher resolution images
The higher the resolution of the output image, the more difficult it tends to be to alter the content of the image from the input image
If the output isn't changing enough from the input, try increasing the weight of the prompts or decreasing the Control Weight of the ControlNet Unit
This model can work well alongside other ControlNet models such as OpenPose.
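The tips above amount to a small grid of settings to try, plus a size calculation for the low-resolution first pass. The following is a minimal sketch of both, not part of the model or of any web UI: the dictionary keys and function names are hypothetical labels, reusing CFG 2 and 3 at every Control Weight is an assumption (the tip only suggests them for weight 1), and reading "close to 512px" as the longer side, rounded to a multiple of 8, is also an assumption.

```python
from itertools import product

# Values taken from the tips above.
CONTROL_WEIGHTS = [0.4, 0.45, 0.5, 0.6, 0.8, 1.0]
CFG_SCALES = [2, 3]                # suggested for weight 1; reused for all weights (assumption)
DENOISING_STRENGTHS = [1.0, 0.75]  # only relevant to img2img and inpainting


def settings_grid(img2img=False):
    """Return every combination of the suggested settings as a list of dicts.

    The keys are hypothetical labels, not actual web UI parameter names.
    """
    denoise_values = DENOISING_STRENGTHS if img2img else [None]
    return [
        {"control_weight": w, "cfg_scale": c, "denoising_strength": d}
        for w, c, d in product(CONTROL_WEIGHTS, CFG_SCALES, denoise_values)
    ]


def first_pass_size(width, height, target_side=512, multiple=8):
    """Scale a target size down so its longer side is about `target_side`.

    Both sides are rounded to a multiple of `multiple` (assumed to be 8).
    Sizes that are already small enough are returned unchanged.
    """
    longest = max(width, height)
    if longest <= target_side:
        return width, height

    def snap(value):
        scaled = value * target_side / longest
        return max(multiple, round(scaled / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    for settings in settings_grid(img2img=True):
        print(settings)
    print(first_pass_size(1536, 1024))
```

Each dictionary is one combination to try by hand in the UI. The size returned by `first_pass_size` is for the initial low-resolution pass, after which the upscale step runs at the full target size with a lower denoising strength.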
Comments (29)
Let me know what you think!
I've been working on this concept for a while and this is the first version where I felt really good about the results and thought it would be worth sharing.
Great!
I think you're a genius!
What dataset did you train on? I can't understand how it's done )
@Sa_May Thank you so much! I actually created the dataset myself using Stable Diffusion, Segment Anything and other ControlNet models using some custom scripts and fine tuned settings.
I'm considering posting everything along with my process at some point, but I may want to focus on creating more first.
@EmmyJ_ How long did it take you to put together the dataset? How many images? Did you work for a long time?
@Sa_May Developing the process took the longest. I've been working on this project for many weeks. However, once I figured everything out, I was able to generate the dataset and curate it over the course of maybe 4-5 days, and the training itself took a couple of days running on my 4090. Much of this time was just letting it run after I set it up, and I was able to curate the data over a few hours while I was listening to podcasts or watching TV.
The dataset I used to train this model consists of 10,156 images which worked out to 37,840 samples.
Each sample consists of a pair of images and a prompt. I generated the images in sets of 5 which gave me about 20 samples for each set (fewer in cases where there were poor quality images I needed to delete)
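The "about 20 samples" per set of 5 is consistent with counting every ordered (input, target) pair of distinct images in a set. That reading is an assumption, since the comment does not say how the pairs were formed. A quick arithmetic check under that assumption, with a hypothetical helper name:

```python
from itertools import permutations


def ordered_pairs(set_size):
    """Count ordered (input, target) pairs of distinct images in one set."""
    return len(list(permutations(range(set_size), 2)))


IMAGES = 10_156   # figure reported in the comment above
SAMPLES = 37_840  # figure reported in the comment above

print(ordered_pairs(5))             # 20: a full set of 5 images
print(ordered_pairs(4))             # 12: a set with one image deleted
print(round(SAMPLES / IMAGES, 2))   # 3.73 samples per image on average
```

Under this reading a full set gives 4 samples per image and a set of 4 gives 3 per image, so the reported average of about 3.73 fits a dataset where most sets were complete and some lost images during curation.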
@EmmyJ_ Thank you! This is really very interesting! I'd be happy to chat with you more on this topic )
Do we need that "ClothingAdjuster" Lora that you are using in all the images as well?
It's not required, and I've gotten many good results without it, but I've found that it can be useful in some circumstances (e.g. full costumes), so I thought I'd mention it
Which version of controlnet is required?
Tbh, I'm not sure what the minimum version requirement of ControlNet for this model would be, as I didn't think to test that, but I've been using the latest version of the 'sd-webui-controlnet' Extension for Automatic1111. The current version is '05ef0b1c'.
Seems to work on my slightly outdated 1.1.410, so all good, thanks.
If I may say so, I find the name a bit misleading. I expected it to transfer outfits (à la pose or depth) rather than keep everything else and rely on the prompt for outfits. It is still a handy tool, just that sometimes describing the details accurately can be difficult or laborious, whilst for pose or background there are arguably some options already.
@firemanbrakeneck Thanks for the feedback! I'll give it some more thought
Works great!
This is really useful, thanks!
Some issues to report:
The composition in the init influences the output, with any guess mode. Even with another depth controlnet enabled, parts of it still make it into the final image.
Color is not really factored in, the output might have it by coincidence, but more likely it bleeds over the rest of the image, or is another color entirely, independent of prompt.
Thanks for the feedback! I'll look into using the model with another depth ControlNet when testing future versions. If you'd like, you can show me an example image (or images) and describe what you're trying to get it to do.
As for color, I've found it can be tough to get it right. Even a prompt like "white shirt and black pants" seems to confuse Stable Diffusion most of the time, but I'll see what I can do there.
This works surprisingly well as a stabilizer for temporal coherency. When used together with an openpose or lineart controlnet, this makes for a nice video generation tool
First of all, thank you for your work. I'm setting up ControlNet as explained and using the parameters you suggest, but the subject keeps changing no matter what I do... I'm trying to change the outfit of a real photo using img2img. Any help appreciated.
If your subject is changing too much you can try:
- Using this model in combination with inpainting
- Raising Control Weight
- Lowering CFG
- Lowering Denoising Strength
- Lowering prompt weights
Unfortunately, some pictures just work better than others and img2img with real pictures can sometimes require a lot of tweaking. Working on making improvements for future versions. If you'd like to provide an example image that's similar to the one you're working with I can give more suggestions
After updating to the latest ControlNet I keep getting a message that the model's version can't be identified. The model still works, but I thought I'd mention it.
Thanks for the heads up. Not sure what that might be about.
The clothing change effect is really impressive, especially when combined with inpainting. Could I ask how you ensure that the background and the person remain unchanged? Do you mask the clothes during the training process? Thanks very much!
One of the first models I trained involved manually inpainting images to create a small dataset. The model worked decently well but was more limited and required more tweaking to get any decent results. However, I was able to use that first model to help create the dataset for the model you see here. For the dataset I used to train v1.0 of this model I would generate an image, mask out the subject (man or woman) using Segment Anything, and then inpaint using img2img, the mask of the subject and the first model I trained. I was able to code it to run automatically and generated ~14,000 1024x680 images. I then went through and deleted any images that looked bad, resized them to 512x340 and trained v1.0.
Since then I've been exploring improving the dataset and training process and may end up training a v1.1 before creating a whole new dataset for v2.0.
Great, useful control net for animations. It actually works as well as or better than depth controls. Well done.
Could you make a tutorial for your workflow? It's too complicated.
Do you have any ComfyUI workflow for images?
At this time I do not. I imagine you could start with another workflow that uses ControlNet and then swap the model being used. Keep in mind that other ControlNet models like Depth have a preprocessing step that uses something like MiDaS to estimate depth before sending it to ControlNet. This model doesn't have a preprocessing step, so my best guess is that in Comfy you would just bypass or delete that part of the workflow and use a raw input image.
It just works with ComfyUI. Just add a ControlNet loader and give the input image to it.
Just want to say that this ControlNet works flawlessly in ComfyUI; however, in my experience the prompt "the same man/woman" generates awful results, so I simply repeat the original prompt and insert "wearing ...".