Updated for 2.0 - SDXL and Anima (Flow) models. It is still a bit of a mess, but sections have been moved into Subgraphs and GetNode/SetNode is used to cut down on the clutter. Some areas do not support these nodes, so they still use long reroutes.
This is a major upgrade. Since Anima is a Unet model, you can now toggle between a Checkpoint (SDXL/Illustrious/etc) and a Unet model. The Turbo lora is also toggled when switching models. Unfortunately, the full lora stack is not, so be careful.
The image input section has not changed much from the 1.x series, other than the Size and Orient block now being in a subgraph.
There is also extensive use of Spectrum forecasting to speed up rendering for both SDXL and Anima. A first-round slow render uses 30 steps by default, and each refining/detailer pass uses 20 steps.
Another big addition is the optional LLM node. It can convert a set of tags to natural language input or enhance natural language input you already have. See the notes for details. It currently uses Ollama with dolphin-mistral. The detailers have been upgraded to use SAM3 for everything except the NSFW detailers. For faces and people, the yolo8 and SAM3 results are compared, and the one with the most segments is selected and sent for processing.
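The yolo8 vs. SAM3 selection can be sketched roughly as below. This is only an illustration in plain Python, not the actual node logic: the function name is hypothetical, and the tie-break toward SAM3 is an assumption since the notes do not say what happens when both detectors find the same number of segments.

```python
from typing import Sequence, Tuple, TypeVar

Segment = TypeVar("Segment")


def pick_detector_with_most_segments(
    yolo_segments: Sequence[Segment],
    sam_segments: Sequence[Segment],
) -> Tuple[str, Sequence[Segment]]:
    """Return the name and segments of whichever detector found more.

    Assumption: on a tie, SAM3 wins. The workflow notes do not specify this.
    """
    if len(yolo_segments) > len(sam_segments):
        return "yolo8", yolo_segments
    return "sam3", sam_segments


# yolo8 found two faces, SAM3 found three, so the SAM3 result is used.
name, chosen = pick_detector_with_most_segments(["f1", "f2"], ["f1", "f2", "f3"])
```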
On the output, you can now choose between the AuraSR block and the Upscale by Model block. The save file is selected automatically so that the right model name (Checkpoint or UNET) is used for the filename.
As a side note, Anima seems to produce better hands in general, so the first-pass hand detailer and second pass are typically not needed to fix hands before running the second hand detailer.
Most everything else from 1.x notes applies.
NOTE: Preview images were done with and without detailers and LLM conversion. They also toggle between Anima and Illustrious. When you toggle models, the model-specific General tags also swap for the positive and negative prompts.
1.x series notes:
My workflow for SDXL models. It is a bit messy but functional. Has the following high level features:
Text to Image and Image to Image from multiple sources
I2I can be from random Danbooru posts, image file, image folder, URL, or video with optional autotagging.
Wildcard support
Randomized orientation and image ratio
Prompt list to create a series of images from a file
Batch or single image generation
Multiple detailers with support for detailing of background characters and faces and NSFW areas
Uses a mix of LCM and standard samplers to speed up generation
General Notes:
Default sampler is euler_ancestral or lcm for the fast render (turbo) or dpmpp_2m for everything else. This is mostly because it is deterministic, so when trying out different tag prompt changes you can see the difference from the prompt and not the sampler. In normal use, you can change to whatever you prefer. For the detailers, there is a single set of settings for the slow and fast sections that will change the settings for all detailers (hands, face, person, nsfw).
Also, for the person and face detailers, there is an automatic tiered approach to reduce the size of people or faces that are background figures. This cuts down on time and also makes sure that the background characters do not become super detailed and look odd. The largest segment is found, and the remaining segments are then split into tiers that are 50% smaller and 75% smaller than the largest segment. The max size for the detailer is also tiered, so the smaller segments will be size limited.
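A rough sketch of the tiering idea, in plain Python rather than nodes. Several things here are assumptions and not taken from the workflow: "smaller" is measured by bounding-box area, the tier names are made up, and the per-tier max sizes in `TIER_MAX_SIZE` are placeholder values.

```python
from typing import Dict, List, Tuple

# (x, y, width, height) bounding box of a detected person or face.
Box = Tuple[int, int, int, int]

# Hypothetical per-tier detailer size caps in pixels (not the workflow's values).
TIER_MAX_SIZE: Dict[str, int] = {"large": 1024, "medium": 768, "small": 512}


def box_area(box: Box) -> int:
    return box[2] * box[3]


def tier_segments(boxes: List[Box]) -> Dict[str, List[Box]]:
    """Group boxes by area relative to the largest box.

    large:  more than 50% of the largest area
    medium: 50% smaller (at most half, but more than a quarter)
    small:  75% smaller (at most a quarter)
    """
    tiers: Dict[str, List[Box]] = {"large": [], "medium": [], "small": []}
    if not boxes:
        return tiers
    largest = max(box_area(b) for b in boxes)
    if largest == 0:
        # Degenerate boxes only; treat them all as background-sized.
        tiers["small"] = list(boxes)
        return tiers
    for b in boxes:
        ratio = box_area(b) / largest
        if ratio > 0.5:
            tiers["large"].append(b)
        elif ratio > 0.25:
            tiers["medium"].append(b)
        else:
            tiers["small"].append(b)
    return tiers


# One foreground figure and three progressively smaller ones.
boxes = [(0, 0, 100, 100), (0, 0, 60, 100), (0, 0, 40, 100), (0, 0, 20, 100)]
tiers = tier_segments(boxes)
```

Each tier would then be sent to the detailer with its own cap from `TIER_MAX_SIZE`, so background figures get less resolution and less time than the main subject.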
Additional resources:
https://huggingface.co/ai-forever/Real-ESRGAN
Description
First release
Comments (4)
Where can I find the models for the NSFW detailer and logo remover?
Thanks for the question. I added links to models in the suggested resources.
Been searching nonstop for an I2I workflow and struggling :D, I am trying yours now (thank you btw for putting the time in and creating one).
I am struggling though to follow your instructions. I am trying to use local images on my computer. Do I reload "Input 2 - load image" and then just keep reloading the nodes that are connected to it to make it work? - sorry, I am sure it's pretty straightforward but I am useless at this ^_^
Thanks for reaching out. I would say that this workflow is more complicated than you need if you are looking for a basic I2I workflow. My instructions also assume a basic understanding of Comfy, so my apologies. This is just a dump of my workflow that I have been tweaking, so it is not super organized like others, but I find it functional with a lot of flexibility.
At a basic level, any T2I workflow can be converted to an I2I workflow by replacing the empty latent with the Load Image node (passed through a VAE Encode node so it becomes a latent) and lowering the denoise level on the Ksampler node. For the denoise level, the simplistic way to think about it is "How much do I want the output to be different from the input image?" So, for a denoise of 1.00 or 100% change, the output will be completely different from the input no matter what you feed it. That is what you want for T2I but not for I2I. If you lower the denoise to say 0.9 (90%), there will be drastic changes but the output will take hints from the input image. If you drop it to say 0.5 (50%), there will be some changes but the base image will be mostly maintained.
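For anyone who prefers to see that swap written out, here is a minimal sketch using ComfyUI's API-format workflow (a dict of node id to `class_type` and `inputs`). The node ids, file names, and the helper name `to_img2img` are made up for illustration, and the KSampler inputs are trimmed to the two that matter here. In stock ComfyUI the loaded image goes through a VAE Encode node before it reaches the sampler, so the sketch adds one.

```python
import copy
from typing import Any, Dict, List

Workflow = Dict[str, Dict[str, Any]]


def to_img2img(workflow: Workflow, sampler_id: str, vae_ref: List[Any],
               image_name: str, denoise: float = 0.5) -> Workflow:
    """Return a copy of `workflow` whose sampler starts from an encoded image."""
    if not 0.0 < denoise <= 1.0:
        raise ValueError("denoise must be in (0, 1]")
    wf = copy.deepcopy(workflow)
    # Hypothetical ids for the two new nodes.
    load_id, encode_id = "i2i_load", "i2i_encode"
    wf[load_id] = {"class_type": "LoadImage", "inputs": {"image": image_name}}
    wf[encode_id] = {
        "class_type": "VAEEncode",
        "inputs": {"pixels": [load_id, 0], "vae": vae_ref},
    }
    # Point the sampler at the encoded image instead of the empty latent.
    # The old empty latent node is simply left unconnected in this sketch.
    sampler_inputs = wf[sampler_id]["inputs"]
    sampler_inputs["latent_image"] = [encode_id, 0]
    sampler_inputs["denoise"] = denoise
    return wf


# Trimmed-down T2I graph; a real KSampler node has more inputs than shown.
t2i: Workflow = {
    "4": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "model.safetensors"}},
    "5": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 832, "height": 1216, "batch_size": 1}},
    "3": {"class_type": "KSampler",
          "inputs": {"latent_image": ["5", 0], "denoise": 1.0}},
}
# Output slot 2 of CheckpointLoaderSimple is the VAE.
i2i = to_img2img(t2i, sampler_id="3", vae_ref=["4", 2], image_name="example.png")
```

With `denoise=0.5` the result keeps most of the input image; raising it toward 1.0 moves the output back toward plain T2I behavior.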
For this workflow specifically, you will need to turn off the Booru Blank Bypass so you don't send a blank latent, and enable the Booru Image Bypass group. You will likely need to bypass some of the other inputs because they will stop generation without a valid input (URL, Video, Dir path). The other option is to just give them valid inputs. As you stated, for a local image file, Input 2 is correct. Select your image and set the switch to input 2. With the group enabled, it will lower the denoise level to 0.5 based on the constant float setting in the group. You can adjust it as needed using that constant float node, and it will feed the value into the Ksampler node. Another note: make sure you select a fixed aspect ratio and make it a portrait ratio with a width that is reasonable for your setup (832x1216, for example). This will dictate the shortest dimension for your first image. (Note: if you input a landscape picture, the workflow will change the orientation automatically [portrait to landscape] and keep the aspect ratio of the input image, scaled based on the shorter dimension.) Hopefully this helps, and good luck.
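As a rough illustration of that last note, this is how the output size could be worked out from the input image and the chosen short side. The function name is hypothetical, and snapping to a multiple of 8 is an assumption about what the sizing block does, not something confirmed by the workflow.

```python
from typing import Tuple


def scaled_size(input_w: int, input_h: int, short_side: int = 832,
                multiple: int = 8) -> Tuple[int, int]:
    """Scale the input so its shorter edge equals `short_side`.

    The aspect ratio and orientation of the input are kept. Both edges are
    snapped to the nearest multiple of `multiple` (an assumption).
    """
    shortest = min(input_w, input_h)
    if shortest <= 0:
        raise ValueError("image dimensions must be positive")

    def snap(value: int) -> int:
        scaled = value * short_side / shortest
        return max(multiple, round(scaled / multiple) * multiple)

    return snap(input_w), snap(input_h)


# A 1920x1080 landscape input with an 832 short side stays landscape.
landscape = scaled_size(1920, 1080)   # (1480, 832)
# The same picture rotated to portrait keeps 832 as the width instead.
portrait = scaled_size(1080, 1920)    # (832, 1480)
```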
