Nepotism • XII
The pinnacle of Flux evolution. Trained on 8.5 million images over 124 epochs and more than 2.1 million steps, Nepotism XII doesn't just improve on its predecessors: it redefines what's possible with Flux.
🔥 What’s New in XII
Massive-scale training across a vast, diverse dataset—every style and nuance captured.
Precision and polish leveled up: textures, lighting, composition—all sharper, richer, and more lifelike.
Unmatched prompt fidelity: higher style compliance and more nuanced interpretation; complex and simple prompts alike are followed faithfully.
Style spectrum master: effortlessly handles photorealism, anime, stylized art, abstraction, and hybrids—no overshoot, just precision following your intent.
Cleaner output: noise is largely eliminated and detail reigns, with only minimal artifacts remaining on highly intricate scenes and edge-case styles/concepts.
Stable as lightning: performance optimized for fast, consistent iteration—even on mid-range GPUs.
🚀 Why XII Crushes It
Ultra-deep training foundation means bigger learning volume → richer representation → more reliable outputs.
Next-gen DiT architecture refined to perfection—usability reaches new heights.
LoRA and CLIP synergy: ready for prompt tuning with minimal weight adjustments—compatible with all your favorite fine-tuned workflows.
Practical speed on real rigs: 20–32 steps in 15–20 s on a 4080, delivering near studio-grade results in under a minute per image.
⚙️ Recommended Setup
Steps: 20–32 (8–12 steps work too, but sacrifice some detail).
FluxGuidance: 2–4.5 (lower = more abstract, higher = more literal adherence to the prompt; I use 2.8 and 4.5).
LoRA Strategy: Start with vanilla; dial in low LoRA weights for precision tuning.
T5‑XXL: Use the Flan T5‑XXL for top contextual understanding.
CLIP L: A long-context CLIP L is essential. I recommend LongCLIP-GmP-ViT-L-14.
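For scripting, the recommended settings above can be collected into a small preset helper. This is a hypothetical sketch; the function and preset names are mine, not part of ComfyUI or this model:

```python
# Hypothetical preset helper for the settings above; the function and
# preset names are illustrative, not part of ComfyUI or this model.

def nepotism_xii_settings(preset: str = "quality") -> dict:
    """Return sampler settings matching the recommendations above."""
    presets = {
        # 20-32 steps, FluxGuidance 2-4.5 (the author uses 2.8 and 4.5)
        "quality": {"steps": 28, "flux_guidance": 2.8},
        # 8-12 steps also work, at the cost of some detail
        "fast": {"steps": 12, "flux_guidance": 2.8},
    }
    if preset not in presets:
        raise ValueError(f"unknown preset: {preset!r}")
    return presets[preset]
```

Swap the numbers for your own sweet spot; the ranges are the point, not the exact defaults.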
📊 Performance Snapshot (4080 GPU)
Cold load (no LoRA): ~1.0–1.1 s/it
With LoRA (warm): ~1.0–1.3 s/it
With LoRA (cold): ~2.0–3.5 s/it, quickly dropping after warm-up
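Given the per-iteration numbers above, per-image wall-clock time is roughly iteration time times step count, plus any one-off load cost. A minimal illustrative helper (the function is mine, not part of any tool):

```python
def estimated_seconds_per_image(steps: int, sec_per_it: float,
                                one_off_overhead: float = 0.0) -> float:
    """Rough wall-clock estimate: iteration time x step count, plus any
    one-off cost such as a cold model/LoRA load."""
    return steps * sec_per_it + one_off_overhead

# 25 steps at ~1.2 s/it (warm, with LoRA) comes out around 30 s/image,
# consistent with the ~30 s per image reported for a 4080 on this page.
```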
🎯 Ideal For
Content creators with mid-tier GPUs chasing FP16-level results
Artists and developers seeking broad style versatility and prompt fidelity
Workflows tight on time but unwilling to compromise on image quality
Your best outputs fuel my motivation for this project. Upload, show off, and help me make the next one even better!
(also accepting dataset donations, dm for requirements)
BONUS TOOLS:
Tenos Discord Generation Bot: An image generation bot that uses Comfy's API and Discord's API in a workflow format that focuses on creation over configuration.
Flux Prompt Crafter GPT: Crafts highly imaginative and visually detailed Flux prompts.
Bobs Latent Optimizer for ComfyUI: This custom node for ComfyUI is designed to optimize latent generation for use with FLUX, SDXL, and SD3 modes. It provides flexible control over aspect ratios, megapixel sizes, and upscale factors, allowing users to dynamically create latents that fit specific tiling and resolution needs.
Bobs LoRA Loader for ComfyUI: A custom LoRA loader node for ComfyUI with advanced block-weighting controls for both SDXL and FLUX models. Features presets for common use-cases like 'Character' and 'Style', and a 'Custom' mode for fine-grained control over individual model blocks.

Description
A merge of [dev] and [schnell]. Save it to your UNET folder; it is NOT a checkpoint.
Comments
Can you add more examples and comparisons, with generation data?
I think this is a good idea, in principle.
But I've only seen images with 20 steps, or no generation data, here so far.
I was able to get good images with only 12 steps with the [dev] model in ComfyUI.
For this to be worth it, it needs to produce good images at 6 to 8 steps.
I'm at work RN, but when I get home I got you [edit: late night, I will do this tomorrow; I am eepy]
If you're working at 6/8 steps you might be on the schnell one; this is a mix of dev and schnell, but I think it leans more dev, hence the higher step count. I use this as a UNet: I load the DiT v2 as a UNet, tell the loader it's an fp8 UNet, and also use the v2 CLIP L and the fp8 T5-XXL version, and with 25 steps of ddim/uniform I get, IMO, above-dev images! I want to find a fine-tuned t5v1.1 fp8 em5 model but I haven't had any luck yet. I know City69 made a T5 finetune but I can't find it! grrr... or I have it but can't use it right.
I'mma try this: https://huggingface.co/google/t5-v1_1-xxl/tree/main I know I wanted fp8, but meh, I've got 16 GB VRAM. I wish I knew how to put it in regular RAM... is there a way?
Hmm, I get good images from schnell at 2 steps using uni_pc, and good ones at 6 steps with euler on the base schnell model.
For 24 GB VRAM it would be best to mix Flux fp8 with T5-XXL fp16, right?
One of the benefits of this model merge is that there are very few differences between fp16 and fp8. You definitely can afford fp16 by the sound of it, but I don't know if you'll find better results that make the added wait time worth it. In my testing, fp8 was way better because I could make an image every 30 seconds versus every 90+ seconds, and the results at the same seed were very nearly identical.
Apologies for asking for so much, I'm trying to make this as flexible as possible and running into issues. Can the DiT model be used like a UNET safetensor? The results I get trying that are trash (see FUX Full Flexibility - Fully Flexible FUX | Stable Diffusion Workflows | Civitai). What am I doing wrong?
No worries mate; yeah, the DiT model should go into the UNET folder. Only the AIO versions go into checkpoints.
Also, you should use 1024x1024; 512 will not work as well with Flux.
@BobsBlazed Got it. It works only if I use one of the dimensions that you had originally baked into the drop down. Anything other than those and I get garbage. For example, I get really good results for 1536x640, but poor output at 1536x768. Guess that's one of the limitations of FLUX?
@aiProps It's mostly to do with the training data they used and less to do with the merge specifically, though I'm sure that did no favors in this regard. Many of the larger datasets that all these big players use cover a limited range of aspect ratios. In all of my workflows I use rgthree's SDXL aspect ratios; dude knows his stuff.
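One way to sidestep the aspect-ratio issue discussed above is to snap a requested ratio onto dimensions close to those in common aspect-ratio lists. A minimal sketch; the 64-pixel multiple and ~1 MP target are common community conventions (as in the SDXL ratio lists mentioned above), not guarantees about this merge:

```python
import math

def snap_resolution(aspect_w: int, aspect_h: int,
                    target_megapixels: float = 1.0,
                    multiple: int = 64) -> tuple[int, int]:
    """Choose a width/height matching the given aspect ratio, close to
    the target pixel count, with both sides rounded to a multiple of 64
    (a common convention in SDXL/Flux aspect-ratio lists)."""
    target_px = target_megapixels * 1_000_000
    width = math.sqrt(target_px * aspect_w / aspect_h)
    height = target_px / width

    def snap(value: float) -> int:
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)

# e.g. a 12:5 ratio lands on 1536x640, one of the sizes reported to work
```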
If you're ever feeling stuck, feel free to drop any of my images into Comfy; everything I upload to Civitai has the workflow attached.
So can you run this in 4 steps, or does it take more?
A minimum of 10 steps, but all the results below used 20 steps and took about 30 seconds each to generate on a 4080.