Warrior maiden outfit - CivArchive (CivitAI Archive)

This LoRA is the result of an experiment. The long story comes after this introduction, but basically it is a full outfit for a fantasy female warrior: a light dress with armor parts.

It is a rebuild of a LoRA that is no longer available (Callis_Armored_Dress_Illu_Edition). I generated a few pictures with it, since I still had it on my HDD, and used those pictures for training (the training data is provided for reference).

Trigger word (all included): warrior maiden outfit

The long story (TM):

This all started with a conversation with @RisingV about a technique called "B-LoRA". The main idea is that not all parts of a LoRA are relevant: most of the "concept" and "style" live in one specific block. Something similar is also proposed here for Flux training.

Since I was not going to train a LoRA with a dedicated tool, I wanted to see if I could "prune" an existing LoRA and keep the best part. I needed a toy LoRA for that, and an outfit LoRA is the best candidate. Instead of building a new outfit idea from scratch (which could just be a good prompt), I used a LoRA that is no longer available to create a few pictures and trained my toy on-site. Here is the original thumbnail for this LoRA.

The dataset is 28 pictures, and here is an example (all pictures were made using only New Mecha, to bake the style in and see if I could get rid of it afterward).

The training was done with a small rank to limit the size of the LoRA, with Neural Lens Core as the base. Here are the parameters:

{
  "engine": "kohya",
  "unetLR": 0.0001,
  "clipSkip": 2,
  "loraType": "lora",
  "keepTokens": 0,
  "networkDim": 16,
  "numRepeats": 4,
  "resolution": 1024,
  "lrScheduler": "cosine",
  "minSnrGamma": 0,
  "noiseOffset": 0.03,
  "targetSteps": 896,
  "enableBucket": true,
  "networkAlpha": 4,
  "optimizerType": "Prodigy",
  "textEncoderLR": 0,
  "maxTrainEpochs": 8,
  "shuffleCaption": false,
  "trainBatchSize": 1,
  "flipAugmentation": false,
  "lrSchedulerNumCycles": 1
}
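
As a quick sanity check on these settings, targetSteps follows from the dataset size: 28 pictures, 4 repeats, 8 epochs, batch size 1. The sketch below redoes that arithmetic and also computes the alpha/dim factor that kohya-style LoRAs usually apply to their weights (that scaling convention is an assumption here, it is not stated in the parameters themselves). The helper name is made up for illustration.

```python
import math

def total_steps(num_images, num_repeats, epochs, batch_size):
    """Optimizer steps for a run where each image is seen num_repeats times per epoch."""
    steps_per_epoch = math.ceil(num_images * num_repeats / batch_size)
    return steps_per_epoch * epochs

# Values copied from the parameters above; 28 is the dataset size.
params = {
    "numRepeats": 4,
    "maxTrainEpochs": 8,
    "trainBatchSize": 1,
    "networkDim": 16,
    "networkAlpha": 4,
    "targetSteps": 896,
}

steps = total_steps(28, params["numRepeats"], params["maxTrainEpochs"], params["trainBatchSize"])
# Assumed convention: the learned update is multiplied by alpha / dim.
lora_scale = params["networkAlpha"] / params["networkDim"]

print(steps, lora_scale)  # 896 0.25
```

So the 896 steps are exactly 112 steps per epoch over 8 epochs, and epochs 3 and 4 correspond to steps 336 and 448.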

I merged epochs 3 and 4 together to get a base to work with (the sample pictures already showed traces of overcooking). Here is what I got with it:

It was kind of overcooked and had some issues (the face is a bit awkward, the cloth texture looks "plastic", strange stuff happens with the hands, and there are two swords instead of one).

Now, this LoRA uses the "SGM" naming convention for its layers:

>>> [k for k in l if k.startswith("lora_unet")][0:10]
['lora_unet_input_blocks_4_1_proj_in.alpha', 'lora_unet_input_blocks_4_1_proj_in.lora_down.weight', 'lora_unet_input_blocks_4_1_proj_in.lora_up.weight', 'lora_unet_input_blocks_4_1_proj_out.alpha', 'lora_unet_input_blocks_4_1_proj_out.lora_down.weight', 'lora_unet_input_blocks_4_1_proj_out.lora_up.weight', 'lora_unet_input_blocks_4_1_transformer_blocks_0_attn1_to_k.alpha', 'lora_unet_input_blocks_4_1_transformer_blocks_0_attn1_to_k.lora_down.weight', 'lora_unet_input_blocks_4_1_transformer_blocks_0_attn1_to_k.lora_up.weight', 'lora_unet_input_blocks_4_1_transformer_blocks_0_attn1_to_out_0.alpha']

This is not ideal, since most other tools use the diffusers convention and the B-LoRA code refers to up_blocks_0 (the first output block after the middle block, if I am not mistaken). With a bit of work, I used code from diffusers to convert it to a format I could read:

>>> [k for k in l if k.startswith("lora_unet")][0:10]
['lora_unet_down_blocks_1_attentions_0_proj_in.alpha', 'lora_unet_down_blocks_1_attentions_0_proj_in.lora_down.weight', 'lora_unet_down_blocks_1_attentions_0_proj_in.lora_up.weight', 'lora_unet_down_blocks_1_attentions_0_proj_out.alpha', 'lora_unet_down_blocks_1_attentions_0_proj_out.lora_down.weight', 'lora_unet_down_blocks_1_attentions_0_proj_out.lora_up.weight', 'lora_unet_down_blocks_1_attentions_0_transformer_blocks_0_attn1_to_k.alpha', 'lora_unet_down_blocks_1_attentions_0_transformer_blocks_0_attn1_to_k.lora_down.weight', 'lora_unet_down_blocks_1_attentions_0_transformer_blocks_0_attn1_to_k.lora_up.weight', 'lora_unet_down_blocks_1_attentions_0_transformer_blocks_0_attn1_to_out_0.alpha']
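
The renaming shown in the two listings above can be sketched as a small index conversion. This is not the diffusers code itself, only a minimal illustration that assumes an SDXL-style UNet (2 layers per block, so 3 SGM entries per diffusers block) and that only attention keys (sub-module index 1 in the SGM names) are present. The function name is made up.

```python
import re

LAYERS_PER_BLOCK = 2  # assumption: SDXL-style UNet

def sgm_to_diffusers_key(key):
    """Rename one SGM-style attention LoRA key to the diffusers-style name."""
    per_block = LAYERS_PER_BLOCK + 1
    m = re.match(r"lora_unet_(input|output)_blocks_(\d+)_1_(.+)", key)
    if m:
        kind, idx, rest = m.group(1), int(m.group(2)), m.group(3)
        if kind == "input":
            # input_blocks_0 is the first convolution, hence the -1 offset
            block, layer = divmod(idx - 1, per_block)
            return f"lora_unet_down_blocks_{block}_attentions_{layer}_{rest}"
        block, layer = divmod(idx, per_block)
        return f"lora_unet_up_blocks_{block}_attentions_{layer}_{rest}"
    m = re.match(r"lora_unet_middle_block_1_(.+)", key)
    if m:
        return f"lora_unet_mid_block_attentions_0_{m.group(1)}"
    return key  # text encoder keys and anything else are left untouched

print(sgm_to_diffusers_key("lora_unet_input_blocks_4_1_proj_in.alpha"))
# lora_unet_down_blocks_1_attentions_0_proj_in.alpha
```

With this mapping, the SGM blocks output_blocks_0 to output_blocks_2 are the ones that end up under up_blocks_0.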

From here, it was time to play Frankenstein:

Reduce the TE influence (I found out it yields better results across models this way)
Prune all layers except lora_te and lora_unet_up_blocks_0

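>>> import math
>>> from safetensors.torch import save_file  # assumed source of save_file
>>> keys = list(lora.keys())
>>> for k in keys:
...   # scale lora_down and lora_up by sqrt(0.8) so that their product is scaled by 0.8
...   if k.startswith("lora_te") and k.endswith("weight"):
...     lora[k] = lora[k] * math.sqrt(0.8)
...   # drop every UNet layer outside up_blocks_0
...   if not k.startswith("lora_te") and not k.startswith("lora_unet_up_blocks_0"):
...     _ = lora.pop(k)
...
>>> save_file(lora, "warrior_maiden_outfit.safetensors")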
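
Why sqrt(0.8) and not 0.8? The loop multiplies every text encoder key ending in "weight", which means both lora_down.weight and lora_up.weight. The change a LoRA layer applies is proportional to the product of those two matrices, so giving each one the square root scales the product by 0.8. A tiny pure-Python check with made-up 2x2 matrices (the values are arbitrary, not taken from the LoRA):

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def scale(m, s):
    """Multiply every entry of matrix m by the scalar s."""
    return [[v * s for v in row] for row in m]

# Arbitrary stand-ins for lora_up.weight and lora_down.weight
up = [[1.0, 2.0], [3.0, 4.0]]
down = [[0.5, -1.0], [2.0, 0.25]]

factor = math.sqrt(0.8)
delta = matmul(up, down)
scaled_delta = matmul(scale(up, factor), scale(down, factor))

# Element-wise ratio between the scaled and the original update
ratios = [s / d for s_row, d_row in zip(scaled_delta, delta) for s, d in zip(s_row, d_row)]
print(ratios)  # every entry is 0.8 up to floating point rounding
```

The .alpha keys do not end in "weight", so they are left alone and do not add a second scaling on top.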

And here is the result:

Most of the strange stuff is gone, but the LoRA still works perfectly! It also went from 109MB to 65MB (that's expected: the number of UNet keys went from 722 to 306).
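
The 722 and 306 figures can be cross-checked against the model layout. The sketch below assumes an SDXL-style UNet (Illustrious checkpoints are SDXL-based) and a LoRA that adapts only the linear layers of the attention blocks. Under those assumptions the figures match the number of adapted modules, each of which is stored as three tensors (alpha, lora_down.weight, lora_up.weight). The layout table reflects the usual SDXL configuration and is an assumption, not something read from this file.

```python
# Assumed SDXL layout: (attention modules, transformer blocks inside each) per UNet block
SDXL_ATTENTION_LAYOUT = {
    "down_blocks_1": (2, 2),
    "down_blocks_2": (2, 10),
    "mid_block": (1, 10),
    "up_blocks_0": (3, 10),
    "up_blocks_1": (3, 2),
}

# Per transformer block: attn1 (q, k, v, out) + attn2 (q, k, v, out) + ff (net_0_proj, net_2)
MODULES_PER_TRANSFORMER_BLOCK = 4 + 4 + 2
# Per attention module: proj_in and proj_out
MODULES_PER_ATTENTION = 2

def lora_modules(block):
    """Number of adapted modules inside one UNet block."""
    n_attn, depth = SDXL_ATTENTION_LAYOUT[block]
    return n_attn * (depth * MODULES_PER_TRANSFORMER_BLOCK + MODULES_PER_ATTENTION)

total = sum(lora_modules(b) for b in SDXL_ATTENTION_LAYOUT)
kept = lora_modules("up_blocks_0")
print(total, kept)  # 722 306
```

In other words, up_blocks_0 alone holds a bit over 40% of the adapted UNet modules, which is why keeping only that block still leaves a fairly large file.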

What about other checkpoints? Here it is on cyberrealistic_catalystIllust:

With the original LoRA:

With the pruned one:

Nice, the "New Mecha" influence on the face was "corrected" (in my opinion), but you may disagree 😉

Anyway, for outfits, it is a nice test. Now, the B-LoRA code from the paper goes even further and uses only part of up_blocks_0:

BLOCKS = {
    'content': ['unet.up_blocks.0.attentions.0'],
    'style': ['unet.up_blocks.0.attentions.1'],
}

(...)

target_modules = [f'{attn}.{mat}' for mat in ["to_k", "to_q", "to_v", "to_out.0"] for attn in attns]

But I didn't go all the way, maybe some other time (especially since the paper trains a DreamBooth LoRA, and most LoRAs aren't trained this way anymore) 😅
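
For reference, going that extra step on an already trained LoRA could look like the sketch below: keep only the attention projections (to_q, to_k, to_v, to_out_0) of up_blocks_0 attentions 0 ("content") or attentions 1 ("style"), using the converted key names shown earlier. This is untested on the actual LoRA and is not the B-LoRA training procedure; the function name and the sample keys are made up for illustration.

```python
# Key prefixes derived from the BLOCKS mapping of the B-LoRA code
BLORA_PREFIXES = {
    "content": "lora_unet_up_blocks_0_attentions_0_",
    "style": "lora_unet_up_blocks_0_attentions_1_",
}
ATTN_SUFFIXES = ("_to_q", "_to_k", "_to_v", "_to_out_0")

def select_blora_keys(keys, part):
    """Return the keys that belong to the attention projections of one B-LoRA part."""
    prefix = BLORA_PREFIXES[part]
    selected = []
    for key in keys:
        # "module.lora_down.weight" -> "module"
        module = key.split(".", 1)[0]
        if module.startswith(prefix) and module.endswith(ATTN_SUFFIXES):
            selected.append(key)
    return selected

# Hypothetical sample of key names, in the converted naming
sample_keys = [
    "lora_unet_up_blocks_0_attentions_0_transformer_blocks_0_attn1_to_k.lora_down.weight",
    "lora_unet_up_blocks_0_attentions_0_proj_in.lora_down.weight",
    "lora_unet_up_blocks_0_attentions_1_transformer_blocks_3_attn2_to_out_0.alpha",
    "lora_unet_up_blocks_0_attentions_2_transformer_blocks_0_attn1_to_q.lora_up.weight",
    "lora_te1_text_model_encoder_layers_0_mlp_fc1.alpha",
]

content_keys = select_blora_keys(sample_keys, "content")
style_keys = select_blora_keys(sample_keys, "style")
print(len(content_keys), len(style_keys))
```

Note that this drops proj_in, proj_out, the feed-forward layers and the third attention module of up_blocks_0, so the result would be much smaller than the pruned LoRA above.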

Thanks for reading! 🥰

