This is an experiment to see whether I can make a TI embedding that captures the flavour of konyconi’s BohoAI LoRA.
https://civitai.com/models/51966/bohoai
Thank you to @konyconi for sharing his dataset for the excellent BohoAI LoRA.
https://civitai.com/models/52697/tutorial-konyconi-style-lora
The showcase uses 2 models:
revAnimated_v122.safetensors [4199bcdd14] with clip skip = 2
avalonTruvision_v2.safetensors [a4df55d292] with clip skip = 1
This TI can produce some decent Boho pix, but it sometimes gets confused... e.g. asking for a spaceship and getting a truck. Perhaps this sort of TI needs many more pictures in the training dataset, with more subject variation?
---------------------------
Update 09 May 2023
Continued the training to step 4000, and then 5000.
kcboho07-4000 produces a stronger Boho style.
kcboho07-5000 is stronger again but has increased duplication/repetition. e.g. more fingers, more hands, duplicate cities floating in the sky.
Tried 6000 steps but it’s even worse - overcooked.
I’ve uploaded the 4000 step version as, probably, the best result for this experiment.
Also uploaded the 5000 step version since it can produce nice results with careful object prompts.
---------------------------
I’ve been struggling to work out how to make a style TI:
What makes a good training dataset?
What training settings should I use in automatic1111?
How long to cook the TI for?
For my training dataset I copied konyconi’s 76 1024x1024 images to a new folder without the associated TXT files, and reduced them all to 512x512. Then I renamed them “01 aeroplane.png”, “02 city.png”, “03 tank.png” etc.
Why? Because I was trying to match what I’ve done in the past for TIs that ended up usable. The reduced images dataset folder is what I used in the settings below.
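The resize-and-rename step can be sketched in Python with Pillow. The folder names, the numbering scheme, and the assumption that the originals are PNGs are my own placeholders, not the actual files:

```python
from pathlib import Path

from PIL import Image  # Pillow


def prep_dataset(src: Path, dst: Path) -> list[Path]:
    """Downscale every PNG in src to 512x512 and save it to dst
    as a "01 aeroplane.png"-style numbered file (no TXT captions)."""
    dst.mkdir(parents=True, exist_ok=True)
    out = []
    for i, p in enumerate(sorted(src.glob("*.png")), start=1):
        img = Image.open(p).convert("RGB").resize((512, 512), Image.LANCZOS)
        target = dst / f"{i:02d} {p.stem}.png"
        img.save(target)
        out.append(target)
    return out


# e.g. prep_dataset(Path("bohoai_1024"), Path("bohoai_512"))
```

The numbered-filename convention matters because, with the TXT captions removed, the filename is the only caption text the trainer has left to work with.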
The wikipage for automatic1111 Textual Inversion is here:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Textual-Inversion
but it’s well out of date: last revised Jan 5, and I’m writing this on May 8.
I found this thread useful in parts. It’s a long read!
https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/1528
Training model: v1-5-pruned.ckpt [e1441589a6]
I used this because I don’t know any better, and it’s been useful in the past. Should I be using a different model for training, or is the base SD15 the best thing to use? No idea.
Create embedding:
name: kcboho07
initialization text: boho style photo
number of vectors per token: 4
Train embedding:
Embedding name: kcboho07
Embedding Learning Rate: 0.001:250, 0.0005:500, 0.00075:1000, 0.001
Gradient Clipping: disabled
Batch size: 1
Dataset directory: wherever you’ve put it on your computer
Log directory: textual_inversion
Prompt template: minimum_style_2.txt
The template has 3 lines:
<<<
[name] style, [filewords]
[name] style, a photo of [filewords]
[name] style, an illustration of [filewords]
>>>
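As I understand it, during training the webui picks a template line and substitutes [name] with the embedding name and [filewords] with the words from the image filename (leading numbers stripped), which is why the "01 aeroplane.png" naming works. Here’s my own rough sketch of that substitution, not the webui’s actual code:

```python
import random
import re


def fill_template(lines: list[str], name: str, filename: str) -> str:
    """Pick a random template line and fill in [name] and [filewords].
    [filewords] is the image filename minus extension and leading digits,
    e.g. "02 city.png" -> "city" (my reading of the webui's behaviour)."""
    words = re.sub(r"^[-\d]+\s*", "", filename.rsplit(".", 1)[0])
    line = random.choice(lines)
    return line.replace("[name]", name).replace("[filewords]", words)


template = [
    "[name] style, [filewords]",
    "[name] style, a photo of [filewords]",
    "[name] style, an illustration of [filewords]",
]
# e.g. fill_template(template, "kcboho07", "02 city.png")
#      gives something like "kcboho07 style, a photo of city"
```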
Width = Height = 512
Do not resize images: OFF
Max steps: 3000
Save image steps: 25
Save embedding steps: 25
Use PNG alpha channel: OFF
Save images with embedding in PNG chunks: ON
Read parameters from txt2img tab: OFF
Shuffle tags: OFF
Drop out tags: 0
Latent sampling method: deterministic
Training time: about 50 minutes per 1000 steps on an RTX 2060 (6 GB).
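On the learning-rate line: as I understand automatic1111’s schedule syntax, each "rate:step" pair applies that rate up to the given step, and a trailing bare rate runs from there to the end. A toy re-implementation of that reading (my own sketch, not the webui’s parser):

```python
def lr_at_step(schedule: str, step: int) -> float:
    """Return the learning rate a schedule string like
    "0.001:250, 0.0005:500, 0.001" would apply at a given step."""
    for part in schedule.split(","):
        if ":" in part:
            rate, until = part.split(":")
            if step <= int(until):
                return float(rate)
        else:
            return float(part)  # final rate, no end step: runs to the end
    raise ValueError("schedule has no final rate")


sched = "0.001:250, 0.0005:500, 0.00075:1000, 0.001"
# step 100 -> 0.001, step 400 -> 0.0005, step 800 -> 0.00075,
# and anything past 1000 -> 0.001
```

So the schedule above starts fast, slows down through the middle of training, then speeds back up from step 1000 onwards.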
The TI at 3000 steps does produce a Boho style, although I think it’s a bit hit-and-miss compared to the BohoAI LoRA.
If anyone has suggestions about what I should be doing differently please add a comment. Or if I’m doing anything obviously stupid! :-)
kcboho07-4000.pt produces a stronger Boho style. You can rename the PT file to change the trigger word if you want.