This is a Pony-based photographic model.

This model closely follows the behavior of 3x3mixXLtypeG and Yaminabe Pony.

For Dummies

Start with the sample image and modify from there.
Steps: 20
CFG scale: 5.5
Sampler: Restart
Schedule: Align Your Steps 32, or Karras
Size: 896x1152
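
As a rough illustration, the starting settings above can be written as a request payload for the AUTOMATIC1111 web UI's `/sdapi/v1/txt2img` endpoint. This is a sketch under assumptions: the field names follow that API as commonly documented, the `scheduler` field only exists in newer web UI versions, the Restart sampler must be available in your install, and the prompt string is a placeholder, not a recommended prompt.

```python
import json


def build_txt2img_payload(prompt: str) -> dict:
    """Return the starting settings from this page as a txt2img payload."""
    return {
        "prompt": prompt,          # placeholder; start from the sample image's prompt
        "steps": 20,
        "cfg_scale": 5.5,
        "sampler_name": "Restart",
        # Assumption: newer web UI builds accept a scheduler name here.
        # The page lists "Align Your Steps 32" as the other option.
        "scheduler": "Karras",
        "width": 896,
        "height": 1152,
    }


if __name__ == "__main__":
    # Print the payload instead of sending it, so the sketch has no network dependency.
    print(json.dumps(build_txt2img_payload("placeholder prompt"), indent=2))
```

The payload is only built and printed here; posting it to a running web UI is left out on purpose.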

For Beginners

Start from the sample image and change only the parts you need.
Steps: 20
CFG scale: 5.5
Sampler: Restart
Schedule: Align Your Steps 32, or Karras
Size: 896x1152

Versions

Version 0.1: Proof of concept. (deprecated)

Version 0.3: Used Lycoris for training. (deprecated)

Version 0.4: Reselected training images and improved randoseru. (deprecated)

Version 0.8: Merged with other models. (deprecated)

Version 1.0: Merged with other models.

Versions

Version 0.1: Proof of concept. (deprecated)

Version 0.3: Used the Lycoris format for training. (deprecated)

Version 0.4: Reselected training images and improved randoseru. (deprecated)

Version 0.8: Merged with other models. (deprecated)

Version 1.0: Merged with other models.

Licenses

Refer to the licenses of the models listed under Source.

Commercial use of models created by EEB is no longer permitted.

Licenses

Check the licenses of the source models.

Commercial use is prohibited.

Source

This model is derived from

Pony Diffusion V6 XL (training base model)
https://civarchive.com/models/257749/pony-diffusion-v6-xl
3x3mixXLtypeG v0.1
https://civarchive.com/models/567238?modelVersionId=632135
Yaminabe Pony v0.6
https://civarchive.com/models/409856/yaminabepony

0.3(3x3mixxltypeg_v01) + 0.7(yaminabepony_v006) + a Lycoris trained on original photo image files.
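
The 0.3 / 0.7 part of the recipe above reads like a weighted-sum checkpoint merge. The following is a minimal sketch of that idea on toy data, assuming each checkpoint is a mapping from tensor name to a flat list of floats; the function name and the toy tensors are hypothetical, and the Lycoris step is not covered. A real merge would operate on safetensors state dicts with a merge tool.

```python
def weighted_merge(a: dict, b: dict, ratio_a: float, ratio_b: float) -> dict:
    """Merge two toy 'checkpoints' as ratio_a * a + ratio_b * b, key by key."""
    if a.keys() != b.keys():
        raise ValueError("checkpoints must contain the same tensor names")
    merged = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [ratio_a * x + ratio_b * y for x, y in zip(a[name], b[name])]
    return merged


if __name__ == "__main__":
    # Toy stand-ins for the two source models; values are illustrative only.
    type_g = {"layer.weight": [1.0, 0.0]}
    yaminabe = {"layer.weight": [0.0, 1.0]}
    print(weighted_merge(type_g, yaminabe, 0.3, 0.7))
```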

Everyone in the training photos is over 20 years old.

Source

This model is derived from

Pony Diffusion V6 XL (training base model)
https://civarchive.com/models/257749/pony-diffusion-v6-xl
3x3mixXLtypeG v0.1
https://civarchive.com/models/567238?modelVersionId=632135
Yaminabe Pony v0.6
https://civarchive.com/models/409856/yaminabepony

0.3(3x3mixxltypeg_v01) + 0.7(yaminabepony_v006) + a Lycoris generated from original images

All subjects in the training images were 20 or older at the time the photos were taken.

This checkpoint model is outdated compared to other Pony-based photographic models as of Aug 4th, 2024. Using a newer, higher-quality model is strongly recommended. This model exists only for reference purposes.

~~Recommended Models for Photographic Image Generations~~

~~Yaminabe~~ ~~https://civarchive.com/models/409856/yaminabepony~~

~~3x3mixXLtypeG~~ ~~https://civarchive.com/models/567238?modelVersionId=632135~~

~~LUSTIFY~~ ~~https://civarchive.com/models/573152?modelVersionId=638929~~

~~This checkpoint model requires proper prompt adjustment to achieve a photographic image. Are you up to the challenge?~~

~~This is a photographic SDXL model derived from Pony (Ebara).~~

~~I'm releasing versions 0.1, 0.3, and 0.4 of this model as a feasibility study and proof of concept. I will be able to make version 1.0 if I can somehow find more time and money.~~

~~If you would like me to pursue a Pony-based photographic SDXL model further, a donation with a message can be made here:~~ ~~https://ko-fi.com/eeb_p~~ ~~I can use it to build a better model. Please bear in mind that there is no promise. I might run away with the money.~~

~~(More descriptions about proof of work will be written later.)~~

For Dummies

~~Recommended Setting~~

~~Steps: 24 for normal use, 48 for higher quality~~
~~CFG scale: 3~~
~~Sampler: Euler a~~
~~Size: 896x1152~~
~~Emphasis mode: No norm~~

~~Positive Prompt~~

score_9, score_8_up, score_7_up, best quality, masterpiece, source_anime, [photo, irl, real, realistic, ultrarealistic, photorealistic, natural skin, detailed skin:0.5]

~~Negative Prompt~~

worst quality, low quality, normal quality, messy drawing, amateur drawing, lowres, bad anatomy, bad hands, source_furry, source_pony, source_cartoon, comic, source filmmaker, 3d, blurry, cropped

Tips

Adjusting Style Factor

~~If you are getting images that have a strong anime-style influence, you need to add more photographic factor.~~

If you are getting artifacts and corruption in strongly photographic images, you need to cut down the photographic factor. Simply adding and over-emphasizing photographic-style prompts does not work well either.

Lora Block Weight

~~Install~~ ~~https://github.com/hako-mikan/sd-webui-lora-block-weight~~ .

~~A LoRA sometimes has a strong style effect, and you may want to limit that effect.~~

~~Use Block Weight to apply only the blocks needed to achieve your character, pose, or clothing.~~

~~If you have never used Block Weight, here are some ideas to start.~~

~~Set BASE and MID to 0.~~
~~Set BASE, IN04, IN05, IN07, and MID to 0.~~
~~Set BASE, IN04, IN05, IN07, MID, OUT03, OUT04, OUT05 to 0.~~
~~Try other combinations.~~

Prompt Editing / LoRA start, stop, step

Some prompts and LoRAs give a strong anime style. You can use prompt editing to confine them to the early steps of image generation. This reduces the anime-style effect, but loses the fine-grained detail of the prompt. Read ~~https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features~~ ~~for Prompt Editing. Read~~ ~~https://github.com/hako-mikan/sd-webui-lora-block-weight~~ ~~for LoRA start, stop step.~~

[<anime_related_prompt>::0.3]
<lora:anime_related_lora:1:stop=18>

Some prompts and LoRAs give a strong photo style but produce artifact objects in the image. You can use prompt editing to confine them to the late steps of image generation. This reduces unwanted objects in the image, but loses the photographic effect in the early steps of image generation.

[:photo:0.5]
<lora:photo_related_lora:1:start=8>
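
To make the numbers in these examples concrete, here is a small sketch of how a prompt-editing value maps to a sampling step. It assumes the web UI rule as I understand it: a value below 1 is a fraction of the total steps, and a value of 1 or more is an absolute step number capped at the total. The function name is hypothetical, and the `start=` / `stop=` LoRA arguments are handled by the block weight extension rather than by this rule.

```python
def switch_step(when: float, steps: int) -> int:
    """Return the sampling step at which a prompt-editing switch takes effect."""
    if when <= 0:
        raise ValueError("'when' must be positive")
    if steps <= 0:
        raise ValueError("'steps' must be positive")
    # Values below 1 are read as a fraction of the run; larger values as a step index.
    step = when * steps if when < 1 else when
    return min(steps, int(step))


if __name__ == "__main__":
    # With the 20 steps suggested on this page, show where each example value lands.
    for value in (0.5, 8, 18):
        print(value, "->", switch_step(value, 20))
```

Under this reading, `[:photo:0.5]` at 20 steps adds `photo` from the halfway point of sampling onward.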

CD Tuner

~~Photorealistic Pony models tend to give whiter, lighter images.~~

~~You can use CD Tuner to adjust the colors.~~

~~https://github.com/hako-mikan/sd-webui-cd-tuner~~

~~If you have never used CD Tuner, here are some ideas to start.~~

~~Detail 2(d2)+1.5, saturation(sat)+5~~
~~saturation(sat)+10~~
~~Detail 2(d2)+5~~

Source

~~This model is derived from~~

~~Pony Diffusion V6 XL (training base model)~~

~~https://civarchive.com/models/257749/pony-diffusion-v6-xl~~

~~Ebara (training base model and merge base model)~~

~~realPony_JPDoll (merge base model)~~

~~https://civarchive.com/models/420600?modelVersionId=468687~~

Training (v0.3)

~~Photograph files of multiple people (all over 20 at the time the photos were taken) were trained on the PonyDiffusion model to make Lycoris.~~

~~One training run targeted the structural features of photographs.~~

~~One training run targeted the stylistic features of photographs.~~

~~The resulting Lycoris from both runs were merged into the Ebara model.~~

Result (v0.3)

~~The trained model successfully outputs photographic images.~~

Faces in the output images seem to blend all of the trained faces and do not seem to resemble any one particular person in the training set. 100 output images and all training images were uploaded to Google Photos. No output image was recognized as the same person as anyone in the training images.

Problem and Next Step

~~Teeth are crooked. I can try to get photos of people with better aligned teeth.~~

~~Faces are mostly the same. I can try to merge more Pony-derived base models with more variation in faces.~~

~~The mouth is big. The eyes are big. I can try to get photos of people with smaller mouths, and try to merge more Pony-derived base models with smaller mouths.~~

Some prompts and LoRAs give such a strong anime style that this model cannot turn the result into a photographic image. In those cases, you are on your own to come up with a better prompt and to control the LoRA. Refer to the Tips section.

Licenses

~~Refer to the license of other models in the source.~~

~~Also refer to licensing terms and conditions on this page~~

~~For commercial use, refer to my~~ ~~profile~~.

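Without proper prompt adjustment, this model will not output photographic images.

~~This is a photographic SDXL model based on Pony (Ebara).~~

~~Versions 0.1, 0.3, and 0.4 are all feasibility studies and proofs of concept. If I find the time and money, I may be able to make version 1.0.~~

~~Donations are accepted at https://ko-fi.com/eeb_p~~ ~~but I cannot promise to make version 1.0 even if I receive them.~~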

~~(More descriptions about proof of work will be written later.)~~

For Beginners

~~Recommended Setting~~

~~Steps: 24 for normal use, 48 for higher quality~~
~~CFG scale: 3~~
~~Sampler: Euler a~~
~~Size: 896x1152~~
~~Emphasis mode: No norm~~

~~Positive Prompt~~

score_9, score_8_up, score_7_up, best quality, masterpiece, source_anime, [photo, irl, real, realistic, ultrarealistic, photorealistic, natural skin, detailed skin:0.5]

~~Negative Prompt~~

worst quality, low quality, normal quality, messy drawing, amateur drawing, lowres, bad anatomy, bad hands, source_furry, source_pony, source_cartoon, comic, source filmmaker, 3d, blurry, cropped

Tips

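Adjusting Style

~~If you get anime-style images, add photographic-style prompts.~~

~~If strongly photographic images come out broken or with artifacts, reduce the photographic-style prompts.~~

~~Simply over-emphasizing photographic-style prompts does not work well either.~~

Lora Block Weight

~~Install~~ ~~https://github.com/hako-mikan/sd-webui-lora-block-weight~~ .

~~Some LoRAs carry a strong anime-style influence.~~

~~Use Block Weight so that only the character, pose, or clothing you intend is output.~~

~~If you are unsure how to set it, try the following to start.~~

~~Set BASE and MID to 0.~~
~~Set BASE, IN04, IN05, IN07, and MID to 0.~~
~~Set BASE, IN04, IN05, IN07, MID, OUT03, OUT04, OUT05 to 0.~~
~~Try other combinations.~~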

Prompt Editing / LoRA start, stop, step

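~~Some prompts and LoRAs carry a strong anime-style influence.~~

~~Enabling the prompt or LoRA only in the early steps of image generation limits the style influence.~~

~~In exchange, their influence on fine detail is lost.~~

~~Read the following.~~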

~~https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features~~ ~~for Prompt Editing.~~

~~https://github.com/hako-mikan/sd-webui-lora-block-weight~~ ~~for LoRA start, stop step.~~

[<anime_related_prompt>::0.3]
<lora:anime_related_lora:1:stop=18>

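~~Some prompts and LoRAs carry a strong photographic-style influence and create artifacts.~~

~~Enabling the prompt or LoRA only in the late steps of image generation limits the style influence.~~

~~This prevents unintended objects from being generated.~~

~~However, the photographic influence is lost in the early steps of generation.~~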

[:photo:0.5]
<lora:photo_related_lora:1:start=8>

CD Tuner

~~Images from Pony-based photographic models tend to come out white and washed out.~~

~~You can use CD Tuner to adjust this.~~

~~https://github.com/hako-mikan/sd-webui-cd-tuner~~

~~If you are unsure how to set it, try the following to start.~~

~~Detail 2(d2)+1.5, saturation(sat)+5~~
~~saturation(sat)+10~~
~~Detail 2(d2)+5~~

Versions

~~Version 0.1: proof of concept. (deprecated)~~

~~Version 0.3: Used the Lycoris format for training. (deprecated)~~

~~Version 0.4: Reselected training images and improved randoseru.~~

~~Version 0.8: Merged with other models.~~

Source

~~This model is derived from~~

~~Pony Diffusion V6 XL (training base model)~~

~~https://civarchive.com/models/257749/pony-diffusion-v6-xl~~

~~Ebara (training base model and merge base model)~~

~~realPony_JPDoll (merge base model)~~

~~https://civarchive.com/models/420600?modelVersionId=468687~~

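Training (v0.3)

~~Photo files of multiple subjects were trained on PonyDiffusion to make Lycoris. All subjects were 20 or older at the time the photos were taken.~~

~~150 photos related to randoseru and bloomers. 4386 photos of Japanese women. The randoseru and bloomers photos were given a higher step count.~~

~~One Lycoris was trained for photographic character features.~~

~~One Lycoris was trained for photographic style.~~

~~The trained Lycoris were merged into ebara.~~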

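Result (v0.3)

~~This model was able to output photographic images.~~

100 output images and all training images were uploaded to Google Photos, but no person in an output image was treated as the same person as anyone in the training images.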

Licenses

~~Check the licenses of the source models.~~

~~For commercial use, refer to my~~ ~~profile~~.

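How It Was Made (v0.1)

I am not sure whether this will help people making photographic models or other kinds of checkpoint models, but here is how I trained this checkpoint.

Not all of the data survives, so I am checking against the logs that remain, and some details may be wrong.

This is meant less as a recipe to follow exactly and more as a starting point for someone making a model for the first time.

The GPU is a 4070 12GB. The model was made with LoRA training plus merging only, not checkpoint finetuning.

I hope it serves as an example that training by feel can still give decent results.

For v0.3, I made Lycoris instead of LoRA with the same settings.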

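Preparing Training Images

The training set is 9480 photos.

Every photo shows exactly one woman. The subjects are multiple different people.

Captions were generated with WD14 captioning, with 1girl and solo removed.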
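
Removing tags such as `1girl` and `solo` from WD14 captions can be scripted. The sketch below is one possible way to do it, assuming comma-separated captions stored in `.txt` files next to the images (the commands on this page use `--caption_extension ".txt"`); the function names and the folder argument are hypothetical.

```python
from pathlib import Path


def strip_tags(caption: str, drop=("1girl", "solo")) -> str:
    """Remove the given tags from a comma-separated caption string."""
    tags = [tag.strip() for tag in caption.split(",")]
    kept = [tag for tag in tags if tag and tag not in drop]
    return ", ".join(kept)


def rewrite_captions(folder: str, drop=("1girl", "solo")) -> int:
    """Rewrite every .txt caption under folder in place; return the file count."""
    count = 0
    for path in Path(folder).rglob("*.txt"):
        cleaned = strip_tags(path.read_text(encoding="utf-8"), drop)
        path.write_text(cleaned, encoding="utf-8")
        count += 1
    return count


if __name__ == "__main__":
    print(strip_tags("1girl, solo, long hair, smile"))
```

Running `rewrite_captions` overwrites files, so it is worth trying on a copy of the dataset first.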

Character Training

I used the kohya_lora_gui-1.9.0.1 preset SDXL(PonyV6XL).xmlora as is.

For reference, the commands are below.

LoRA

accelerate launch --num_cpu_threads_per_process 1 sdxl_train_network.py --pretrained_model_name_or_path "C:\SDXL\ponyDiffusionV6XL_v6StartWithThisOne.safetensors" --train_data_dir "C:\SDXL\image_study\image_sets" --output_dir "C:\SDXL\image_study\LoRA_out" --network_module "networks.lora" --xformers --gradient_checkpointing --persistent_data_loader_workers --no_metadata --cache_latents --cache_latents_to_disk --max_data_loader_n_workers 1 --enable_bucket --save_model_as "safetensors" --lr_scheduler_num_cycles 4 --mixed_precision "fp16" --learning_rate 0.0001 --resolution 1024 --train_batch_size 2 --max_train_epochs 6 --network_dim 8 --network_alpha 2 --shuffle_caption --keep_tokens 1 --save_every_n_epochs 1 --optimizer_type "Lion" --lr_warmup_steps 100 --output_name "JPGIRL" --vae "C:\SDXL\sdxl_vae.safetensors" --save_precision "fp16" --lr_scheduler "cosine_with_restarts" --min_bucket_reso 512 --max_bucket_reso 2048 --caption_extension ".txt" --seed 42 --no_half_vae

Lycoris

accelerate launch --num_cpu_threads_per_process 1 sdxl_train_network.py --pretrained_model_name_or_path "C:\SDXL\ponyDiffusionV6XL_v6StartWithThisOne.safetensors" --train_data_dir "C:\SDXL\image_study\image_sets" --output_dir "C:\SDXL\image_study\LoRA_out" --network_module "lycoris.kohya" --network_args "algo=lora" --xformers --gradient_checkpointing --persistent_data_loader_workers --no_metadata --cache_latents --cache_latents_to_disk --max_data_loader_n_workers 1 --enable_bucket --save_model_as "safetensors" --lr_scheduler_num_cycles 4 --mixed_precision "fp16" --learning_rate 0.0001 --resolution 1024 --train_batch_size 2 --max_train_epochs 1 --network_dim 8 --network_alpha 2 --shuffle_caption --keep_tokens 1 --save_every_n_epochs 1 --optimizer_type "Lion" --lr_warmup_steps 100 --output_name "RBa0CharLycoris" --vae "C:\SDXL\stable-diffusion-webui-forge\models\VAE\sdxl_vae.safetensors" --save_precision "fp16" --lr_scheduler "cosine_with_restarts" --min_bucket_reso 512 --max_bucket_reso 2048 --caption_extension ".txt" --seed 42 --no_half_vae

With this setup, I added and removed some training images until the faces suited my taste, and made two LoRA variants.
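
For a sense of scale, the optimizer step count implied by the character-training commands can be estimated from the image count, batch size, and epochs. This is only a back-of-the-envelope sketch: it assumes one repeat per image (the repeat count is not stated on this page), ignores the partial batches that bucketing can create, and uses the 9480-image figure even though the set was adjusted between runs.

```python
import math


def total_steps(num_images: int, batch_size: int, epochs: int, repeats: int = 1) -> int:
    """Estimate optimizer steps as ceil(images * repeats / batch) * epochs."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


if __name__ == "__main__":
    # LoRA command: --train_batch_size 2 --max_train_epochs 6
    print("LoRA:", total_steps(9480, batch_size=2, epochs=6))
    # Lycoris command: --train_batch_size 2 --max_train_epochs 1
    print("Lycoris:", total_steps(9480, batch_size=2, epochs=1))
```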

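Style Training

I slightly modified the kohya_lora_gui-1.9.0.1 preset SDXL画風.xmlora. My thinking was that a higher network dimension (DIM) would make skin texture easier to reproduce.

Network dimension: 64

For reference, the commands are below.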

LoRA

accelerate launch --num_cpu_threads_per_process 1 sdxl_train_network.py --pretrained_model_name_or_path "C:\SDXL\ponyDiffusionV6XL_v6StartWithThisOne.safetensors" --train_data_dir "C:\SDXL\image_study\image_sets" --output_dir "C:\SDXL\image_study\LoRA_out" --network_module "networks.lora" --network_args "conv_dim=4" "conv_alpha=1" --xformers --gradient_checkpointing --persistent_data_loader_workers --cache_latents --cache_latents_to_disk --max_data_loader_n_workers 1 --enable_bucket --save_model_as "safetensors" --lr_scheduler_num_cycles 4 --mixed_precision "fp16" --learning_rate 0.0001 --resolution 1024 --train_batch_size 1 --max_train_epochs 8 --network_dim 64 --network_alpha 3 --shuffle_caption --save_every_n_epochs 1 --optimizer_type "AdamW8bit" --lr_warmup_steps 250 --output_name "PonyPhotoA" --save_precision "fp16" --lr_scheduler "cosine_with_restarts" --min_bucket_reso 320 --max_bucket_reso 1536 --caption_extension ".txt" --seed 42 --network_train_unet_only --noise_offset 0.1

Lycoris (trained with ebara as the base)

accelerate launch --num_cpu_threads_per_process 1 sdxl_train_network.py --pretrained_model_name_or_path "C:\SDXL\ebara_pony_1.bakedVAE.safetensors" --train_data_dir "C:\SDXL\image_study\PhotoRealistic" --output_dir "C:\SDXL\image_study\LoRA_out" --network_module "lycoris.kohya" --network_args "algo=lora" "conv_dim=4" "conv_alpha=1" --xformers --gradient_checkpointing --persistent_data_loader_workers --cache_latents --cache_latents_to_disk --max_data_loader_n_workers 1 --enable_bucket --save_model_as "safetensors" --lr_scheduler_num_cycles 4 --mixed_precision "fp16" --learning_rate 0.0001 --resolution 1024 --train_batch_size 1 --max_train_epochs 5 --network_dim 64 --network_alpha 3 --shuffle_caption --save_every_n_epochs 1 --optimizer_type "AdamW8bit" --lr_warmup_steps 250 --output_name "PonyPhotoFull" --save_precision "fp16" --lr_scheduler "cosine_with_restarts" --min_bucket_reso 320 --max_bucket_reso 1536 --caption_extension ".txt" --seed 42 --network_train_unet_only --noise_offset 0.1

This trains skin texture and similar qualities.
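
One detail worth noting when comparing the two presets: in kohya-style LoRA training, the learned update is scaled by `network_alpha / network_dim`, so raising the dimension without raising alpha lowers the effective scale. The sketch below just computes that ratio for the values used in the commands on this page; the function name is hypothetical.

```python
def lora_scale(network_alpha: float, network_dim: int) -> float:
    """Return the multiplier applied to the LoRA update (alpha / dim)."""
    if network_dim <= 0:
        raise ValueError("network_dim must be positive")
    return network_alpha / network_dim


if __name__ == "__main__":
    # Character training: --network_dim 8 --network_alpha 2
    print("character:", lora_scale(2, 8))
    # Style training: --network_dim 64 --network_alpha 3
    print("style:", lora_scale(3, 64))
```

How much this scale matters in practice also depends on the learning rate and optimizer, so treat it as one factor rather than a tuning rule.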

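LoRA Merge

Next, look for weights that blend the character LoRAs (JPGIRL.safetensors, JPGIRL2.safetensors) and the style LoRA (PonyPhotoA.safetensors) in a good balance. 0.34, 0.3, and 0.3 turned out to look good.

I then merged the LoRAs into ebara at these ratios.

As of March 2024, LoRA merging has only worked well for me with sd-scripts, so run the following commands in the sd-scripts directory.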

python ./networks/sdxl_merge_lora.py --save_precision fp16 --sd_model=ebara_pony_1.bakedVAE.safetensors --save_to RunBull_a0.safetensors --models JPGIRL.safetensors --ratios 0.34
python ./networks/sdxl_merge_lora.py --save_precision fp16 --sd_model=RunBull_a0.safetensors --save_to RunBull_a1.safetensors --models JPGIRL2.safetensors --ratios 0.3
python ./networks/sdxl_merge_lora.py --save_precision fp16 --sd_model=RunBull_a1.safetensors --save_to RunBull_a3.safetensors --models PonyPhotoA.safetensors --ratios 0.3
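
The same merge could probably be expressed as a single call, which avoids intermediate checkpoint files. The sketch below only builds the argument list and does not run it. It assumes that `sdxl_merge_lora.py` accepts several values after `--models` and `--ratios`; please confirm that against your sd-scripts version. The helper name and the output file name are hypothetical, and a one-pass merge may differ slightly from the step-by-step result because of fp16 rounding.

```python
def build_merge_command(base: str, out: str, loras: dict) -> list:
    """Build an argv list that merges several LoRAs into one checkpoint."""
    if not loras:
        raise ValueError("at least one LoRA is required")
    cmd = [
        "python", "./networks/sdxl_merge_lora.py",
        "--save_precision", "fp16",
        "--sd_model", base,
        "--save_to", out,
        "--models", *loras.keys(),
        # Ratios are passed in the same order as the model files.
        "--ratios", *(str(ratio) for ratio in loras.values()),
    ]
    return cmd


if __name__ == "__main__":
    command = build_merge_command(
        "ebara_pony_1.bakedVAE.safetensors",
        "RunBull_merged.safetensors",  # hypothetical output name
        {
            "JPGIRL.safetensors": 0.34,
            "JPGIRL2.safetensors": 0.3,
            "PonyPhotoA.safetensors": 0.3,
        },
    )
    print(" ".join(command))
```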

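Bonus: Checkpoint Adjustment (v0.1 only)

The finished checkpoint was somewhat lacking in fine detail and looked white and hazy overall, so I applied OUT +1 with Supermerger's Adjust.

Adjusting in the order OUT → Contrast → Brightness seems to work well.

I added sdxl_vae.safetensors as the VAE.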


Files

runbullxlPonyBased_v04.safetensors
