I was trying to train a LoRA of my Au Ra character from FF14 using screenshots from the game. It gets the horns and scales right, but the game's 3D style bleeds into the generations. Does anyone have an idea of how to fix that?
These are some of the settings:
pretrained_model_name_or_path = "./sd-models/latest.ckpt"
resolution = "??0,??0"
enable_bucket = true
min_bucket_reso = 256
max_bucket_reso = 1024
save_every_n_epochs = 1
max_train_epochs = 10
train_batch_size = 2
network_train_unet_only = false
network_train_text_encoder_only = false
learning_rate = 0.0001
unet_lr = 3.5e-5
text_encoder_lr = 3e-6
lr_scheduler = "cosine_with_restarts"
optimizer_type = "Lion"
lr_scheduler_num_cycles = 1
network_module = "networks.lora"
network_dim = 32
network_alpha = 32
shuffle_caption = true
weighted_captions = false
keep_tokens = 1
max_token_length = 255
seed = 1337
prior_loss_weight = 1
clip_skip = 2
gradient_checkpointing = false
mixed_precision = "fp16"
save_precision = "fp16"
xformers = true
lowram = false
cache_latents = true
cache_latents_to_disk = false
persistent_data_loader_workers = true
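As a rough reading aid for the settings above, here is a small Python sketch. It assumes the usual kohya-ss sd-scripts behaviour as I understand it: the LoRA output is scaled by network_alpha / network_dim, and unet_lr / text_encoder_lr take the place of learning_rate for their own modules when they are set. The helper names (effective_lrs, lora_scale) are made up for illustration and are not part of the trainer.

```python
# Values copied from the settings above.
settings = {
    "learning_rate": 1e-4,
    "unet_lr": 3.5e-5,
    "text_encoder_lr": 3e-6,
    "network_dim": 32,
    "network_alpha": 32,
}


def effective_lrs(cfg):
    """Return (unet_lr, text_encoder_lr).

    Assumption: a per-module rate, when set, is used instead of
    learning_rate for that module; otherwise learning_rate applies.
    """
    base = cfg["learning_rate"]
    unet = cfg.get("unet_lr") or base
    text_encoder = cfg.get("text_encoder_lr") or base
    return unet, text_encoder


def lora_scale(cfg):
    """Multiplier applied to the LoRA update: alpha / dim."""
    return cfg["network_alpha"] / cfg["network_dim"]


unet_lr, te_lr = effective_lrs(settings)
print("U-Net lr:", unet_lr)
print("Text encoder lr:", te_lr)
print("LoRA scale:", lora_scale(settings))
```

With dim and alpha both at 32 the scale is 1.0, so the learning rates listed above are not damped any further by alpha.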
Description
I set the repeat count to 6 for the body data and 10 for the head data.
AOM3 is used as the base model.
It outputs good horns and scales, but the 3D feel still somewhat affects the output.
I also cut the trigger word down to just Ares.
Please add covered_ears to the prompt to keep human ears from popping out.
Hands are still an issue because the in-game hands are large.
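To see how the repeat counts (6 for body, 10 for head) interact with train_batch_size = 2 and max_train_epochs = 10, here is a rough step estimate in Python. The image counts are hypothetical placeholders, since the real dataset sizes are not stated, and the helper name steps_per_epoch is made up for this sketch. With enable_bucket on, batches are formed per bucket, so the real step count can come out a little higher than this.

```python
import math


def steps_per_epoch(subsets, batch_size):
    """Estimate optimizer steps in one epoch.

    subsets: list of (image_count, repeats) pairs, one per data folder.
    """
    samples = sum(count * repeats for count, repeats in subsets)
    return math.ceil(samples / batch_size)


# Hypothetical image counts; only the repeats (6 and 10) come from the description.
subsets = [
    (40, 6),   # body data: 40 images (assumed) x 6 repeats
    (20, 10),  # head data: 20 images (assumed) x 10 repeats
]

batch_size = 2   # train_batch_size from the settings
epochs = 10      # max_train_epochs from the settings

per_epoch = steps_per_epoch(subsets, batch_size)
print("steps per epoch:", per_epoch)
print("total steps:", per_epoch * epochs)
```

Swapping in the real image counts shows how much weight the head folder gets relative to the body folder, which matters when in-game features such as the large hands come mostly from one of the two.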