Turning this into a project:
4.1.25 - added Wan T2V model - 14b
I'm no expert and am still learning ComfyUI, but please feel free to ask any questions and I will answer to the best of my ability.
Colleen A
Age when shot: 19
Eye color: blue
Hair color: brown
Height: 160cm
Weight: 51kg
Breast size: small
Measurements: 79/56/79
Country: Russian Federation
Ethnicity: Caucasian
Description
Wan T2V 14B. I will add more samples, but they take forever to generate.
Typical settings: 512x512, 20 steps, and a length of 48 at the lowest.
[2025-04-01 18:21:03,533] [INFO] [logging.py:107:log_dist] [Rank 0] step=1920, skipped=0, lr=[2e-05], mom=[0.0]
steps: 1920 loss: 0.0964 iter time (s): 3.341 samples/sec: 1.197
Saving model to directory epoch120
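The log lines above already imply a few of the numbers people tend to ask about. Here is a rough sketch in Python that parses them and does the arithmetic. It assumes that "epoch120" means epoch 120 had just finished at step 1920, that every epoch had the same number of steps, and that the iteration time stayed near 3.341 s for the whole run. None of that is confirmed, and `parse_log` is just a hypothetical helper for illustration, not part of Diffusion-Pipe.

```python
import re

# Log lines copied verbatim from the training output above.
LOG = """\
[2025-04-01 18:21:03,533] [INFO] [logging.py:107:log_dist] [Rank 0] step=1920, skipped=0, lr=[2e-05], mom=[0.0]
steps: 1920 loss: 0.0964 iter time (s): 3.341 samples/sec: 1.197
Saving model to directory epoch120
"""


def parse_log(text):
    """Pull the numeric fields out of the log text (hypothetical helper)."""
    return {
        "step": int(re.search(r"steps: (\d+)", text).group(1)),
        "loss": float(re.search(r"loss: ([\d.]+)", text).group(1)),
        "iter_time": float(re.search(r"iter time \(s\): ([\d.]+)", text).group(1)),
        "samples_per_sec": float(re.search(r"samples/sec: ([\d.]+)", text).group(1)),
        "lr": float(re.search(r"lr=\[([^\]]+)\]", text).group(1)),
        "epoch": int(re.search(r"directory epoch(\d+)", text).group(1)),
    }


stats = parse_log(LOG)

# Assumption: steps are spread evenly across the 120 epochs.
steps_per_epoch = stats["step"] / stats["epoch"]

# Samples consumed per optimizer step = seconds per step * samples per second.
samples_per_step = stats["iter_time"] * stats["samples_per_sec"]

# Samples seen per epoch (dataset size after repeats), under the same assumption.
samples_per_epoch = steps_per_epoch * round(samples_per_step)

# Assumption: iteration time was roughly constant for the whole run.
total_hours = stats["step"] * stats["iter_time"] / 3600

print(f"learning rate:     {stats['lr']}")
print(f"steps per epoch:   {steps_per_epoch:.0f}")
print(f"samples per step:  {samples_per_step:.2f}")
print(f"samples per epoch: {samples_per_epoch:.0f}")
print(f"training time:     {total_hours:.2f} h")
```

Under those assumptions this works out to 16 steps per epoch and about 4 samples per step. How those 4 samples split between batch size and gradient accumulation cannot be read from the log.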
Comments (2)
Amazing work on your Colleen A LoRA for Wan2.1 T2V 14B! I saw you achieved great results.
I'd love to learn from your experience as I'm training a similar LoRA on my RTX 4090 (24GB) using Diffusion-Pipe.
Could you possibly share some specifics of your training config? I'm trying to figure out the optimal settings, especially:
What resolution did you train at?
What LoRA rank did you use (to get the ~300MB size)?
What optimizer (e.g., AdamW8bitKahan?), lr, and LR scheduler settings worked well?
Roughly how many epochs did you train for?
What did you set for gradient_accumulation_steps and num_repeats in the dataset config?
Did you need to use blocks_to_swap, and if so, how many? Was activation_checkpointing enabled?
For captions, did you use detailed descriptions or just a trigger word?
Understanding these key parameters would be a massive help. Thanks for considering!
I trained to step=1920 on the same dataset I used for the Flux model, captioned with JoyCaption instead.
I used this guide and template to start: https://civitai.com/articles/12330/wansdxlflux-all-in-one-lora-training-diffusion-pipe-with-auto-captioning