Please read the relevant version info on the right side.
I think some experimentation is still needed with the generation workflow, especially on version 2.0.
If you get good results, please share your settings.
This model was trained for 4000 steps with the provided musubi training files, which took about 5 hours on a 5090 and used around 25 GB of VRAM. I used 56x145 frame clips at 24 seconds.
I got these by running a script that cuts longer videos into clips (it's attached to the files if you want to use it: put it in the folder with the long-form videos, run it with the path to the output dir, and you get the cuts).
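If you'd rather roll your own splitter, here is a minimal sketch of the general idea. It is not the attached script. It assumes `ffmpeg` is on your PATH and that you already know the video's duration in seconds; the function names, the output naming scheme, and the libx264/aac encoder choice are all made up for illustration.

```python
import subprocess
from pathlib import Path


def clip_starts(duration_s: float, clip_s: float) -> list[float]:
    """Start times of every full-length clip that fits in the video."""
    count = int(duration_s // clip_s)  # a leftover tail shorter than clip_s is dropped
    return [i * clip_s for i in range(count)]


def ffmpeg_cut_command(src: Path, dst: Path, start_s: float, clip_s: float) -> list[str]:
    """Build an ffmpeg command that re-encodes one clip of clip_s seconds."""
    return [
        "ffmpeg", "-y",
        "-ss", f"{start_s:.3f}",  # seek to the clip start
        "-i", str(src),
        "-t", f"{clip_s:.3f}",    # clip duration
        "-c:v", "libx264", "-c:a", "aac",  # assumed encoders, swap as needed
        str(dst),
    ]


def split_video(src: Path, out_dir: Path, duration_s: float, clip_s: float) -> list[Path]:
    """Cut src into consecutive clips and return the paths that were written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for i, start in enumerate(clip_starts(duration_s, clip_s)):
        dst = out_dir / f"{src.stem}_{i:03d}.mp4"  # hypothetical naming scheme
        subprocess.run(ffmpeg_cut_command(src, dst, start, clip_s), check=True)
        outputs.append(dst)
    return outputs
```

For clips with a fixed frame count, `clip_s` would be the frame count divided by the source frame rate. Re-encoding is slower than stream copy but gives cuts that start exactly where you ask rather than at the nearest keyframe.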
I've included the training files in the training data. I used this to train: https://github.com/AkaneTendo25/musubi-tuner/tree/ltx-2
(Some comments have stated that the ltx-2-dev branch has more bug fixes, so I'm going to try that next.)
I definitely feel it's undercooked, but the loss hasn't gone down much over the last 1500 steps, so I won't push this run any further. Specifically, I think rank 8 is a bit too low for it to learn the motion, and a learning rate of 1e-4 might be a bit small too. I'm going to try Prodigy with rank 16 for the next iteration. Regardless, I absolutely love the quality of the generated image, and the sound seems to have trained remarkably well.
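For a rough sense of what changing the rank buys you, here is a small sketch of how LoRA capacity scales. For one adapted linear layer, LoRA trains two matrices, A (rank x d_in) and B (d_out x rank), so the trainable parameter count grows linearly with rank. The 4096 x 4096 layer size below is a made-up example, not an actual LTX-2 dimension.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical 4096 x 4096 projection, for illustration only.
    for rank in (8, 16, 64):
        print(f"rank {rank}: {lora_param_count(4096, 4096, rank)} params per layer")
```

Going from rank 8 to rank 16 doubles the adapter size per layer, and rank 64 is 8x rank 8. Whether that extra capacity actually helps with motion is something only a training run can show.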
The workflows for both the starting image and the i2v are my own concoction and are attached to the image / video.
If you have any suggestions on the training / workflows feel free to share.
Hope I didn't miss anything.
Peace!
Description
Changes in training methodology compared to v1:
5000 steps
rank 16
512x768 res instead of 512x512
slightly extended dataset, but I don't think that's very relevant
Honestly, I think the LoRA itself might actually be fine and the video gen settings are what's holding it back. I've noticed that disabling the 50% scale node and generating everything directly in full res works a lot better at rendering the genital anatomy correctly.
I also did a test using the phr00t finetune of LTX, and that seems to work well in i2v as long as you apply a strength of -0.2 to -0.4 to the distill LoRA (otherwise it butchers the starting image).
You can download the phr00t version here: https://huggingface.co/Phr00t/LTX2-Rapid-Merges/resolve/main/nsfw/ltx2-phr00tmerge-nsfw-v62.safetensors?download=true
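To illustrate what a negative strength on the distill LoRA does, here is a toy sketch of the usual LoRA merge, W' = W + strength * (alpha / rank) * (B @ A). It assumes the loader treats strength as a plain scalar multiplier on the LoRA delta, which I haven't verified for every node. The tiny matrices are placeholders, not real weights.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_b, lora_a, strength, alpha, rank):
    """Return weight + strength * (alpha / rank) * (B @ A)."""
    delta = matmul(lora_b, lora_a)
    scale = strength * alpha / rank
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]  # toy 2x2 weight
    lora_b = [[1.0], [2.0]]          # B: d_out x rank, with rank 1
    lora_a = [[3.0, 4.0]]            # A: rank x d_in
    # A negative strength subtracts a fraction of the delta instead of adding it.
    print(apply_lora(base, lora_b, lora_a, strength=-0.3, alpha=1.0, rank=1))
```

Under that assumption, a strength between -0.2 and -0.4 pulls the weights part of the way back from whatever the distill LoRA contributes, which would fit with the starting image surviving better.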
All the videos are i2v and have the workflows attached to them so you can download them and see which is which.
Going to try and do a rank 64 version of it with the same settings plus some block swap to see if that helps it learn the vagina concept better.
I think some more experimentation is needed to figure out the right gen settings, so if you find settings that give good results, do share.