It can better capture the characteristics of male growls and correct the electronic drum pad sound in the base model.
custom_tag: Technical Death Metal/Progressive Death Metal/Symphonic Metal/Symphonic Technical Death Metal
Using the sidestep script for training, the caption length was strictly controlled to avoid truncation. The truncation length of the lyrics was modified to be consistent with the 2048 length used during inference and the lyrics were formatted strictly. In each epoch, 2% of the time, the genre tag was used to replace the caption instead of genre replacement based on a single sample. However, even with acestep V15, this could only achieve this level of improvement, and it still couldn't correct the stereotype that the presence of human voices reduces the complexity of the music.
based on ace-step SFT