Sir David Attenborough (Voice Model) for Retrieval based Voice Conversion
See Here: https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/docs/README.en.md
To generate TTS with an RVC model, you can use this app: https://github.com/litagin02/rvc-tts-webui
In that app, set Index rate to 0 and Protect to 0.5, use the Edge TTS voice "en-GB-RyanNeural-Male", and set Transpose to -5.
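The settings above can be written down as a small Python sketch. The dictionary keys and helper names are illustrative and are not rvc-tts-webui's own parameter names. It assumes that the "-Male" suffix is a label added by the web UI and that the underlying Edge TTS voice is "en-GB-RyanNeural", and that Transpose is measured in semitones. The `synthesize` helper only produces the plain Edge TTS audio with the third-party `edge-tts` package; the RVC conversion step itself is not shown.

```python
import asyncio

# Values taken from the settings line above. Key names are hypothetical.
RVC_TTS_SETTINGS = {
    "index_rate": 0.0,
    "protect": 0.5,
    "transpose": -5,  # assumed to be semitones
    "edge_tts_voice": "en-GB-RyanNeural-Male",  # label as shown in the web UI
}


def edge_voice_name(ui_label: str) -> str:
    """Strip a trailing gender suffix (assumed to be added by the web UI)."""
    for suffix in ("-Male", "-Female"):
        if ui_label.endswith(suffix):
            return ui_label[: -len(suffix)]
    return ui_label


def transpose_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift given in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


async def synthesize(text: str, out_path: str) -> None:
    """Write Edge TTS speech to out_path (needs network access)."""
    import edge_tts  # third-party: pip install edge-tts

    voice = edge_voice_name(RVC_TTS_SETTINGS["edge_tts_voice"])
    await edge_tts.Communicate(text, voice).save(out_path)


if __name__ == "__main__":
    print(edge_voice_name(RVC_TTS_SETTINGS["edge_tts_voice"]))
    print(round(transpose_ratio(RVC_TTS_SETTINGS["transpose"]), 4))
```

To produce the base speech file you could call `asyncio.run(synthesize("Hello there.", "speech.mp3"))` and then feed the result to RVC with the settings above. A Transpose of -5 lowers the pitch to roughly three quarters of the original frequency.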
Example TTS: https://voca.ro/1cH64bMFS5dV
Example Live Speech: https://voca.ro/1cOAl5uRIEOt
Example Singing Version:
Acapella Sample of Apashe - Dies Irae https://voca.ro/15xjQ44knEac
Loli God Attenborough (a difficult voice, but I was able to do it) https://voca.ro/1dgVhQTADHYy
2.0 RVC TTS Notes
This is the final version of 2.0 RVC for TTS and Singing.
Training Info
48K Hz
1000 Epoch, 12,000 steps
Pitch Guidance: False
Model architecture version v2
Fp16
316 hours of audio.
2.0 RVC Singing + TTS Notes
This is the final version of 2.0 RVC for TTS and Singing.
Training Info
48K Hz
1000 Epoch, 12,000 steps
Pitch Guidance: True
Model architecture version v2
Fp16
200 hours of audio
1.0 Notes
Training Info
40K Hz
200 Epoch
Pitch Guidance: True
Model architecture version v2
Fp16
4 hours of audio
8 hours of my time
Comments (7)
Example: https://voca.ro/1iVrdRlHb9Mf
YOASOBI - Idol sung by Sir David Attenborough
Azumi Waki & Sir David Attenborough sing the opening to Kuma Kuma Bear
What is the current best platform being used for speech synthesis?
The home-based solution is what I am using right now: https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI. Training takes forever, but it lets you train on other people's speech, replace voices, make them sing, and so on.
The fast and simple option is https://beta.elevenlabs.io/speech-synthesis. There is a free tier, but it is limited; you can upgrade for $5 and have more fun with it. It only does TTS with variations in voice, though it does offer instant voice cloning.
Alucard from Hellsing :)
A video tutorial (no commentary needed, just a recording of what you click and where) would be appreciated. That Chinese tutorial video is... crude :)
I learned from this https://youtu.be/-JcvdDErkAU
If you don't understand that video, I will make a video for you if you really need one.
There is one part where it says "Path to Feature index file(If null, use dropdown result):". Use the full path to the index file there, for example C:\Downloads\trained_IVF711_Flat_nprobe_1_SirDavidAttenborough_v2.index
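If you are not sure what the full path is, a short Python sketch like the one below can list it for you. The function name `find_index_files` is made up for this example, and the folder you pass in is whatever folder you downloaded or extracted the model into.

```python
from pathlib import Path


def find_index_files(folder):
    """Return the absolute paths of all .index files under folder, sorted."""
    return sorted(str(p.resolve()) for p in Path(folder).rglob("*.index"))


if __name__ == "__main__":
    # Hypothetical location; replace it with your own download folder.
    for path in find_index_files(Path.home() / "Downloads"):
        print(path)
```

Copy one of the printed paths into the "Path to Feature index file" box.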
@cyberofficial ohhh, nerdy video! I thought you were talking about the weird Chinese one that I somehow found via the project page. OK, I will follow Nerdy then :)
It's been a week since my message, so I will ask: has anything changed over the past days? I still see that your upload is the only one created this way.
2.0 Coming soon!
After many hours of configuration and retraining, I will be releasing a newer model. (I'm getting close to finishing the training. I was letting it train for over 2 weeks, but I'm posting a small update now because I'm excited for the release.)
I will have RVC and ONNX versions.
New Model 2.0 Specs:
RVC
- 48K hz audio
- 1000 Epoch
- Pitch Guidance: Switchable(0/1)
- RVC Model architecture version v2
- Fp16
- 200 hours of audio
For RVC there will be 2 IDs in the model.
ID: 0 - This will be the vocal pitch guidance mode (meant for singing). This is the default one.
ID: 1 - This will be the non-singing model (meant for speech).
Update: I decided the pitch and non-pitch models will be separate files. That is easier to manage for apps that don't have IDs implemented.
V2 48 kHz 100 Epoch Speech (non-singing model)
- At this training checkpoint, the model already works well for basic speech and real-time voice conversion. This is me speaking through an RVC voice app using my model at checkpoint 100. Later versions are greatly improved, but this one is better suited as a teaser.