You can try the workflow online through the link below. If you like the results, you can then deploy it locally.
https://www.runninghub.ai/post/1968294270253838337/?inviteCode=sdhs0trb
Fan benefits: register to get 1000 points, earn 100 points for each daily login, and run workflows on a 4090 with 48 GB of VRAM.
https://buymeacoffee.com/a592991299o
This workflow clones a speaker's voice and emotion, and can generate expressive audio for a single speaker or for a two-person conversation. It sounds more natural than earlier models, which tended to produce stiff vocals, so it is strongly recommended. Deploying it in ComfyUI is relatively difficult. First, the transformers package must be version 4.51.0. Second, make sure the json5 module is installed.
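The two requirements above can be checked before loading the workflow. The following is a minimal sketch, assuming it is run with the same Python interpreter that ComfyUI uses; the function name check_environment is hypothetical and not part of the project.

```python
from importlib import metadata, util

# Version stated in this post; adjust if the project page says otherwise.
REQUIRED_TRANSFORMERS = "4.51.0"


def check_environment(required=REQUIRED_TRANSFORMERS):
    """Return a list of human-readable problems (empty if everything matches)."""
    problems = []

    # Look up the installed transformers version without importing the package.
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        installed = None

    if installed is None:
        problems.append("transformers is not installed")
    elif installed != required:
        problems.append(f"transformers is {installed}, expected {required}")

    # json5 only needs to be importable; no specific version is stated.
    if util.find_spec("json5") is None:
        problems.append("json5 is not installed")

    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("PROBLEM:", issue)
    if not issues:
        print("Environment matches the stated requirements.")
```

If problems are reported, running pip install transformers==4.51.0 json5 in that same environment is the usual fix.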
Project page: https://github.com/billwuhao/ComfyUI_IndexTTS
Model download links:
https://hf-mirror.com/nvidia/bigvgan_v2_22khz_80band_256x/tree/main
https://hf-mirror.com/funasr/campplus/tree/main
https://hf-mirror.com/IndexTeam/IndexTTS-2/tree/main
https://hf-mirror.com/amphion/MaskGCT/tree/main/semantic_codec
https://hf-mirror.com/facebook/w2v-bert-2.0/tree/main
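Instead of downloading each file by hand, the five repositories above could be fetched with the huggingface_hub package. This is only a sketch under assumptions: huggingface_hub must be installed, models_root is a hypothetical placeholder for wherever your ComfyUI install expects these folders (check the project page), and the file filters simply mirror the files named in this post.

```python
import os

# Mirror used by the links in this post.
MIRROR = "https://hf-mirror.com"

# (repo id, local folder name, files to fetch; None means the whole repo)
MODELS = [
    ("nvidia/bigvgan_v2_22khz_80band_256x", "bigvgan_v2_22khz_80band_256x",
     ["bigvgan_generator.pt", "config.json"]),
    ("funasr/campplus", "campplus", ["campplus_cn_common.bin"]),
    ("IndexTeam/IndexTTS-2", "IndexTTS-2", None),
    ("amphion/MaskGCT", "MaskGCT", ["semantic_codec/*"]),
    ("facebook/w2v-bert-2.0", "w2v-bert-2.0", None),
]


def download_all(models_root):
    """Download every repo in MODELS into models_root/<folder name>."""
    # HF_ENDPOINT is read when huggingface_hub is first imported,
    # so it is set here, before the import below.
    os.environ.setdefault("HF_ENDPOINT", MIRROR)
    from huggingface_hub import snapshot_download

    for repo_id, folder, patterns in MODELS:
        snapshot_download(
            repo_id=repo_id,
            local_dir=os.path.join(models_root, folder),
            allow_patterns=patterns,
        )


if __name__ == "__main__":
    # Hypothetical path: replace with your own model directory.
    download_all("path/to/your/model/folder")
```

The MaskGCT entry is restricted to the semantic_codec subfolder because that is the only part linked above.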
Model placement structure:
- bigvgan_v2_22khz_80band_256x
    bigvgan_generator.pt
    config.json
- campplus
    campplus_cn_common.bin
- IndexTTS-2
    .gitattributes
    bpe.model
    config.yaml
    feat1.pt
    feat2.pt
    gpt.pth
    README.md
    s2mel.pth
    wav2vec2bert_stats.pt
    qwen0.6bemo4-merge/
        added_tokens.json
        chat_template.jinja
        config.json
        generation_config.json
        merges.txt
        model.safetensors
        Modelfile
        special_tokens_map.json
        tokenizer.json
        tokenizer_config.json
        vocab.json
- MaskGCT
    semantic_codec/
        model.safetensors
- w2v-bert-2.0
    .gitattributes
    config.json
    conformer_shaw.pt
    model.safetensors
    preprocessor_config.json
    README.md
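
A misplaced or missing file is a common cause of load failures, so it can help to check the folder against the placement structure above. The sketch below is illustrative only: the function name missing_files and the models_root argument are hypothetical, and the list covers the model and config files from the structure while skipping README.md and .gitattributes.

```python
from pathlib import Path

# Relative paths taken from the placement structure in this post.
EXPECTED = [
    "bigvgan_v2_22khz_80band_256x/bigvgan_generator.pt",
    "bigvgan_v2_22khz_80band_256x/config.json",
    "campplus/campplus_cn_common.bin",
    "IndexTTS-2/bpe.model",
    "IndexTTS-2/config.yaml",
    "IndexTTS-2/feat1.pt",
    "IndexTTS-2/feat2.pt",
    "IndexTTS-2/gpt.pth",
    "IndexTTS-2/s2mel.pth",
    "IndexTTS-2/wav2vec2bert_stats.pt",
    "IndexTTS-2/qwen0.6bemo4-merge/config.json",
    "IndexTTS-2/qwen0.6bemo4-merge/model.safetensors",
    "IndexTTS-2/qwen0.6bemo4-merge/tokenizer.json",
    "MaskGCT/semantic_codec/model.safetensors",
    "w2v-bert-2.0/config.json",
    "w2v-bert-2.0/conformer_shaw.pt",
    "w2v-bert-2.0/model.safetensors",
    "w2v-bert-2.0/preprocessor_config.json",
]


def missing_files(models_root):
    """Return the expected relative paths that are not regular files under models_root."""
    root = Path(models_root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]


if __name__ == "__main__":
    # Hypothetical path: replace with the folder that holds the five model folders.
    for rel in missing_files("path/to/your/model/folder"):
        print("missing:", rel)
```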