---
license: cc-by-4.0
language:
- as
- bn
- brx
- doi
- kn
- mai
- ml
- mr
- ne
- pa
- sa
- ta
- te
library_name: transformers
pipeline_tag: text-to-speech
tags:
- text-to-speech
---

# VITS TTS for Indian Languages

This repository contains a VITS-based Text-to-Speech (TTS) model fine-tuned for Indian languages. The model covers 13 Indian languages and a range of speaking styles and emotions, which makes it suitable for use cases such as conversational AI and audiobooks.

---

## Model Overview

The model `ai4bharat/vits_rasa_13` is based on the VITS architecture and supports the following features:

- **Languages**: 13 Indian languages (see Supported Languages).
- **Styles**: Speaking styles and emotions, each selected by a style ID.
- **Speaker IDs**: 20 predefined speaker profiles covering male and female voices.

---

## Installation

```bash
pip install transformers torch soundfile
```

---

## Usage

Here's a quick example to get started:

```python
import soundfile as sf
import torch
from transformers import AutoModel, AutoTokenizer

# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModel.from_pretrained("ai4bharat/vits_rasa_13", trust_remote_code=True).to(device)
tokenizer = AutoTokenizer.from_pretrained("ai4bharat/vits_rasa_13", trust_remote_code=True)

text = "ਕੀ ਮੈਂ ਇਸ ਹਫਤੇ ਦੇ ਅੰਤ ਵਿੱਚ ਰੁੱਝਿਆ ਹੋਇਆ ਹਾਂ?"  # Example text in Punjabi
speaker_id = 16  # PAN_M
style_id = 0  # ALEXA

inputs = tokenizer(text=text, return_tensors="pt").to(device)
outputs = model(inputs["input_ids"], speaker_id=speaker_id, emotion_id=style_id)

# soundfile expects a NumPy array, so move the waveform off the GPU first.
waveform = outputs.waveform.squeeze().detach().cpu().numpy()
sf.write("audio.wav", waveform, model.config.sampling_rate)
print(outputs.waveform.shape)
```

---

## Supported Languages

- `Assamese`
- `Bengali`
- `Bodo`
- `Dogri`
- `Kannada`
- `Maithili`
- `Malayalam`
- `Marathi`
- `Nepali`
- `Punjabi`
- `Sanskrit`
- `Tamil`
- `Telugu`

---

## Speaker-Style Identifier Overview
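
| Speaker Name | Speaker ID |
|--------------|------------|
| ASM_F        | 0          |
| ASM_M        | 1          |
| BEN_F        | 2          |
| BEN_M        | 3          |
| BRX_F        | 4          |
| BRX_M        | 5          |
| DOI_F        | 6          |
| DOI_M        | 7          |
| KAN_F        | 8          |
| KAN_M        | 9          |
| MAI_M        | 10         |
| MAL_F        | 11         |
| MAR_F        | 12         |
| MAR_M        | 13         |
| NEP_F        | 14         |
| PAN_F        | 15         |
| PAN_M        | 16         |
| SAN_M        | 17         |
| TAM_F        | 18         |
| TEL_F        | 19         |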
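
Speaker names follow a `<LANGUAGE>_<GENDER>` pattern, so the available voices for a language can be looked up programmatically. The following is a minimal sketch: the IDs are copied from the speaker table above, while `SPEAKER_IDS` and `speakers_for_language` are hypothetical helper names that are not part of the model's API.

```python
# Speaker IDs copied from the speaker table in this README.
# SPEAKER_IDS and speakers_for_language are illustrative helpers,
# not something the model or the transformers library provides.
SPEAKER_IDS = {
    "ASM_F": 0, "ASM_M": 1, "BEN_F": 2, "BEN_M": 3,
    "BRX_F": 4, "BRX_M": 5, "DOI_F": 6, "DOI_M": 7,
    "KAN_F": 8, "KAN_M": 9, "MAI_M": 10, "MAL_F": 11,
    "MAR_F": 12, "MAR_M": 13, "NEP_F": 14, "PAN_F": 15,
    "PAN_M": 16, "SAN_M": 17, "TAM_F": 18, "TEL_F": 19,
}


def speakers_for_language(prefix):
    """Return {gender: speaker_id} for a language prefix such as "PAN"."""
    prefix = prefix.strip().upper()
    voices = {}
    for name, speaker_id in SPEAKER_IDS.items():
        language, gender = name.split("_")
        if language == prefix:
            voices[gender] = speaker_id
    return voices


print(speakers_for_language("PAN"))  # {'F': 15, 'M': 16}
print(speakers_for_language("TAM"))  # {'F': 18}
```

Some languages have only one voice in the table (for example, Tamil has only a female speaker), so it is worth checking the returned dictionary before assuming both genders exist.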
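
| Style Name | Style ID |
|------------|----------|
| ALEXA      | 0        |
| ANGER      | 1        |
| BB         | 2        |
| BOOK       | 3        |
| CONV       | 4        |
| DIGI       | 5        |
| DISGUST    | 6        |
| FEAR       | 7        |
| HAPPY      | 8        |
| NEWS       | 10       |
| SAD        | 12       |
| SURPRISE   | 14       |
| UMANG      | 15       |
| WIKI       | 16       |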
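
The style IDs are not contiguous (9, 11, and 13 do not appear in the table), so looking a style up by name is safer than guessing an integer. A minimal sketch, assuming only the values listed above: `STYLE_IDS` and `resolve_style` are hypothetical names introduced for illustration and are not part of the model's API.

```python
# Style IDs copied from the style table in this README.
# STYLE_IDS and resolve_style are illustrative helpers only.
STYLE_IDS = {
    "ALEXA": 0, "ANGER": 1, "BB": 2, "BOOK": 3, "CONV": 4,
    "DIGI": 5, "DISGUST": 6, "FEAR": 7, "HAPPY": 8, "NEWS": 10,
    "SAD": 12, "SURPRISE": 14, "UMANG": 15, "WIKI": 16,
}


def resolve_style(name):
    """Map a style name (case-insensitive) to its ID, rejecting unknown names."""
    key = name.strip().upper()
    if key not in STYLE_IDS:
        raise ValueError(
            f"Unknown style {name!r}; expected one of {sorted(STYLE_IDS)}"
        )
    return STYLE_IDS[key]


print(resolve_style("news"))  # 10
```

The returned integer can then be passed as the `emotion_id` argument in the Usage example.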

---

## Citation

If you use this model in your research, please cite:

```bibtex
@article{ai4bharat_vits_rasa_13,
  title={VITS TTS for Indian Languages},
  author={Ashwin Sankar},
  year={2024},
  publisher={Hugging Face}
}
```