---
license: cc-by-4.0
language:
- as
- bn
- brx
- doi
- kn
- mai
- ml
- mr
- ne
- pa
- sa
- ta
- te
library_name: transformers
pipeline_tag: text-to-speech
tags:
- text-to-speech
---

# VITS TTS for Indian Languages

This repository contains a VITS-based Text-to-Speech (TTS) model fine-tuned for Indian languages. The model supports 13 Indian languages and a range of speaking styles and emotions, making it suitable for use cases such as conversational AI and audiobooks.

---

## Model Overview

The model `ai4bharat/vits_rasa_13` is based on the VITS architecture and supports the following features:

- **Languages**: 13 Indian languages (see Supported Languages).
- **Styles**: Speaking styles and emotions, each selected by a numeric style ID.
- **Speaker IDs**: Predefined male and female speaker profiles, each selected by a numeric speaker ID.

---

## Installation

```bash
pip install transformers torch soundfile
```

---

## Usage

Here's a quick example to get started:

```python
import soundfile as sf
import torch
from transformers import AutoModel, AutoTokenizer

# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModel.from_pretrained("ai4bharat/vits_rasa_13", trust_remote_code=True).to(device)
tokenizer = AutoTokenizer.from_pretrained("ai4bharat/vits_rasa_13", trust_remote_code=True)

text = "ਕੀ ਮੈਂ ਇਸ ਹਫਤੇ ਦੇ ਅੰਤ ਵਿੱਚ ਰੁੱਝਿਆ ਹੋਇਆ ਹਾਂ?"  # Example text in Punjabi
speaker_id = 16  # PAN_M
style_id = 0  # ALEXA

inputs = tokenizer(text=text, return_tensors="pt").to(device)
outputs = model(inputs['input_ids'], speaker_id=speaker_id, emotion_id=style_id)

# soundfile expects a NumPy array, so move the waveform to the CPU first.
waveform = outputs.waveform.squeeze().detach().cpu().numpy()
sf.write("audio.wav", waveform, model.config.sampling_rate)
print(outputs.waveform.shape)
```

---

## Supported Languages

- `Assamese`
- `Bengali`
- `Bodo`
- `Dogri`
- `Kannada`
- `Maithili`
- `Malayalam`
- `Marathi`
- `Nepali`
- `Punjabi`
- `Sanskrit`
- `Tamil`
- `Telugu`

---

## Speaker-Style Identifier Overview

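| Speaker Name | Speaker ID |
|--------------|------------|
| ASM_F | 0 |
| ASM_M | 1 |
| BEN_F | 2 |
| BEN_M | 3 |
| BRX_F | 4 |
| BRX_M | 5 |
| DOI_F | 6 |
| DOI_M | 7 |
| KAN_F | 8 |
| KAN_M | 9 |
| MAI_M | 10 |
| MAL_F | 11 |
| MAR_F | 12 |
| MAR_M | 13 |
| NEP_F | 14 |
| PAN_F | 15 |
| PAN_M | 16 |
| SAN_M | 17 |
| TAM_F | 18 |
| TEL_F | 19 |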

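| Style Name | Style ID |
|------------|----------|
| ALEXA | 0 |
| ANGER | 1 |
| BB | 2 |
| BOOK | 3 |
| CONV | 4 |
| DIGI | 5 |
| DISGUST | 6 |
| FEAR | 7 |
| HAPPY | 8 |
| NEWS | 10 |
| SAD | 12 |
| SURPRISE | 14 |
| UMANG | 15 |
| WIKI | 16 |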
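
Because the model call takes numeric IDs, it can be convenient to look them up by name. The following is a minimal sketch, not part of the model's API: the names `SPEAKER_IDS`, `STYLE_IDS`, and `resolve_ids` are hypothetical helpers introduced here for illustration, and the dictionaries are transcribed by hand from the two tables above.

```python
from typing import Tuple

# Hypothetical lookup tables, transcribed from the speaker and style tables above.
SPEAKER_IDS = {
    "ASM_F": 0, "ASM_M": 1, "BEN_F": 2, "BEN_M": 3, "BRX_F": 4,
    "BRX_M": 5, "DOI_F": 6, "DOI_M": 7, "KAN_F": 8, "KAN_M": 9,
    "MAI_M": 10, "MAL_F": 11, "MAR_F": 12, "MAR_M": 13, "NEP_F": 14,
    "PAN_F": 15, "PAN_M": 16, "SAN_M": 17, "TAM_F": 18, "TEL_F": 19,
}

# Style IDs are not contiguous; only the IDs listed in the table are included.
STYLE_IDS = {
    "ALEXA": 0, "ANGER": 1, "BB": 2, "BOOK": 3, "CONV": 4,
    "DIGI": 5, "DISGUST": 6, "FEAR": 7, "HAPPY": 8, "NEWS": 10,
    "SAD": 12, "SURPRISE": 14, "UMANG": 15, "WIKI": 16,
}


def resolve_ids(speaker: str, style: str) -> Tuple[int, int]:
    """Return (speaker_id, style_id) for the given names, ignoring case."""
    speaker_key = speaker.upper()
    style_key = style.upper()
    if speaker_key not in SPEAKER_IDS:
        raise KeyError(f"Unknown speaker {speaker!r}; expected one of {sorted(SPEAKER_IDS)}")
    if style_key not in STYLE_IDS:
        raise KeyError(f"Unknown style {style!r}; expected one of {sorted(STYLE_IDS)}")
    return SPEAKER_IDS[speaker_key], STYLE_IDS[style_key]


speaker_id, style_id = resolve_ids("PAN_M", "ALEXA")
print(speaker_id, style_id)  # 16 0
```

The returned values can be passed as the `speaker_id` and `emotion_id` arguments shown in the Usage example. Failing early on an unknown name avoids passing an ID that does not appear in the tables.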

---

## Citation

If you use this model in your research, please cite:

```bibtex
@article{ai4bharat_vits_rasa_13,
  title={VITS TTS for Indian Languages},
  author={Ashwin Sankar},
  year={2024},
  publisher={Hugging Face}
}
```