Whisper Large v3 Nepali - Kiran Pantha

This model is a fine-tuned version of openai/whisper-large-v3 on the OpenSLR54 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0876
  • WER (word error rate): 18.7250
  • CER (character error rate): 4.4861
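
For readers unfamiliar with the two metrics, the following is a minimal sketch of how word and character error rates are commonly computed from an edit distance. It assumes the values above are percentages and that no text normalization is applied; the card does not state the exact evaluation script, so the function names and example strings here are purely illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion from the reference
                            curr[j - 1] + 1,       # insertion into the reference
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent: word-level edits / reference word count."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, counting spaces as characters."""
    # Note: Python iterates over code points, so Devanagari combining marks
    # count as separate characters here; other tools may differ.
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical reference/hypothesis pair, not taken from the evaluation set.
print(wer("the cat sat on the mat", "the cat sit on mat"))
print(cer("the cat sat on the mat", "the cat sit on mat"))
```

Because CER counts character-level edits, it is usually much lower than WER for the same output, which is consistent with the two figures reported above.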

Model description

More information needed

Intended uses & limitations

More information needed
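
As with other Whisper checkpoints on the Hub, the model can presumably be loaded through the Transformers automatic-speech-recognition pipeline. The sketch below assumes the repository id kiranpantha/whisper-large-v3-nepali and a hypothetical local recording named sample.wav; the helper name transcribe is illustrative and not part of any library.

```python
MODEL_ID = "kiranpantha/whisper-large-v3-nepali"  # repository id as listed on the model page


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file and return the decoded text."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(
        audio_path,
        # Assumption: pinning language and task skips Whisper's language auto-detection.
        generate_kwargs={"language": "nepali", "task": "transcribe"},
    )
    return result["text"]


# Example call (downloads the checkpoint on first use):
# print(transcribe("sample.wav"))  # "sample.wav" is a hypothetical file
```

Passing a file path to the pipeline relies on ffmpeg being installed for audio decoding. At 1.54B parameters stored in F32, the checkpoint is several gigabytes, so a GPU is advisable for anything beyond short clips.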

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (PyTorch implementation, adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 5000
  • mixed_precision_training: Native AMP
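
To make the schedule concrete, the sketch below reproduces a linear warmup followed by linear decay using the values listed above, and collects the hyperparameters under the corresponding Seq2SeqTrainingArguments argument names. Treating train_batch_size as a per-device value and mapping "Native AMP" to fp16 are assumptions; the original training script is not part of this card.

```python
PEAK_LR = 1e-05
WARMUP_STEPS = 500
TOTAL_STEPS = 5000


def linear_schedule_lr(step):
    """Learning rate at a given optimizer step for linear warmup + linear decay."""
    if step < WARMUP_STEPS:
        # Ramp from 0 up to the peak over the warmup steps.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay linearly from the peak to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


# Keyword arguments mirroring the list above, e.g. for
# Seq2SeqTrainingArguments(output_dir="./out", **training_kwargs),
# where "./out" is a hypothetical output directory.
training_kwargs = {
    "learning_rate": PEAK_LR,
    "per_device_train_batch_size": 8,   # assumption: reported size is per device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "warmup_steps": WARMUP_STEPS,
    "max_steps": TOTAL_STEPS,
    "fp16": True,                       # assumption: "Native AMP" may also mean bf16
}

for step in (0, 250, 500, 2750, 5000):
    print(step, linear_schedule_lr(step))
```

Under this schedule the learning rate peaks at step 500 and reaches zero at step 5000, so the last evaluation at step 4800 is taken with a very small learning rate.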

Training results

| Training Loss | Epoch  | Step | CER     | Validation Loss | WER     |
|---------------|--------|------|---------|-----------------|---------|
| 0.2266        | 0.1200 | 300  | 11.9034 | 0.2345          | 44.7619 |
| 0.208         | 0.2399 | 600  | 11.3157 | 0.2132          | 41.1060 |
| 0.185         | 0.3599 | 900  | 9.4204  | 0.1753          | 35.6068 |
| 0.1567        | 0.4798 | 1200 | 8.8596  | 0.1634          | 33.9324 |
| 0.1411        | 0.5998 | 1500 | 8.7004  | 0.1523          | 33.0568 |
| 0.1377        | 0.7197 | 1800 | 7.3120  | 0.1371          | 29.7849 |
| 0.1147        | 0.8397 | 2100 | 7.0010  | 0.1332          | 27.7112 |
| 0.1116        | 0.9596 | 2400 | 6.5798  | 0.1212          | 26.3287 |
| 0.0757        | 1.0796 | 2700 | 6.1268  | 0.1193          | 24.7773 |
| 0.0609        | 1.1995 | 3000 | 5.8991  | 0.1154          | 24.6237 |
| 0.0612        | 1.3195 | 3300 | 5.2599  | 0.1091          | 22.0737 |
| 0.0627        | 1.4394 | 3600 | 5.3579  | 0.1045          | 21.6283 |
| 0.0582        | 1.5594 | 3900 | 5.1938  | 0.0995          | 21.5054 |
| 0.0551        | 1.6793 | 4200 | 4.7947  | 0.0956          | 19.8771 |
| 0.052         | 1.7993 | 4500 | 4.5473  | 0.0897          | 19.1244 |
| 0.0438        | 1.9192 | 4800 | 4.4861  | 0.0876          | 18.7250 |
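
The trend in these results can be summarized with a few lines of arithmetic. The sketch below copies the step, epoch, CER, and WER columns from the rows above and derives the relative error reduction between the first and last evaluation, plus an approximate number of optimizer steps per epoch. The variable names are illustrative.

```python
# (step, epoch, CER, WER) copied from the evaluation rows above
EVAL_ROWS = [
    (300, 0.1200, 11.9034, 44.7619),
    (600, 0.2399, 11.3157, 41.1060),
    (900, 0.3599, 9.4204, 35.6068),
    (1200, 0.4798, 8.8596, 33.9324),
    (1500, 0.5998, 8.7004, 33.0568),
    (1800, 0.7197, 7.3120, 29.7849),
    (2100, 0.8397, 7.0010, 27.7112),
    (2400, 0.9596, 6.5798, 26.3287),
    (2700, 1.0796, 6.1268, 24.7773),
    (3000, 1.1995, 5.8991, 24.6237),
    (3300, 1.3195, 5.2599, 22.0737),
    (3600, 1.4394, 5.3579, 21.6283),
    (3900, 1.5594, 5.1938, 21.5054),
    (4200, 1.6793, 4.7947, 19.8771),
    (4500, 1.7993, 4.5473, 19.1244),
    (4800, 1.9192, 4.4861, 18.7250),
]


def relative_reduction(first, last):
    """Percentage by which a metric dropped from its first to its last value."""
    return 100.0 * (first - last) / first


first, last = EVAL_ROWS[0], EVAL_ROWS[-1]
wer_drop = relative_reduction(first[3], last[3])
cer_drop = relative_reduction(first[2], last[2])

# Evaluation steps at which CER was worse than at the previous evaluation.
cer_regressions = [
    curr[0] for prev, curr in zip(EVAL_ROWS, EVAL_ROWS[1:]) if curr[2] > prev[2]
]

# Optimizer steps per pass over the training set, inferred from the last row.
steps_per_epoch = last[0] / last[1]

print(f"WER reduced by {wer_drop:.1f}%, CER by {cer_drop:.1f}%")
print(f"CER regressions at steps: {cer_regressions}")
print(f"Approximate steps per epoch: {steps_per_epoch:.0f}")
```

WER falls at every evaluation, while CER rises slightly once (at step 3600) before continuing to fall. Both metrics are still improving at the final evaluation. The inferred steps per epoch, multiplied by the batch size of 8, suggests roughly 20,000 training examples, assuming a single device and no gradient accumulation.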

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cxx11.abi
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size

  • Parameters: 1.54B
  • Tensor type: F32
  • Format: Safetensors

Model repository: kiranpantha/whisper-large-v3-nepali (fine-tuned from openai/whisper-large-v3)