metadata
language:
  - ms
  - ta
  - zh
  - id
library_name: transformers
base_model:
  - mesolitica/nanot5-base-malaysian-cased
pipeline_tag: translation

NanoT5 Base Malaysian Translation V2.1

Fine-tuned from https://huggingface.co/mesolitica/nanot5-base-malaysian-cased with a 2048-token context length on 9B tokens of translation data.

  • This model can translate localized text into standard text.
  • This model can also translate in reverse, from standard to localized text, which is suitable for text augmentation.
  • This model can translate code.
  • This model handles code-switching natively.
  • This model should preserve \n, \t, and \r as they are.
  • Better Science and Math context translation compared to V2.
  • Better Manglish translation compared to V2.
  • Better Cantonese translation compared to V2.
  • Better Tamil and Tanglish translation compared to V2.
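The list above says the model should keep \n, \t, and \r unchanged. A minimal sketch of how that could be checked on any source/translation pair is shown here. The helper names `control_char_counts` and `preserves_control_chars` are hypothetical and not part of this model or of transformers, and the sample strings are hand-written illustrations, not real model output.

```python
from collections import Counter

# Control characters the model card says should be kept as they are.
CONTROL_CHARS = ("\n", "\t", "\r")


def control_char_counts(text: str) -> dict[str, int]:
    """Count how often each tracked control character occurs in text."""
    counts = Counter(text)
    return {ch: counts[ch] for ch in CONTROL_CHARS}


def preserves_control_chars(source: str, translation: str) -> bool:
    """Return True if both strings contain the same number of each control character.

    This compares counts only, not positions, so it is a coarse sanity check.
    """
    return control_char_counts(source) == control_char_counts(translation)


# Hand-written example strings (assumptions for illustration only).
source = "def tambah(a, b):\n\treturn a + b\r\n"
kept = "def add(a, b):\n\treturn a + b\r\n"
flattened = "def add(a, b): return a + b"

print(preserves_control_chars(source, kept))       # True
print(preserves_control_chars(source, flattened))  # False
```

A check like this may be useful when translating code or structured text in bulk, where a dropped newline or tab can silently change meaning.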

Training logs are available on Weights & Biases (wandb) at https://wandb.ai/huseinzol05/nanot5-base-malaysian-cased-translation-v6-multipack-post