---
language:
- ms
- ta
- zh
- id
library_name: transformers
base_model:
- mesolitica/nanot5-base-malaysian-cased
pipeline_tag: translation
---
NanoT5 Base Malaysian Translation V2.1
Fine-tuned from https://huggingface.co/mesolitica/nanot5-base-malaysian-cased with a 2048-token context length on 9B tokens of translation data.
- This model can translate localized text into standard text.
- This model can reverse-translate standard text into localized text, which is suitable for text augmentation.
- This model can translate code.
- This model natively handles code-switching.
- This model should preserve `\n`, `\t`, and `\r` as they are.
- Better Science and Math context translation compared to V2.
- Better Manglish translation compared to V2.
- Better Cantonese translation compared to V2.
- Better Tamil and Tanglish translation compared to V2.
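
The list above states that `\n`, `\t`, and `\r` should pass through translation unchanged. A minimal sketch of how that could be checked on any source and translation pair is shown here; the helper names are hypothetical and not part of this model or of `transformers`, and the check only compares counts, not positions.

```python
from collections import Counter

# Control characters the model card says should be kept as they are.
CONTROL_CHARS = ("\n", "\t", "\r")


def control_char_counts(text: str) -> dict[str, int]:
    """Count each tracked control character in `text`."""
    counts = Counter(text)
    return {ch: counts[ch] for ch in CONTROL_CHARS}


def preserves_control_chars(source: str, translation: str) -> bool:
    """Return True when both strings hold the same number of each control character."""
    return control_char_counts(source) == control_char_counts(translation)
```

For example, with hand-written strings (not real model output), `preserves_control_chars("hai\tapa khabar?\n", "hi\thow are you?\n")` returns `True`, while a translation that drops the tab or the newline returns `False`.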
Training logs are on Wandb at https://wandb.ai/huseinzol05/nanot5-base-malaysian-cased-translation-v6-multipack-post
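
Since the card lists `transformers` as the library and `translation` as the pipeline tag, a rough sketch of loading and calling the model might look like the following. Two things here are assumptions that this card does not state: the repository ID in `MODEL_ID` is guessed from the model name, and the instruction prefix passed to `build_prompt` is a hypothetical prompt format. Replace both with the values the model was actually published and trained with.

```python
# Assumption: guessed from the model name; replace with the real repository ID.
MODEL_ID = "mesolitica/nanot5-base-malaysian-translation-v2.1"

# Assumption: hypothetical instruction prefix; the card does not document the prompt format.
DEFAULT_INSTRUCTION = "terjemah ke Melayu: "


def build_prompt(text: str, instruction: str = DEFAULT_INSTRUCTION) -> str:
    """Prepend the task instruction to the text to translate."""
    return f"{instruction}{text}"


def translate(text: str, instruction: str = DEFAULT_INSTRUCTION,
              max_new_tokens: int = 256) -> str:
    """Translate `text` with a seq2seq checkpoint loaded from MODEL_ID."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate to the 2048-token context length used during fine-tuning.
    inputs = tokenizer(
        build_prompt(text, instruction),
        return_tensors="pt",
        truncation=True,
        max_length=2048,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("...")` downloads the checkpoint on first use, so loading the tokenizer and model once and reusing them is preferable for batch work.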