mesolitica/mallam-1.1B-4096
Text Generation
Pretrained from scratch with a 4096-token context length on 90B tokens of Malaysian text. Paper: https://huggingface.co/papers/2401.14680