Maya: An Instruction Finetuned Multilingual Multimodal Model Paper • 2412.07112 • Published 28 days ago • 26
Reasoning Datasets Collection Reasoning datasets that are trending 🔥 • 10 items • Updated 3 days ago • 17
Falcon3 Collection Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 18 days ago • 75
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 18 days ago • 116
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated 15 days ago • 208
OLMo 2 Collection Artifacts for the second set of OLMo models. • 22 items • Updated about 1 hour ago • 69
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 15 days ago • 197
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 101
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding Paper • 2408.11049 • Published Aug 20, 2024 • 12
view article Article How to build a custom text classifier without days of human labeling By sdiazlor • Oct 17, 2024 • 55
⛈️ Llama-3.1 Storm Models Collection Fine-tuned Llama 3.1 8B model with superior reasoning, conversation abilities, and function calling! • 3 items • Updated Aug 25, 2024 • 15
Code Evaluation Collection Collection of Papers on Code Evaluation (from code generation language models) • 45 items • Updated Oct 29, 2024 • 15
Llama-3.1 Quantization Collection Neural Magic quantized Llama-3.1 models • 22 items • Updated Nov 22, 2024 • 42
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 638