Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published Nov 15, 2024 • 71
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published 12 days ago • 86
Scaling Test-Time Compute with Open Models Collection Models and datasets used in our blog post: https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute • 10 items • Updated about 18 hours ago • 19
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6, 2024 • 54
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published 28 days ago • 68
LLM Reasoning Papers Collection Papers to improve reasoning capabilities of LLMs • 17 items • Updated 15 days ago • 93
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated 15 days ago • 208
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31, 2024 • 76
view article Article Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth By mlabonne • Jul 29, 2024 • 260
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains Paper • 2407.18961 • Published Jul 18, 2024 • 40
Research projects on top of vLLM Collection Papers cited in https://blog.vllm.ai/2024/07/25/lfai-perf.html • 6 items • Updated Jul 29, 2024 • 12
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 638
Preference Datasets for DPO Collection This collection contains a list of curated preference datasets for DPO fine-tuning for intent alignment of LLMs • 7 items • Updated 26 days ago • 38
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 6 items • Updated Jul 21, 2024 • 69
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 354