LeMaterial: an open source initiative to accelerate materials discovery and research • 28 days ago • 32
Post 1474 I was initially pretty sceptical about Meta's Coconut paper [1] because the largest perf gains were reported on toy linguistic problems. However, these results on machine translation are pretty impressive! https://x.com/casper_hansen_/status/1875872309996855343 Together with the recent PRIME method [2] for scaling RL, reasoning for open models is looking pretty exciting for 2025! [1] Training Large Language Models to Reason in a Continuous Latent Space (2412.06769) [2] https://huggingface.co/blog/ganqu/prime
Article 🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark By wolfram • 4 days ago • 30
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated Nov 14, 2024 • 543
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction Paper • 2412.04454 • Published Dec 5, 2024 • 57
Falcon3 Collection Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 18 days ago • 75
Article FineWeb2-C: Help Build Better Language Models in Your Language By davanstrien • 14 days ago • 11