Thomas Wolf PRO

thomwolf · https://thomwolf.io

AI & ML interests

NLP and open-source :-)

Recent Activity

updated a Space about 4 hours ago

science/README

reacted to lewtun's post with 🔥 about 21 hours ago

I was initially pretty sceptical about Meta's Coconut paper [1] because the largest perf gains were reported on toy linguistic problems. However, these results on machine translation are pretty impressive! https://x.com/casper_hansen_/status/1875872309996855343

Together with the recent PRIME method [2] for scaling RL, reasoning for open models is looking pretty exciting for 2025!

[1] https://huggingface.co/papers/2412.06769
[2] https://huggingface.co/blog/ganqu/prime

liked a model 1 day ago

deepseek-ai/DeepSeek-V3

Articles

Introducing smolagents: simple agents that write actions in code.

FineWeb2-C: Help Build Better Language Models in Your Language

LeMaterial: an open source initiative to accelerate materials discovery and research

FineVideo: behind the scenes

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

A failed experiment: Infini-Attention, and why we should keep trying?

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Constitutional AI with Open LLMs

Open LLM Leaderboard: DROP deep dive

What's going on with the Open LLM Leaderboard?

Can foundation models label data like humans?

thomwolf's activity

upvoted an article 1 day ago

🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark

Article • 4 days ago • 30

upvoted a collection 1 day ago

Phi-3

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated Nov 14, 2024 • 543

upvoted a paper 1 day ago

Phi-4 Technical Report

Paper • 2412.08905 • Published 26 days ago • 97

upvoted a paper 2 days ago

Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Paper • 2412.04454 • Published Dec 5, 2024 • 57

upvoted a collection 12 days ago

Falcon3

Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 18 days ago • 75

upvoted an article 14 days ago

FineWeb2-C: Help Build Better Language Models in Your Language

Article • 14 days ago • 11

upvoted a collection 19 days ago

TabuLa-8B

Training, eval suite, and model from the paper "Large Scale Transfer Learning for Tabular Data via Language Modeling" https://arxiv.org/abs/2406.12031 • 4 items • Updated Jun 19, 2024 • 11

upvoted a paper 20 days ago

LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment

Paper • 2412.04814 • Published Dec 6, 2024 • 45

upvoted a paper 21 days ago

Solving Quantitative Reasoning Problems with Language Models

Paper • 2206.14858 • Published Jun 29, 2022 • 1

upvoted a collection 24 days ago

GUI agents

A collection of papers on GUI agents • 3 items • Updated 24 days ago • 5

upvoted a paper 24 days ago

AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials

Paper • 2412.09605 • Published 25 days ago • 26

upvoted a collection 26 days ago

🥂 FineWeb2

3 items • Updated 30 days ago • 11

upvoted a collection about 1 month ago

🤖 Agents

21 items • Updated 6 days ago • 80

upvoted 2 papers about 1 month ago

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 77

Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations

Paper • 2411.00640 • Published Nov 1, 2024 • 3

upvoted an article about 2 months ago

The Rise of Agentic Data Generation

Article • Jul 15, 2024 • 80

upvoted a paper about 2 months ago

BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays

Paper • 2410.21969 • Published Oct 29, 2024 • 9

upvoted a paper 2 months ago

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Paper • 2411.02265 • Published Nov 4, 2024 • 24

upvoted an article 2 months ago

Breaking Barriers: The Critical Role of Art and Design in Advancing AI Capabilities

Article • Jan 15, 2024 • 3

upvoted a collection 3 months ago

LoLCATS

Linearizing LLMs with high quality and efficiency. We linearize the full Llama 3.1 model family -- 8b, 70b, 405b -- for the first time! • 4 items • Updated Oct 14, 2024 • 15