Omar Sanseviero

osanseviero

https://osanseviero.github.io/hackerllama/

AI & ML interests

Llamas, model merging, massive ASR for data collection, 3D ML, on-device ML, quantization, model judging, ML in browser, healthcare applications, education, intersection of art and ML. 🦙

Recent Activity

liked a Space about 4 hours ago

reach-vb/2024-ai-timeline

updated a collection about 5 hours ago

Papers I want to read

upvoted a collection about 5 hours ago

Articles

Llama can now see and run on your device - welcome Llama 3.2

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

WWDC 24: Running Mistral 7B with Core ML

How we leveraged distilabel to create an Argilla 2.0 Chatbot

Welcome Gemma 2 - Google's new open LLM

Welcome Llama 3 - Meta's new open LLM

CodeGemma - an official Google release for code LLMs

🪆 Introduction to Matryoshka Embedding Models

Welcome Gemma - Google's new open LLM

Constitutional AI with Open LLMs

Preference Tuning LLMs with Direct Preference Optimization Methods

Mixture of Experts Explained

Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face

Inference for PROs

Spread Your Wings: Falcon 180B is here

Code Llama: Llama 2 learns to code

Results of the Open Source AI Game Jam

Llama 2 is here - get it on Hugging Face

The Falcon has landed in the Hugging Face ecosystem

Hugging Face Machine Learning Demos on arXiv

What's new in Diffusers? 🎨

Announcing Evaluation on the Hub

An Introduction to Deep Reinforcement Learning

Welcome spaCy to the 🤗 Hub

Sentence Transformers in the 🤗 Hub

osanseviero's activity

upvoted a collection about 5 hours ago

🤖 Agents

21 items • Updated 5 days ago • 74

upvoted a paper about 8 hours ago

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

Paper • 2411.10442 • Published Nov 15, 2024 • 70

upvoted 4 papers 1 day ago

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published 2 days ago • 37

A3: Android Agent Arena for Mobile GUI Agents

Paper • 2501.01149 • Published 3 days ago • 20

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published 3 days ago • 64

2 OLMo 2 Furious

Paper • 2501.00656 • Published 4 days ago • 9

upvoted 6 papers 2 days ago

HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation

Paper • 2412.21199 • Published 5 days ago • 9

On the Compositional Generalization of Multimodal LLMs for Medical Imaging

Paper • 2412.20070 • Published 8 days ago • 39

Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization

Paper • 2412.18525 • Published 11 days ago • 60

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published 8 days ago • 67

Xmodel-2 Technical Report

Paper • 2412.19638 • Published 8 days ago • 21

HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving

Paper • 2412.20735 • Published 6 days ago • 8

upvoted 4 papers 7 days ago

OpenAI o1 System Card

Paper • 2412.16720 • Published 14 days ago • 29

In Case You Missed It: ARC 'Challenge' Is Not That Challenging

Paper • 2412.17758 • Published 12 days ago • 16

ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing

Paper • 2412.14711 • Published 17 days ago • 15

YuLan-Mini: An Open Data-efficient Language Model

Paper • 2412.17743 • Published 12 days ago • 59

upvoted a paper 9 days ago

RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement

Paper • 2412.12881 • Published 19 days ago • 1

upvoted an article 11 days ago

🌁#81: Key AI Concepts to Follow in 2025

Article • 12 days ago • 18

upvoted a paper 12 days ago

DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought

Paper • 2412.17498 • Published 13 days ago • 21

upvoted a paper 16 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 16 days ago • 334