SDPO: Segment-Level Direct Preference Optimization for Social Agents Paper • 2501.01821 • Published 3 days ago • 10
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction Paper • 2501.01957 • Published 3 days ago • 19
Fewer-token Neural Speech Codec with Time-invariant Codes Paper • 2310.00014 • Published Sep 15, 2023 • 2
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning Paper • 2412.15797 • Published 17 days ago • 16
PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models Paper • 2412.18608 • Published 13 days ago • 12
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing Paper • 2412.14711 • Published 19 days ago • 15
In Case You Missed It: ARC 'Challenge' Is Not That Challenging Paper • 2412.17758 • Published 14 days ago • 16
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published 4 days ago • 41
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 5 days ago • 82
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper • 2412.19723 • Published 10 days ago • 70
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation Paper • 2412.18597 • Published 13 days ago • 19
A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression Paper • 2412.17483 • Published 14 days ago • 29
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization Paper • 2412.17739 • Published 14 days ago • 37
The Superposition of Diffusion Models Using the Itô Density Estimator Paper • 2412.17762 • Published 14 days ago • 12
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey Paper • 2412.18619 • Published 22 days ago • 49
Large Concept Models: Language Modeling in a Sentence Representation Space Paper • 2412.08821 • Published 26 days ago • 11