Welcome FalconMamba: The first strong attention-free 7B model Article • Published Aug 12, 2024 • 108
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion Paper • 2407.01392 • Published Jul 1, 2024 • 39
Wavelets Are All You Need for Autoregressive Image Generation Paper • 2406.19997 • Published Jun 28, 2024 • 29
Preference Tuning LLMs with Direct Preference Optimization Methods Article • Published Jan 18, 2024 • 41
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models Paper • 2401.15947 • Published Jan 29, 2024 • 49