Large Language Model Evaluation via Matrix Nuclear-Norm Paper • 2410.10672 • Published Oct 14, 2024 • 19
One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation Paper • 2410.07170 • Published Oct 9, 2024 • 15
Only-IF:Revealing the Decisive Effect of Instruction Diversity on Generalization Paper • 2410.04717 • Published Oct 7, 2024 • 18
Cottention: Linear Transformers With Cosine Attention Paper • 2409.18747 • Published Sep 27, 2024 • 16
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction Paper • 2409.17422 • Published Sep 25, 2024 • 25 • 5
EuroLLM: Multilingual Language Models for Europe Paper • 2409.16235 • Published Sep 24, 2024 • 26 • 4
BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline Paper • 2408.15079 • Published Aug 27, 2024 • 52 • 4
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models Paper • 2408.15518 • Published Aug 28, 2024 • 42 • 4
Better Alignment with Instruction Back-and-Forth Translation Paper • 2408.04614 • Published Aug 8, 2024 • 15 • 3
Better Alignment with Instruction Back-and-Forth Translation Paper • 2408.04614 • Published Aug 8, 2024 • 15
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31, 2024 • 76 • 3