Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models • Paper 2411.07140 • Published Nov 11, 2024
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models • Paper 2410.11710 • Published Oct 15, 2024
E^2-LLM: Efficient and Extreme Length Extension of Large Language Models • Paper 2401.06951 • Published Jan 13, 2024