xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning Paper • 2401.07037 • Published Jan 13, 2024 • 2
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! Paper • 2402.12343 • Published Feb 19, 2024
m3P: Towards Multimodal Multilingual Translation with Multimodal Prompt Paper • 2403.17556 • Published Mar 26, 2024 • 1
The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis Paper • 2404.01204 • Published Apr 1, 2024
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model Paper • 2404.04167 • Published Apr 5, 2024 • 12
MuPT: A Generative Symbolic Music Pretrained Transformer Paper • 2404.06393 • Published Apr 9, 2024 • 15
R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models Paper • 2406.01359 • Published Jun 3, 2024
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models Paper • 2406.01375 • Published Jun 3, 2024
II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models Paper • 2406.05862 • Published Jun 9, 2024 • 4
UniCoder: Scaling Code Large Language Model via Universal Code Paper • 2406.16441 • Published Jun 24, 2024 • 2
GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models Paper • 2406.14550 • Published Jun 20, 2024 • 4
MMRA: A Benchmark for Multi-granularity Multi-image Relational Association Paper • 2407.17379 • Published Jul 24, 2024 • 2
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm Paper • 2408.08072 • Published Aug 15, 2024 • 34
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models Paper • 2410.11710 • Published Oct 15, 2024 • 19
KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks Paper • 2410.06526 • Published Oct 9, 2024
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions Paper • 2410.20424 • Published Oct 27, 2024 • 39
Aligning CodeLLMs with Direct Preference Optimization Paper • 2410.18585 • Published Oct 24, 2024
Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models Paper • 2411.07140 • Published Nov 11, 2024 • 33
Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models Paper • 2412.15265 • Published 21 days ago