- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 104
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 40
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 19
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
  Paper • 2402.09727 • Published • 36

Collections including paper arxiv:2401.04925
- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 22
- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 17
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 29
- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 16

- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 62
- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 16
- Lost in the Middle: How Language Models Use Long Contexts
  Paper • 2307.03172 • Published • 37
- Attention Is All You Need
  Paper • 1706.03762 • Published • 50

- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 52
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 18
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 19
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 6

- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 16
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 54
- Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk
  Paper • 2401.05033 • Published • 16
- Towards Conversational Diagnostic AI
  Paper • 2401.05654 • Published • 16

- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 16
- ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
  Paper • 2312.10003 • Published • 37
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 114

- Unicron: Economizing Self-Healing LLM Training at Scale
  Paper • 2401.00134 • Published • 9
- Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models
  Paper • 2401.00788 • Published • 21
- Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding
  Paper • 2401.04398 • Published • 21
- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 16

- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 62
- InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes
  Paper • 2401.05335 • Published • 27
- Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk
  Paper • 2401.05033 • Published • 16
- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 16