Don't Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks • Paper • 2412.15605 • Published 30 days ago • 2
Transformer Explainer: Interactive Learning of Text-Generative Models • Paper • 2408.04619 • Published Aug 8, 2024 • 156
Retentive Network: A Successor to Transformer for Large Language Models • Paper • 2307.08621 • Published Jul 17, 2023 • 170