Jamba: A Hybrid Transformer-Mamba Language Model • Paper • 2403.19887 • Published Mar 28, 2024
YaRN: Efficient Context Window Extension of Large Language Models • Paper • 2309.00071 • Published Aug 31, 2023