mp1704's Collections
maymo
updated
Jamba: A Hybrid Transformer-Mamba Language Model
Paper
•
2403.19887
•
Published
•
104
sDPO: Don't Use Your Data All at Once
Paper
•
2403.19270
•
Published
•
40
ViTAR: Vision Transformer with Any Resolution
Paper
•
2403.18361
•
Published
•
52
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
Paper
•
2403.18814
•
Published
•
45
The Unreasonable Ineffectiveness of the Deeper Layers
Paper
•
2403.17887
•
Published
•
78
LLM Agent Operating System
Paper
•
2403.16971
•
Published
•
65
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Paper
•
2403.14624
•
Published
•
51
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper
•
2403.13372
•
Published
•
62
RAFT: Adapting Language Model to Domain Specific RAG
Paper
•
2403.10131
•
Published
•
67
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Paper
•
2403.09611
•
Published
•
125
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper
•
2403.03507
•
Published
•
183
SaulLM-7B: A pioneering Large Language Model for Law
Paper
•
2403.03883
•
Published
•
77
Design2Code: How Far Are We From Automating Front-End Engineering?
Paper
•
2403.03163
•
Published
•
93
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
•
2402.17764
•
Published
•
605
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Paper
•
2402.13753
•
Published
•
114
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Paper
•
2402.13064
•
Published
•
47
Chain-of-Thought Reasoning Without Prompting
Paper
•
2402.10200
•
Published
•
104
OLMo: Accelerating the Science of Language Models
Paper
•
2402.00838
•
Published
•
82
ReFT: Representation Finetuning for Language Models
Paper
•
2404.03592
•
Published
•
91
Rho-1: Not All Tokens Are What You Need
Paper
•
2404.07965
•
Published
•
88
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Paper
•
2404.07143
•
Published
•
105
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
Paper
•
2404.14619
•
Published
•
126
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
Paper
•
2405.00732
•
Published
•
119
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Paper
•
2405.01535
•
Published
•
120