- sentence-transformers/all-mpnet-base-v2 (Sentence Similarity • Updated • 19.1M • 946)
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (Paper • 1910.10683 • Published • 10)
- google-t5/t5-base (Translation • Updated • 1.9M • 655)
- Attention Is All You Need (Paper • 1706.03762 • Published • 50)
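The two checkpoints in the list above are hosted on the Hugging Face Hub, so they can be pulled down with the standard `sentence-transformers` and `transformers` APIs. The snippet below is a minimal sketch, assuming both libraries and PyTorch are installed; it only shows loading and a single inference call, not any particular collection's workflow.

```python
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Sentence embeddings with all-mpnet-base-v2
embedder = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
embeddings = embedder.encode(["Attention is all you need."])
print(embeddings.shape)  # (1, 768)

# Text-to-text generation with t5-base, using a translation prefix as in the T5 paper
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")
inputs = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```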
Collections
Collections including paper arxiv:1910.10683
- Collection:
  - Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (Paper • 1910.10683 • Published • 10)
  - AutoTrain: No-code training for state-of-the-art models (Paper • 2410.15735 • Published • 59)
  - LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report (Paper • 2405.00732 • Published • 119)
  - LoRA: Low-Rank Adaptation of Large Language Models (Paper • 2106.09685 • Published • 31)
- Collection:
  - Lumiere: A Space-Time Diffusion Model for Video Generation (Paper • 2401.12945 • Published • 86)
  - Long-form factuality in large language models (Paper • 2403.18802 • Published • 24)
  - ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion (Paper • 2403.18818 • Published • 24)
  - TC4D: Trajectory-Conditioned Text-to-4D Generation (Paper • 2403.17920 • Published • 16)
- Collection:
  - Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping (Paper • 2402.14083 • Published • 47)
  - GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints (Paper • 2305.13245 • Published • 5)
  - Training a T5 Using Lab-sized Resources (Paper • 2208.12097 • Published • 1)
  - Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints (Paper • 2212.05055 • Published • 5)
- Collection:
  - Nemotron-4 15B Technical Report (Paper • 2402.16819 • Published • 42)
  - Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models (Paper • 2402.19427 • Published • 52)
  - RWKV: Reinventing RNNs for the Transformer Era (Paper • 2305.13048 • Published • 15)
  - Reformer: The Efficient Transformer (Paper • 2001.04451 • Published)
- Collection:
  - bigcode/the-stack (Viewer • Updated • 546M • 5.26k • 757)
  - Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (Paper • 1910.10683 • Published • 10)
  - allenai/c4 (Viewer • Updated • 10.4B • 322k • 346)
  - allenai/ai2_arc (Viewer • Updated • 7.79k • 109k • 163)
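The dataset entries in the collection above are available through the `datasets` library. The snippet below is a minimal sketch, assuming `datasets` is installed; C4 is very large, so it is streamed rather than downloaded in full, and bigcode/the-stack is a gated dataset that requires accepting its terms on the Hub before loading.

```python
from datasets import load_dataset

# Stream the English config of allenai/c4 instead of materializing it on disk.
c4 = load_dataset("allenai/c4", "en", split="train", streaming=True)
for i, example in enumerate(c4):
    print(example["text"][:80])
    if i == 2:
        break

# allenai/ai2_arc is small enough to load fully; "ARC-Challenge" is one of its configs.
arc = load_dataset("allenai/ai2_arc", "ARC-Challenge", split="validation")
print(arc[0]["question"])
```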
- Collection:
  - Training Compute-Optimal Large Language Models (Paper • 2203.15556 • Published • 10)
  - Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism (Paper • 1909.08053 • Published • 2)
  - Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (Paper • 1910.10683 • Published • 10)
  - Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling (Paper • 2304.01373 • Published • 9)
- Collection:
  - SMOTE: Synthetic Minority Over-sampling Technique (Paper • 1106.1813 • Published • 1)
  - Scikit-learn: Machine Learning in Python (Paper • 1201.0490 • Published • 1)
  - Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation (Paper • 1406.1078 • Published)
  - Distributed Representations of Sentences and Documents (Paper • 1405.4053 • Published)