matlok's Collections
Papers - Text - Encoders
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (arXiv:1810.04805)
Transformers Can Achieve Length Generalization But Not Robustly (arXiv:2402.09371)
Triple-Encoders: Representations That Fire Together, Wire Together (arXiv:2402.12332)
BERTs are Generative In-Context Learners (arXiv:2406.04823)
ByT5: Towards a token-free future with pre-trained byte-to-byte models (arXiv:2105.13626)
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference (arXiv:2412.13663)