nilq's Collections
Dynamics of Transformer Language Model Features
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Paper • 2203.05482 • Published • 6
Diverse Weight Averaging for Out-of-Distribution Generalization
Paper • 2205.09739 • Published • 1
Fusing finetuned models for better pretraining
Paper • 2204.03044 • Published • 5
Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs
Paper • 2309.07311 • Published • 3
Steering Llama 2 via Contrastive Activation Addition
Paper • 2312.06681 • Published • 11
Knowledge Fusion of Large Language Models
Paper • 2401.10491 • Published • 3
ReAGent: Towards A Model-agnostic Feature Attribution Method for Generative Language Models
Paper • 2402.00794 • Published • 1
Resolving Interference When Merging Models
Paper • 2306.01708 • Published • 13
Tracking Universal Features Through Fine-Tuning and Model Merging
Paper • 2410.12391 • Published • 5