Dynamics of Transformer Language Model Features - a nilq Collection

updated Oct 17, 2024

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

Paper • 2203.05482 • Published Mar 10, 2022

Diverse Weight Averaging for Out-of-Distribution Generalization

Paper • 2205.09739 • Published May 19, 2022

Fusing finetuned models for better pretraining

Paper • 2204.03044 • Published Apr 6, 2022

Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs

Paper • 2309.07311 • Published Sep 13, 2023

Steering Llama 2 via Contrastive Activation Addition

Paper • 2312.06681 • Published Dec 9, 2023

Knowledge Fusion of Large Language Models

Paper • 2401.10491 • Published Jan 19, 2024

ReAGent: Towards A Model-agnostic Feature Attribution Method for Generative Language Models

Paper • 2402.00794 • Published Feb 1, 2024

Resolving Interference When Merging Models

Paper • 2306.01708 • Published Jun 2, 2023

Tracking Universal Features Through Fine-Tuning and Model Merging

Paper • 2410.12391 • Published Oct 16, 2024