---
datasets:
- EleutherAI/lambada_openai
---


*Data influence models for [LAMBADA](https://huggingface.co/datasets/EleutherAI/lambada_openai) fine-tuned from [bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased).*

The `main` branch contains the data influence model for the 10k-step checkpoint.
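
To pin a specific branch when loading, a minimal sketch with the `transformers` library might look like the following. It rests on assumptions not stated in this card: that the checkpoint loads through `AutoModelForSequenceClassification` (a scoring head on top of BERT), and that the tokenizer is the one from the base model linked above. The function name `load_influence_model` and the `repo_id` argument are hypothetical; pass this repository's actual id.

```python
def load_influence_model(repo_id: str, revision: str = "main"):
    """Load a data influence model from a given branch (hypothetical helper).

    repo_id:  the Hub id of this repository (not stated in this card).
    revision: branch, tag, or commit hash; "main" holds the 10k-step model.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Assumption: the tokenizer matches the base model this card says it was fine-tuned from.
    tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

    # Assumption: the weights load as a sequence-classification (scoring) model.
    model = AutoModelForSequenceClassification.from_pretrained(
        repo_id, revision=revision
    )
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


# Example call (requires network access and the real repository id):
# tokenizer, model = load_influence_model("<this-repo-id>", revision="main")
```

Passing `revision` explicitly keeps the loaded weights reproducible if the default branch is later updated.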

Paper: [MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models](https://arxiv.org/pdf/2406.06046)

Official codebase: [cxcscmu/MATES](https://github.com/cxcscmu/MATES)