Training Data?

#32

by binarymax - opened 8 days ago

8 days ago

Hi! Excellent work on this model. Can you please share more information on the training data used? The sources are quite vague, and more specifics would help us understand which content and domains the model aligns with best.

NohTow

3 days ago

Hello,

Unfortunately, this is the most we can share about the data; I am deeply sorry about this.
Hopefully the broad domains and experiments give some signal about the domains ModernBERT is aligned with; the contents themselves should be quite diverse.
