ModernBART wen?

#38
by Fizzarolli - opened

Title is /j, but in all seriousness, is there any interest out there in producing a BART/T5-like encoder-decoder model with the improvements here (Flash Attention, RoPE, etc.)?

Fizzarolli changed discussion status to closed
Fizzarolli changed discussion status to open

(misclick xD)
