Does the Hugging Face version of this model split across a pair of 3090s like the JAX implementation does?
#2, opened by gestalt73
Does the Hugging Face version of this model split across a pair of 3090s like the GitHub implementation (sorry, not JAX) does?
Yes
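For anyone landing here later, a minimal sketch of how a split like this is usually requested through `transformers`, assuming `accelerate` is installed alongside it. The thread does not name the checkpoint or the settings used, so the model id below is a placeholder and the 22 GiB per-card budget (some headroom under a 3090's 24 GB) is only an assumption. `build_max_memory` and `load_sharded` are hypothetical helper names, not part of any library.

```python
def build_max_memory(num_gpus=2, per_gpu_gib=22):
    """Build a per-GPU memory budget for device_map="auto".

    Keys are GPU indices and values are size strings. The 22 GiB
    default is an assumption that leaves headroom on a 24 GB card.
    """
    return {gpu: f"{per_gpu_gib}GiB" for gpu in range(num_gpus)}


def load_sharded(model_id, max_memory):
    """Load a causal LM with its layers spread over the listed GPUs.

    Imports are local so the helper above works without torch installed.
    """
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",          # let accelerate place layers across devices
        max_memory=max_memory,      # cap how much each GPU may hold
        torch_dtype=torch.float16,  # half precision to reduce the footprint
    )


if __name__ == "__main__":
    budget = build_max_memory()
    print(budget)
    # Placeholder id: substitute the repository this discussion belongs to.
    # model = load_sharded("your-org/your-model", budget)
```

Whether the weights actually fit depends on the checkpoint size and dtype, so treat the budget as a starting point to adjust.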
stellaathena changed discussion status to closed