Tom Aarsen committed
Commit e2b264e · 1 Parent(s): 96cb24d

Reformat README somewhat

Files changed (1)
1. README.md +2 -2
README.md CHANGED
@@ -8630,12 +8630,12 @@ You can finetune this model on your own dataset.
  | cosine_mrr@10 | 0.5482 |
  | cosine_map@100 | 0.4203 |

- We've evaluated [sentence-transformers/static-retrieval-mrl-en-v1](https://huggingface.co/sentence-transformers/static-retrieval-mrl-en-v1) on NanoBEIR and plotted it against the inferenec speed computed on my [hardware](#hardware-details). For the inference speed tests, we calculated the number of computed query embeddings of the [GooAQ dataset](https://huggingface.co/datasets/sentence-transformers/gooaq) per second, either on CPU or GPU.
+ We've evaluated [sentence-transformers/static-retrieval-mrl-en-v1](https://huggingface.co/sentence-transformers/static-retrieval-mrl-en-v1) on NanoBEIR and plotted it against the inference speed computed on a RTX 3090 and i7-13700K. For the inference speed tests, we calculated the number of computed query embeddings of the [GooAQ dataset](https://huggingface.co/datasets/sentence-transformers/gooaq) per second, either on CPU or GPU.

  We evaluate against 3 types of models:
  1. Attention-based dense embedding models, e.g. traditional Sentence Transformer models like [`all-mpnet-base-v2`](https://huggingface.co/sentence-transformers/all-mpnet-base-v2), [`bge-base-en-v1.5`](https://huggingface.co/BAAI/bge-base-en-v1.5), and [`gte-large-en-v1.5`](https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5).
  2. Static Embedding-based models, e.g. [`static-retrieval-mrl-en-v1`](https://huggingface.co/sentence-transformers/static-retrieval-mrl-en-v1), [`potion-base-8M`](https://huggingface.co/minishlab/potion-base-8M), [`M2V_base_output`](https://huggingface.co/minishlab/M2V_base_output), and [`glove.6B.300d`](https://huggingface.co/sentence-transformers/average_word_embeddings_glove.6B.300d).
- 3. Sparse bag-of-words model, BM25, often a difficult baseline.
+ 3. Sparse bag-of-words model, BM25, often a strong baseline.

  <details><summary>Click to expand BM25 implementation details</summary>
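For context, the throughput figure described in the changed paragraph (query embeddings per second over GooAQ questions) could be measured with a sketch like the one below. This is a minimal illustration, not the exact benchmark script behind the README numbers: the sample size, batch size, and timing approach are assumptions.

```python
# Hypothetical benchmark sketch: measure query embeddings per second for
# static-retrieval-mrl-en-v1 on GooAQ questions. Sample size, batch size,
# and device choice are assumptions, not the exact setup used in the README.
import time

from datasets import load_dataset
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "sentence-transformers/static-retrieval-mrl-en-v1",
    device="cpu",  # or "cuda" to measure GPU throughput instead
)

# Take a slice of GooAQ queries; the real benchmark may use more or all of them.
queries = load_dataset("sentence-transformers/gooaq", split="train[:100000]")["question"]

start = time.perf_counter()
model.encode(queries, batch_size=2048, show_progress_bar=False)
elapsed = time.perf_counter() - start

print(f"{len(queries) / elapsed:,.0f} query embeddings per second")
```

Running the same script with `device="cuda"` versus `device="cpu"` yields the two throughput numbers the paragraph refers to.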