{MODEL_NAME}
This is a sentence-transformers model: it maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for tasks such as clustering or semantic search.
Usage (Sentence-Transformers)
Using this model becomes easy when you have sentence-transformers installed:
pip install -U sentence-transformers
Then you can use the model like this:
from sentence_transformers import SentenceTransformer
sentences = ["This is an example sentence", "Each sentence is converted"]
model = SentenceTransformer('{MODEL_NAME}')
embeddings = model.encode(sentences)
print(embeddings)
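For semantic search, embeddings such as the ones printed above are commonly compared by cosine similarity. The following is a minimal NumPy sketch of that comparison, not the library's own utility: the toy 4-dimensional vectors stand in for real 384-dimensional outputs of model.encode, and the helper name cosine_similarity_matrix is an assumption made for this example.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Return pairwise cosine similarities between the rows of a and the rows of b."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Normalize each row to unit length; the clip guards against all-zero rows.
    a_unit = a / np.clip(np.linalg.norm(a, axis=1, keepdims=True), 1e-12, None)
    b_unit = b / np.clip(np.linalg.norm(b, axis=1, keepdims=True), 1e-12, None)
    return a_unit @ b_unit.T

# Toy stand-ins for embeddings (hypothetical values, not model output).
corpus_embeddings = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.7, 0.7, 0.0, 0.0],
])
query_embedding = np.array([[0.9, 0.1, 0.0, 0.0]])

# Score the query against every corpus entry and rank from most to least similar.
scores = cosine_similarity_matrix(query_embedding, corpus_embeddings)[0]
ranking = np.argsort(-scores)
print(ranking, scores)
```

With real data, corpus_embeddings and query_embedding would be replaced by arrays returned from model.encode.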
Usage (HuggingFace Transformers)
Without sentence-transformers, you can use the model like this: first, pass your input through the transformer model, then apply the right pooling operation (mean pooling for this model) on top of the contextualized word embeddings.
from transformers import AutoTokenizer, AutoModel
import torch
# Mean pooling: take the attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
# Sentences we want sentence embeddings for
sentences = ['This is an example sentence', 'Each sentence is converted']
# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
model = AutoModel.from_pretrained('{MODEL_NAME}')
# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)
# Perform pooling. In this case, mean pooling.
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
print("Sentence embeddings:")
print(sentence_embeddings)
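To see why the attention mask matters in mean_pooling, the sketch below repeats the same arithmetic in NumPy on a small hand-made batch. The array values are invented for illustration and the function name masked_mean is hypothetical; it mirrors the PyTorch function above rather than replacing it.

```python
import numpy as np

def masked_mean(token_embeddings, attention_mask):
    """Average token vectors per sentence, counting only positions where the mask is 1."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # (batch, 1)
    return summed / counts

# Two "sentences" of three tokens with 2-dimensional token vectors.
# The third token of the first sentence is padding and holds junk values.
token_embeddings = np.array([
    [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],
    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
])
attention_mask = np.array([
    [1, 1, 0],
    [1, 1, 1],
])

pooled = masked_mean(token_embeddings, attention_mask)
naive = token_embeddings.mean(axis=1)  # ignores the mask, so padding leaks into the average

print(pooled)
print(naive)
```

The masked result for the first sentence averages only its two real tokens, while the naive mean is pulled far off by the padding values.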
Evaluation Results
For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net
Model | Avg | id_raw_acc | vn_raw_acc | br_raw_acc | th_raw_acc | my_raw_acc | ph_raw_acc | sg_raw_acc |
---|---|---|---|---|---|---|---|---|
thtang_ALL_679283 | 66.39 | 72.37 | 61.8 | 56.94 | 65.27 | 69.71 | 69.21 | 69.44 |
thtang_ALL_660924 | 66.44 | 72.63 | 61.74 | 57.22 | 65.44 | 69.77 | 69.06 | 69.23 |
sentence-transformers_sentence-t5-xxl | 44.35 | 50.98 | 18.38 | 36.37 | 16.91 | 59.25 | 64.82 | 63.75 |
sentence-transformers_gtr-t5-xxl | 46.68 | 59.93 | 24.82 | 40.79 | 17.23 | 58.41 | 64.0 | 61.57 |
sentence-transformers_LaBSE | 45.68 | 50.3 | 32.82 | 33.15 | 39.79 | 54.95 | 53.71 | 55.06 |
sentence-transformers_all-MiniLM-L6-v2 | 41.97 | 50.8 | 25.76 | 27.04 | 15.81 | 54.63 | 60.07 | 59.68 |
sentence-transformers_all-mpnet-base-v2 | 40.09 | 46.97 | 23.15 | 24.75 | 16.31 | 52.66 | 59.07 | 57.75 |
sentence-transformers_all-MiniLM-L12-v2 | 41.28 | 48.98 | 24.05 | 25.74 | 16.41 | 54.51 | 60.38 | 58.9 |
sentence-transformers_paraphrase-MiniLM-L6-v2 | 39.12 | 44.92 | 23.59 | 26.12 | 14.23 | 51.84 | 57.14 | 56.03 |
sentence-transformers_paraphrase-mpnet-base-v2 | 39.7 | 46.0 | 20.45 | 26.92 | 14.75 | 52.89 | 58.71 | 58.2 |
sentence-transformers_paraphrase-multilingual-MiniLM-L12-v2 | 43.72 | 44.88 | 28.32 | 29.45 | 36.4 | 53.97 | 56.87 | 56.14 |
sentence-transformers_paraphrase-multilingual-mpnet-base-v2 | 46.12 | 49.03 | 32.58 | 32.82 | 38.43 | 55.3 | 57.36 | 57.34 |
sentence-transformers_all-distilroberta-v1 | 39.46 | 46.74 | 22.34 | 24.06 | 17.59 | 51.49 | 57.54 | 56.45 |
sentence-transformers_distiluse-base-multilingual-cased-v2 | 40.53 | 43.51 | 23.86 | 28.41 | 26.9 | 53.14 | 53.54 | 54.38 |
sentence-transformers_clip-ViT-B-32-multilingual-v1 | 40.82 | 44.45 | 27.34 | 28.0 | 28.25 | 50.3 | 54.05 | 53.39 |
intfloat_e5-large-v2 | 45.07 | 55.1 | 28.06 | 35.95 | 17.16 | 57.16 | 61.21 | 60.84 |
intfloat_e5-small-v2 | 42.84 | 51.41 | 26.82 | 33.04 | 16.3 | 54.97 | 58.66 | 58.68 |
intfloat_e5-large | 45.91 | 55.45 | 28.54 | 36.69 | 18.15 | 57.78 | 62.92 | 61.83 |
intfloat_e5-small | 43.14 | 51.31 | 27.36 | 32.05 | 16.66 | 55.15 | 60.39 | 59.06 |
intfloat_multilingual-e5-large | 49.76 | 52.99 | 42.0 | 33.92 | 47.69 | 55.82 | 57.76 | 58.16 |
intfloat_multilingual-e5-base | 49.57 | 52.06 | 43.21 | 34.17 | 47.41 | 55.28 | 57.38 | 57.45 |
intfloat_multilingual-e5-small | 48.35 | 49.5 | 42.68 | 30.96 | 47.42 | 54.44 | 56.44 | 57.04 |
BAAI_bge-large-en-v1.5 | 43.56 | 49.81 | 25.55 | 30.68 | 17.41 | 56.89 | 62.87 | 61.72 |
BAAI_bge-base-en-v1.5 | 43.42 | 51.73 | 24.3 | 31.51 | 17.53 | 56.21 | 62.37 | 60.25 |
BAAI_bge-small-en-v1.5 | 43.07 | 51.37 | 25.16 | 29.99 | 16.13 | 56.17 | 61.69 | 61.01 |
thenlper_gte-large | 46.31 | 55.1 | 28.16 | 33.96 | 18.73 | 59.5 | 65.19 | 63.52 |
thenlper_gte-base | 45.3 | 55.46 | 27.88 | 32.77 | 17.2 | 58.09 | 63.68 | 62.03 |
llmrails_ember-v1 | 43.79 | 50.85 | 24.76 | 31.02 | 17.2 | 57.62 | 63.06 | 62.04 |
infgrad_stella-base-en-v2 | 44.23 | 52.42 | 26.24 | 30.61 | 18.81 | 56.84 | 63.03 | 61.67 |
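The Avg column appears to be the unweighted mean of the seven *_raw_acc columns. The source does not state this, so the check below is an inference from the numbers; it recomputes the average for the first two rows of the table.

```python
# Per-column accuracies copied from the first two table rows
# (order: id, vn, br, th, my, ph, sg).
rows = {
    "thtang_ALL_679283": [72.37, 61.8, 56.94, 65.27, 69.71, 69.21, 69.44],
    "thtang_ALL_660924": [72.63, 61.74, 57.22, 65.44, 69.77, 69.06, 69.23],
}
# Avg values as reported in the table.
reported_avg = {
    "thtang_ALL_679283": 66.39,
    "thtang_ALL_660924": 66.44,
}

# Unweighted mean of the seven columns, rounded to two decimals like the table.
recomputed = {name: round(sum(vals) / len(vals), 2) for name, vals in rows.items()}

for name in rows:
    print(name, "reported:", reported_avg[name], "recomputed:", recomputed[name])
```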
Training
The model was trained with the parameters:
DataLoader:
torch.utils.data.dataloader.DataLoader of length 1468721 with parameters:
{'batch_size': 160, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
Loss:
sentence_transformers.losses.CosineSimilarityLoss.CosineSimilarityLoss
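In sentence-transformers, CosineSimilarityLoss takes the cosine similarity between the two sentence embeddings of each training pair and, with its default settings, penalizes the squared difference from the pair's gold similarity label. The sketch below restates that idea in NumPy under those assumptions; the function name and the toy vectors are hypothetical and are not taken from the training data.

```python
import numpy as np

def cosine_similarity_loss(emb_a, emb_b, labels):
    """Mean squared error between per-pair cosine similarity and target labels."""
    emb_a = np.asarray(emb_a, dtype=np.float64)
    emb_b = np.asarray(emb_b, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    dot = (emb_a * emb_b).sum(axis=1)
    norms = np.linalg.norm(emb_a, axis=1) * np.linalg.norm(emb_b, axis=1)
    cos = dot / np.clip(norms, 1e-8, None)  # clip guards against zero vectors
    return float(np.mean((cos - labels) ** 2))

# Pair 0 is identical (cosine 1), pair 1 is orthogonal (cosine 0).
emb_a = [[1.0, 0.0], [1.0, 0.0]]
emb_b = [[1.0, 0.0], [0.0, 1.0]]

loss_matching_labels = cosine_similarity_loss(emb_a, emb_b, [1.0, 0.0])
loss_flipped_labels = cosine_similarity_loss(emb_a, emb_b, [0.0, 1.0])
print(loss_matching_labels, loss_flipped_labels)
```

The loss is zero when the cosine similarities already match the labels and grows as they diverge.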
Parameters of the fit()-Method:
{
    "epochs": 1,
    "evaluation_steps": 0,
    "evaluator": "NoneType",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
    "optimizer_params": {
        "lr": 2e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 100,
    "weight_decay": 0.01
}
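With one epoch, steps_per_epoch left as null, and a DataLoader of length 1468721, training would run for roughly 1468721 optimizer steps, of which the first 100 are warmup. The sketch below assumes that WarmupLinear means a linear ramp from 0 to the peak learning rate followed by a linear decay to 0, as in the linear warmup schedulers of the transformers library; the function name warmup_linear_lr is hypothetical.

```python
def warmup_linear_lr(step, peak_lr=2e-5, warmup_steps=100, total_steps=1468721):
    """Learning rate at a given step under linear warmup then linear decay (assumed schedule)."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr over the warmup steps.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

# Sample the schedule at the start, mid-warmup, end of warmup, mid-training, and the end.
for step in (0, 50, 100, 734410, 1468721):
    print(step, warmup_linear_lr(step))
```

Because warmup covers only 100 of about 1.47 million steps, the learning rate spends almost the entire run on the decay segment.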
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
)
Citing & Authors