SentenceTransformer based on distilbert/distilbert-base-multilingual-cased

This is a sentence-transformers model finetuned from distilbert/distilbert-base-multilingual-cased on the default dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DistilBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
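
The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence embedding is the average of the token embeddings. A minimal sketch of that operation in plain Python, assuming padding positions are excluded through an attention mask (the function name mean_pool and the toy 3-dimensional vectors are illustrative, not part of the library):

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of real tokens, skipping padding (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Three token vectors; the last one is padding and must not affect the result.
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [2.0, 3.0, 4.0]
```

In the real model each token vector has 768 dimensions and at most 128 tokens are kept per input (max_seq_length).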

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("AryoshiW/distilbert-en-id-qa")
# Run inference
sentences = [
    '1 Concrete is typically measured by cubic yards (3’x3’x3’). 2  An average cost for a cubic yard of concrete is $75 to $125, depending on how much is needed and local prices. 3  Labor costs to pour and form concrete run somewhere around $3.50 to $7.00 per square foot. An average cost for a cubic yard of concrete is $75 to $125, depending on how much is needed and local prices. 2  Labor costs to pour and form concrete run somewhere around $3.50 to $7.00 per square foot.',
    '1 Beton biasanya diukur dengan meter kubik (3’x3’x3’). 2 Biaya rata-rata untuk satu yard kubik beton adalah $75 sampai $125, tergantung pada berapa banyak yang dibutuhkan dan harga setempat. 3 Biaya tenaga kerja untuk menuangkan dan membentuk beton berkisar antara $3,50 hingga $7,00 per kaki persegi. Biaya rata-rata untuk satu yard kubik beton adalah $75 sampai $125, tergantung pada berapa banyak yang dibutuhkan dan harga lokal. 2 Biaya tenaga kerja untuk menuangkan dan membentuk beton berkisar antara $3,50 hingga $7,00 per kaki persegi.',
    "Parrot Tattoos - Polly ingin cracker.. Ungkapan ini identik dengan 'parrot', terutama yang duduk di bahu bajak laut, seperti yang dibuat terkenal dalam cerita klasik Robert Louis Stevenson, Treasure Island (1883).lint', yang mungkin tidak adalah burung beo pertama yang diasosiasikan dengan nama 'Polly', tapi dia pasti mempopulerkannya. Dan Polly tentu memastikan burung beo itu akan menjadi simbol ikonik dari tradisi bajak laut. Sebagai pendamping legendaris bagi manusia, burung beo menyarankan semacam wali.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
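
The card does not state which similarity function model.similarity uses; Sentence Transformers 3.x is understood to default to cosine similarity unless the model is configured otherwise, so treat that as an assumption. Under it, the scores can be reproduced and used for ranking as in this plain-Python sketch (function names and toy 2-dimensional vectors are illustrative):

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(embeddings: List[List[float]]) -> List[List[float]]:
    """Pairwise cosine similarities; an n x n matrix for n embeddings."""
    return [[cosine(a, b) for b in embeddings] for a in embeddings]


# Toy stand-ins for the three 768-dimensional embeddings produced above.
toy_embeddings = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
scores = similarity_matrix(toy_embeddings)

# For the first sentence, rank the other sentences by similarity.
ranking = sorted(range(1, len(scores)), key=lambda j: scores[0][j], reverse=True)
print(ranking)  # [1, 2]
```

For the example sentences above, the English passage and its Indonesian translation would be expected to score higher with each other than with the unrelated third sentence.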

Evaluation

Metrics

Knowledge Distillation

Metric Value
negative_mse -3.5554
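
The negative_mse value is easier to read with its scaling in mind. The sketch below assumes the convention understood to be used by the Sentence Transformers MSEEvaluator: the mean squared error between student and teacher embeddings is multiplied by 100 and negated, so that higher is better. Under that assumption, -3.5554 corresponds to a raw MSE of about 0.0356. The helper name and toy vectors are illustrative.

```python
from typing import List


def negative_mse(student: List[List[float]], teacher: List[List[float]]) -> float:
    """Negated mean squared error over all elements, scaled by 100 (assumed convention)."""
    squared_errors = [
        (s - t) ** 2
        for s_vec, t_vec in zip(student, teacher)
        for s, t in zip(s_vec, t_vec)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    return -100.0 * mse


# Two toy 2-dimensional embeddings per side instead of real 768-dimensional ones.
student = [[0.5, 0.0], [1.0, 1.0]]
teacher = [[0.0, 0.0], [1.0, 0.0]]
print(negative_mse(student, teacher))  # -31.25
```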

Translation

Metric Value
src2trg_accuracy 0.9894
trg2src_accuracy 0.9861
mean_accuracy 0.9877
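
The translation metrics measure how often a sentence's nearest neighbour in the other language is its actual translation. A minimal sketch, assuming row i of the similarity matrix holds the scores of English sentence i against every Indonesian sentence and that pair i is the correct match (function names and the toy matrix are illustrative):

```python
from typing import List


def src2trg_accuracy(sim: List[List[float]]) -> float:
    """Share of source rows whose highest-scoring target is the aligned one."""
    hits = 0
    for i, row in enumerate(sim):
        best_target = max(range(len(row)), key=row.__getitem__)
        if best_target == i:
            hits += 1
    return hits / len(sim)


def trg2src_accuracy(sim: List[List[float]]) -> float:
    """Same check in the other direction, done on the transposed matrix."""
    transposed = [list(column) for column in zip(*sim)]
    return src2trg_accuracy(transposed)


# Toy 3 x 3 similarity matrix: sim[i][j] scores source i against target j.
sim = [
    [0.9, 0.1, 0.2],
    [0.2, 0.8, 0.3],
    [0.7, 0.2, 0.6],  # source 2 is wrongly closest to target 0
]
forward = src2trg_accuracy(sim)
backward = trg2src_accuracy(sim)
print(forward, backward, (forward + backward) / 2)
```

With the reported values, (0.9894 + 0.9861) / 2 = 0.98775, which agrees with the reported mean_accuracy of 0.9877 up to rounding.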

Training Details

Training Dataset

default

  • Dataset: default at c8bc0cb
  • Size: 1,000,000 training samples
  • Columns: english, indonesian, and label
  • Approximate statistics based on the first 1000 samples:
    • english (type: string): min 4 tokens, mean 44.27 tokens, max 128 tokens
    • indonesian (type: string): min 5 tokens, mean 48.93 tokens, max 128 tokens
    • label (type: list): size 768 elements
  • Samples:
    • Sample 1
      english: This sample job description shares how one smaller sized, growing, multi-site nonprofit organization configured the role of executive director.The executive director is responsible for general management as well as designing a national expansion plan. There also is a heavy emphasis on program evaluation.Feel free to use this sample job description in creating one for your organization.osition. Reporting to the Board of Directors, the Executive Director (ED) will have overall strategic and operational responsibility for XYZ Nonprofit's staff, programs, expansion, and execution of its mission. S/he will initially develop deep knowledge of field, core programs, operations, and business plans.
      indonesian: Uraian tugas contoh ini membagikan bagaimana satu organisasi nirlaba multi-situs berukuran lebih kecil, berkembang, mengonfigurasi peran direktur eksekutif. Direktur eksekutif bertanggung jawab atas manajemen umum serta merancang rencana ekspansi nasional. Ada juga penekanan berat pada evaluasi program. Jangan ragu untuk menggunakan contoh deskripsi pekerjaan ini dalam membuat satu untuk posisi organisasi Anda. Melaporkan kepada Dewan Direksi, Direktur Eksekutif (ED) akan memiliki tanggung jawab strategis dan operasional secara keseluruhan untuk staf, program, ekspansi, dan pelaksanaan misi XYZ Nirlaba. Dia awalnya akan mengembangkan pengetahuan yang mendalam tentang lapangan, program inti, operasi, dan rencana bisnis.
      label: [-0.4337165653705597, -0.0650932714343071, -0.04308838024735451, -0.1756953001022339, 0.32854965329170227, ...]
    • Sample 2
      english: Industrial revolution occured last in Russia. In Germany, France and United States industrial revolution occured in early-to-mid 1800's. While in Russia creation of railroads, and foundation of factories happened by govermental initiatives towards the end of XIX century.n Germany, France and United States industrial revolution occured in early-to-mid 1800's.
      indonesian: Revolusi industri terakhir terjadi di Rusia. Di Jerman, Perancis dan Amerika Serikat terjadi revolusi industri pada awal hingga pertengahan 1800-an. Sedangkan di Rusia pembuatan rel kereta api, dan pendirian pabrik terjadi atas inisiatif pemerintah menjelang akhir abad XIX. Revolusi industri Jerman, Prancis dan Amerika Serikat terjadi pada awal hingga pertengahan 1800-an.
      label: [-0.22887374460697174, -0.17583712935447693, 0.08270637691020966, -0.15496928989887238, -0.18010610342025757, ...]
    • Sample 3
      english: what causes hordeolum internum left lower eyelid
      indonesian: apa penyebab hordeolum internum kelopak mata kiri bawah
      label: [-0.19872592389583588, 0.4119395911693573, 0.3756648004055023, -0.4884617030620575, 0.15375499427318573, ...]
  • Loss: MSELoss
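
Each training row pairs an English sentence and its Indonesian translation with a 768-element label. The label is presumably a teacher model's embedding of the English sentence, as in the knowledge-distillation setup cited for MSELoss; the teacher itself is not named in this card. A minimal sketch of the objective, assuming the student's embeddings of both sentences are each pulled toward the same label with mean squared error (toy vectors and function names are illustrative):

```python
from typing import List


def mse(prediction: List[float], target: List[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)


def distillation_loss(english_emb: List[float], indonesian_emb: List[float], label: List[float]) -> float:
    """Average the MSE of both student embeddings against the shared teacher label."""
    return (mse(english_emb, label) + mse(indonesian_emb, label)) / 2


# Toy 4-dimensional vectors standing in for 768-dimensional embeddings.
label = [0.0, 1.0, 0.0, 1.0]            # assumed teacher embedding of the English sentence
english_emb = [0.0, 1.0, 0.0, 1.0]      # student already matches the teacher here
indonesian_emb = [0.0, 0.0, 0.0, 1.0]   # student is off in one dimension
print(distillation_loss(english_emb, indonesian_emb, label))  # 0.125
```

Minimizing this loss moves the Indonesian embedding toward the same point as the English one, which is what makes cross-lingual retrieval work.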

Evaluation Dataset

default

  • Dataset: default at c8bc0cb
  • Size: 1,000,000 evaluation samples
  • Columns: english, indonesian, and label
  • Approximate statistics based on the first 1000 samples:
    • english (type: string): min 5 tokens, mean 46.58 tokens, max 128 tokens
    • indonesian (type: string): min 5 tokens, mean 51.0 tokens, max 128 tokens
    • label (type: list): size 768 elements
  • Samples:
    • Sample 1
      english: do appraisers give adjustments for lot size
      indonesian: apakah penilai memberikan penyesuaian untuk ukuran lot?
      label: [0.12256570905447006, 0.011573846451938152, -0.19426874816417694, -0.17596185207366943, 0.35024771094322205, ...]
    • Sample 2
      english: hotels in binghamton ny
      indonesian: hotel di binghamton ny
      label: [0.14259624481201172, -0.048470016568899155, 0.1078888401389122, 0.06728225946426392, 0.6096671223640442, ...]
    • Sample 3
      english: guitarist kenny greenberg
      indonesian: gitaris kenny greenberg
      label: [-0.6973275542259216, 0.27737292647361755, -0.09295299649238586, 0.24035970866680145, 0.154855415225029, ...]
  • Loss: MSELoss

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • learning_rate: 2e-05
  • num_train_epochs: 5
  • warmup_ratio: 0.1
  • fp16: True
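
With lr_scheduler_type linear (listed under All Hyperparameters) and warmup_ratio 0.1, the learning rate ramps up over the first 10% of steps and then decays linearly to zero. A sketch of that schedule in plain Python, assuming 77,345 total optimizer steps (the final step in the training logs) and a warmup length rounded up from the ratio; the function name is illustrative:

```python
import math


def linear_schedule_lr(step: int, total_steps: int, warmup_ratio: float, peak_lr: float) -> float:
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 77345   # last step in the training logs (5 epochs)
PEAK_LR = 2e-05       # learning_rate from the list above
WARMUP_RATIO = 0.1    # warmup_ratio from the list above

warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)
print(warmup_steps)  # 7735
for step in (0, warmup_steps, TOTAL_STEPS):
    print(step, linear_schedule_lr(step, TOTAL_STEPS, WARMUP_RATIO, PEAK_LR))
```

Under these assumptions the peak rate of 2e-05 is reached roughly halfway through the first epoch, which has 15,469 steps.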

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss default loss default_negative_mse default_mean_accuracy
0.0065 100 0.1968 - - -
0.0129 200 0.1797 - - -
0.0194 300 0.1596 - - -
0.0259 400 0.1367 - - -
0.0323 500 0.1167 - - -
0.0388 600 0.103 - - -
0.0453 700 0.0954 - - -
0.0517 800 0.0909 - - -
0.0582 900 0.088 - - -
0.0646 1000 0.0861 - - -
0.0711 1100 0.0847 - - -
0.0776 1200 0.082 - - -
0.0840 1300 0.0818 - - -
0.0905 1400 0.0813 - - -
0.0970 1500 0.0804 - - -
0.1034 1600 0.0817 - - -
0.1099 1700 0.0799 - - -
0.1164 1800 0.0804 - - -
0.1228 1900 0.0802 - - -
0.1293 2000 0.0791 - - -
0.1358 2100 0.0789 - - -
0.1422 2200 0.0783 - - -
0.1487 2300 0.0783 - - -
0.1551 2400 0.077 - - -
0.1616 2500 0.0762 - - -
0.1681 2600 0.0762 - - -
0.1745 2700 0.0754 - - -
0.1810 2800 0.075 - - -
0.1875 2900 0.0735 - - -
0.1939 3000 0.0745 - - -
0.2004 3100 0.0739 - - -
0.2069 3200 0.0732 - - -
0.2133 3300 0.0724 - - -
0.2198 3400 0.0727 - - -
0.2263 3500 0.0726 - - -
0.2327 3600 0.071 - - -
0.2392 3700 0.0713 - - -
0.2457 3800 0.0708 - - -
0.2521 3900 0.0704 - - -
0.2586 4000 0.0703 - - -
0.2650 4100 0.0704 - - -
0.2715 4200 0.0695 - - -
0.2780 4300 0.068 - - -
0.2844 4400 0.0681 - - -
0.2909 4500 0.0683 - - -
0.2974 4600 0.0674 - - -
0.3038 4700 0.0683 - - -
0.3103 4800 0.0674 - - -
0.3168 4900 0.0674 - - -
0.3232 5000 0.0666 - - -
0.3297 5100 0.0677 - - -
0.3362 5200 0.066 - - -
0.3426 5300 0.0655 - - -
0.3491 5400 0.0658 - - -
0.3555 5500 0.0658 - - -
0.3620 5600 0.0646 - - -
0.3685 5700 0.0638 - - -
0.3749 5800 0.065 - - -
0.3814 5900 0.0648 - - -
0.3879 6000 0.0636 - - -
0.3943 6100 0.0637 - - -
0.4008 6200 0.0636 - - -
0.4073 6300 0.0633 - - -
0.4137 6400 0.0629 - - -
0.4202 6500 0.0638 - - -
0.4267 6600 0.0625 - - -
0.4331 6700 0.0615 - - -
0.4396 6800 0.062 - - -
0.4461 6900 0.062 - - -
0.4525 7000 0.0614 - - -
0.4590 7100 0.0622 - - -
0.4654 7200 0.061 - - -
0.4719 7300 0.06 - - -
0.4784 7400 0.0606 - - -
0.4848 7500 0.0606 - - -
0.4913 7600 0.0597 - - -
0.4978 7700 0.0598 - - -
0.5042 7800 0.0594 - - -
0.5107 7900 0.0596 - - -
0.5172 8000 0.0584 - - -
0.5236 8100 0.0589 - - -
0.5301 8200 0.0587 - - -
0.5366 8300 0.059 - - -
0.5430 8400 0.0592 - - -
0.5495 8500 0.058 - - -
0.5560 8600 0.0576 - - -
0.5624 8700 0.0577 - - -
0.5689 8800 0.0575 - - -
0.5753 8900 0.0576 - - -
0.5818 9000 0.0575 - - -
0.5883 9100 0.0567 - - -
0.5947 9200 0.0568 - - -
0.6012 9300 0.0558 - - -
0.6077 9400 0.0558 - - -
0.6141 9500 0.0563 - - -
0.6206 9600 0.0565 - - -
0.6271 9700 0.0547 - - -
0.6335 9800 0.0555 - - -
0.6400 9900 0.0551 - - -
0.6465 10000 0.055 - - -
0.6529 10100 0.0553 - - -
0.6594 10200 0.0548 - - -
0.6658 10300 0.0542 - - -
0.6723 10400 0.0551 - - -
0.6788 10500 0.0545 - - -
0.6852 10600 0.0545 - - -
0.6917 10700 0.0542 - - -
0.6982 10800 0.0538 - - -
0.7046 10900 0.0532 - - -
0.7111 11000 0.0534 - - -
0.7176 11100 0.053 - - -
0.7240 11200 0.0534 - - -
0.7305 11300 0.0532 - - -
0.7370 11400 0.0535 - - -
0.7434 11500 0.0533 - - -
0.7499 11600 0.0532 - - -
0.7564 11700 0.053 - - -
0.7628 11800 0.0526 - - -
0.7693 11900 0.0527 - - -
0.7757 12000 0.053 - - -
0.7822 12100 0.0522 - - -
0.7887 12200 0.0521 - - -
0.7951 12300 0.0524 - - -
0.8016 12400 0.0518 - - -
0.8081 12500 0.0521 - - -
0.8145 12600 0.0516 - - -
0.8210 12700 0.0517 - - -
0.8275 12800 0.0511 - - -
0.8339 12900 0.0517 - - -
0.8404 13000 0.0516 - - -
0.8469 13100 0.0516 - - -
0.8533 13200 0.0509 - - -
0.8598 13300 0.0508 - - -
0.8662 13400 0.0506 - - -
0.8727 13500 0.0507 - - -
0.8792 13600 0.0507 - - -
0.8856 13700 0.0503 - - -
0.8921 13800 0.0504 - - -
0.8986 13900 0.0506 - - -
0.9050 14000 0.0507 - - -
0.9115 14100 0.0503 - - -
0.9180 14200 0.0496 - - -
0.9244 14300 0.0498 - - -
0.9309 14400 0.0499 - - -
0.9374 14500 0.0504 - - -
0.9438 14600 0.0493 - - -
0.9503 14700 0.0495 - - -
0.9568 14800 0.0493 - - -
0.9632 14900 0.0494 - - -
0.9697 15000 0.0495 - - -
0.9761 15100 0.0496 - - -
0.9826 15200 0.0486 - - -
0.9891 15300 0.0491 - - -
0.9955 15400 0.0485 - - -
1.0 15469 - 0.0479 -4.9594 0.9693
1.0020 15500 0.0487 - - -
1.0085 15600 0.0488 - - -
1.0149 15700 0.0482 - - -
1.0214 15800 0.0486 - - -
1.0279 15900 0.0487 - - -
1.0343 16000 0.0487 - - -
1.0408 16100 0.0484 - - -
1.0473 16200 0.0478 - - -
1.0537 16300 0.0478 - - -
1.0602 16400 0.048 - - -
1.0666 16500 0.048 - - -
1.0731 16600 0.048 - - -
1.0796 16700 0.0478 - - -
1.0860 16800 0.0478 - - -
1.0925 16900 0.0478 - - -
1.0990 17000 0.0472 - - -
1.1054 17100 0.048 - - -
1.1119 17200 0.047 - - -
1.1184 17300 0.0477 - - -
1.1248 17400 0.0476 - - -
1.1313 17500 0.0473 - - -
1.1378 17600 0.0474 - - -
1.1442 17700 0.0472 - - -
1.1507 17800 0.0473 - - -
1.1572 17900 0.0468 - - -
1.1636 18000 0.047 - - -
1.1701 18100 0.0471 - - -
1.1765 18200 0.0467 - - -
1.1830 18300 0.0464 - - -
1.1895 18400 0.0463 - - -
1.1959 18500 0.047 - - -
1.2024 18600 0.0463 - - -
1.2089 18700 0.0466 - - -
1.2153 18800 0.0458 - - -
1.2218 18900 0.0465 - - -
1.2283 19000 0.0466 - - -
1.2347 19100 0.0459 - - -
1.2412 19200 0.0464 - - -
1.2477 19300 0.0457 - - -
1.2541 19400 0.0459 - - -
1.2606 19500 0.0463 - - -
1.2671 19600 0.0458 - - -
1.2735 19700 0.0463 - - -
1.2800 19800 0.0449 - - -
1.2864 19900 0.0455 - - -
1.2929 20000 0.0457 - - -
1.2994 20100 0.0455 - - -
1.3058 20200 0.0456 - - -
1.3123 20300 0.0453 - - -
1.3188 20400 0.0453 - - -
1.3252 20500 0.0454 - - -
1.3317 20600 0.0458 - - -
1.3382 20700 0.0449 - - -
1.3446 20800 0.0449 - - -
1.3511 20900 0.0454 - - -
1.3576 21000 0.0448 - - -
1.3640 21100 0.0445 - - -
1.3705 21200 0.0445 - - -
1.3769 21300 0.045 - - -
1.3834 21400 0.0448 - - -
1.3899 21500 0.0444 - - -
1.3963 21600 0.0446 - - -
1.4028 21700 0.0446 - - -
1.4093 21800 0.0444 - - -
1.4157 21900 0.0449 - - -
1.4222 22000 0.0447 - - -
1.4287 22100 0.044 - - -
1.4351 22200 0.0444 - - -
1.4416 22300 0.044 - - -
1.4481 22400 0.0443 - - -
1.4545 22500 0.0443 - - -
1.4610 22600 0.0445 - - -
1.4675 22700 0.0436 - - -
1.4739 22800 0.0438 - - -
1.4804 22900 0.0441 - - -
1.4868 23000 0.0437 - - -
1.4933 23100 0.0434 - - -
1.4998 23200 0.0437 - - -
1.5062 23300 0.0435 - - -
1.5127 23400 0.0437 - - -
1.5192 23500 0.043 - - -
1.5256 23600 0.0434 - - -
1.5321 23700 0.0436 - - -
1.5386 23800 0.0439 - - -
1.5450 23900 0.0438 - - -
1.5515 24000 0.0433 - - -
1.5580 24100 0.0429 - - -
1.5644 24200 0.0433 - - -
1.5709 24300 0.0428 - - -
1.5773 24400 0.0434 - - -
1.5838 24500 0.0432 - - -
1.5903 24600 0.0433 - - -
1.5967 24700 0.0426 - - -
1.6032 24800 0.0426 - - -
1.6097 24900 0.0425 - - -
1.6161 25000 0.0432 - - -
1.6226 25100 0.043 - - -
1.6291 25200 0.042 - - -
1.6355 25300 0.0427 - - -
1.6420 25400 0.0425 - - -
1.6485 25500 0.0422 - - -
1.6549 25600 0.0428 - - -
1.6614 25700 0.0423 - - -
1.6679 25800 0.0422 - - -
1.6743 25900 0.0425 - - -
1.6808 26000 0.0424 - - -
1.6872 26100 0.0426 - - -
1.6937 26200 0.0422 - - -
1.7002 26300 0.0419 - - -
1.7066 26400 0.0416 - - -
1.7131 26500 0.0421 - - -
1.7196 26600 0.0416 - - -
1.7260 26700 0.0422 - - -
1.7325 26800 0.0418 - - -
1.7390 26900 0.0425 - - -
1.7454 27000 0.0421 - - -
1.7519 27100 0.0421 - - -
1.7584 27200 0.0418 - - -
1.7648 27300 0.042 - - -
1.7713 27400 0.0419 - - -
1.7777 27500 0.0423 - - -
1.7842 27600 0.0415 - - -
1.7907 27700 0.0413 - - -
1.7971 27800 0.0423 - - -
1.8036 27900 0.0413 - - -
1.8101 28000 0.0414 - - -
1.8165 28100 0.0418 - - -
1.8230 28200 0.0414 - - -
1.8295 28300 0.0411 - - -
1.8359 28400 0.0418 - - -
1.8424 28500 0.0416 - - -
1.8489 28600 0.0417 - - -
1.8553 28700 0.041 - - -
1.8618 28800 0.0413 - - -
1.8683 28900 0.0409 - - -
1.8747 29000 0.0413 - - -
1.8812 29100 0.0413 - - -
1.8876 29200 0.0411 - - -
1.8941 29300 0.0408 - - -
1.9006 29400 0.0415 - - -
1.9070 29500 0.0415 - - -
1.9135 29600 0.0408 - - -
1.9200 29700 0.0407 - - -
1.9264 29800 0.0409 - - -
1.9329 29900 0.0414 - - -
1.9394 30000 0.0409 - - -
1.9458 30100 0.0407 - - -
1.9523 30200 0.0404 - - -
1.9588 30300 0.0408 - - -
1.9652 30400 0.0409 - - -
1.9717 30500 0.0409 - - -
1.9781 30600 0.0408 - - -
1.9846 30700 0.0403 - - -
1.9911 30800 0.0403 - - -
1.9975 30900 0.0405 - - -
2.0 30938 - 0.0394 -4.1528 0.9835
2.0040 31000 0.0407 - - -
2.0105 31100 0.0403 - - -
2.0169 31200 0.0401 - - -
2.0234 31300 0.0404 - - -
2.0299 31400 0.0406 - - -
2.0363 31500 0.0408 - - -
2.0428 31600 0.0402 - - -
2.0493 31700 0.0402 - - -
2.0557 31800 0.0398 - - -
2.0622 31900 0.0403 - - -
2.0687 32000 0.0401 - - -
2.0751 32100 0.0405 - - -
2.0816 32200 0.0401 - - -
2.0880 32300 0.04 - - -
2.0945 32400 0.0399 - - -
2.1010 32500 0.0398 - - -
2.1074 32600 0.0406 - - -
2.1139 32700 0.0397 - - -
2.1204 32800 0.0403 - - -
2.1268 32900 0.0399 - - -
2.1333 33000 0.0401 - - -
2.1398 33100 0.0401 - - -
2.1462 33200 0.0403 - - -
2.1527 33300 0.0399 - - -
2.1592 33400 0.0398 - - -
2.1656 33500 0.0399 - - -
2.1721 33600 0.0398 - - -
2.1786 33700 0.0395 - - -
2.1850 33800 0.0395 - - -
2.1915 33900 0.0396 - - -
2.1979 34000 0.0399 - - -
2.2044 34100 0.0398 - - -
2.2109 34200 0.0393 - - -
2.2173 34300 0.0393 - - -
2.2238 34400 0.0399 - - -
2.2303 34500 0.0393 - - -
2.2367 34600 0.0398 - - -
2.2432 34700 0.0394 - - -
2.2497 34800 0.0392 - - -
2.2561 34900 0.0397 - - -
2.2626 35000 0.0399 - - -
2.2691 35100 0.0393 - - -
2.2755 35200 0.0394 - - -
2.2820 35300 0.0389 - - -
2.2884 35400 0.0392 - - -
2.2949 35500 0.0393 - - -
2.3014 35600 0.0393 - - -
2.3078 35700 0.0393 - - -
2.3143 35800 0.0391 - - -
2.3208 35900 0.0389 - - -
2.3272 36000 0.0398 - - -
2.3337 36100 0.0394 - - -
2.3402 36200 0.0389 - - -
2.3466 36300 0.0388 - - -
2.3531 36400 0.0392 - - -
2.3596 36500 0.0386 - - -
2.3660 36600 0.039 - - -
2.3725 36700 0.0387 - - -
2.3790 36800 0.0391 - - -
2.3854 36900 0.0389 - - -
2.3919 37000 0.0389 - - -
2.3983 37100 0.0387 - - -
2.4048 37200 0.0388 - - -
2.4113 37300 0.0387 - - -
2.4177 37400 0.0391 - - -
2.4242 37500 0.039 - - -
2.4307 37600 0.0384 - - -
2.4371 37700 0.0388 - - -
2.4436 37800 0.0385 - - -
2.4501 37900 0.0388 - - -
2.4565 38000 0.039 - - -
2.4630 38100 0.0387 - - -
2.4695 38200 0.0382 - - -
2.4759 38300 0.0384 - - -
2.4824 38400 0.0388 - - -
2.4888 38500 0.0381 - - -
2.4953 38600 0.0384 - - -
2.5018 38700 0.0384 - - -
2.5082 38800 0.0383 - - -
2.5147 38900 0.0382 - - -
2.5212 39000 0.0381 - - -
2.5276 39100 0.0382 - - -
2.5341 39200 0.0384 - - -
2.5406 39300 0.0387 - - -
2.5470 39400 0.0384 - - -
2.5535 39500 0.0381 - - -
2.5600 39600 0.038 - - -
2.5664 39700 0.0384 - - -
2.5729 39800 0.0379 - - -
2.5794 39900 0.0385 - - -
2.5858 40000 0.0381 - - -
2.5923 40100 0.0382 - - -
2.5987 40200 0.0377 - - -
2.6052 40300 0.0375 - - -
2.6117 40400 0.038 - - -
2.6181 40500 0.0384 - - -
2.6246 40600 0.0378 - - -
2.6311 40700 0.0379 - - -
2.6375 40800 0.0376 - - -
2.6440 40900 0.0378 - - -
2.6505 41000 0.0376 - - -
2.6569 41100 0.0381 - - -
2.6634 41200 0.0374 - - -
2.6699 41300 0.0377 - - -
2.6763 41400 0.038 - - -
2.6828 41500 0.0377 - - -
2.6892 41600 0.0379 - - -
2.6957 41700 0.0377 - - -
2.7022 41800 0.0373 - - -
2.7086 41900 0.0374 - - -
2.7151 42000 0.0373 - - -
2.7216 42100 0.0374 - - -
2.7280 42200 0.0375 - - -
2.7345 42300 0.0375 - - -
2.7410 42400 0.0379 - - -
2.7474 42500 0.0379 - - -
2.7539 42600 0.0378 - - -
2.7604 42700 0.0375 - - -
2.7668 42800 0.0375 - - -
2.7733 42900 0.0377 - - -
2.7798 43000 0.0378 - - -
2.7862 43100 0.0372 - - -
2.7927 43200 0.0374 - - -
2.7991 43300 0.0376 - - -
2.8056 43400 0.0374 - - -
2.8121 43500 0.0371 - - -
2.8185 43600 0.0377 - - -
2.8250 43700 0.0368 - - -
2.8315 43800 0.0376 - - -
2.8379 43900 0.0374 - - -
2.8444 44000 0.0378 - - -
2.8509 44100 0.0375 - - -
2.8573 44200 0.0371 - - -
2.8638 44300 0.037 - - -
2.8703 44400 0.0371 - - -
2.8767 44500 0.0374 - - -
2.8832 44600 0.037 - - -
2.8897 44700 0.0374 - - -
2.8961 44800 0.0368 - - -
2.9026 44900 0.0377 - - -
2.9090 45000 0.0375 - - -
2.9155 45100 0.0367 - - -
2.9220 45200 0.0368 - - -
2.9284 45300 0.0372 - - -
2.9349 45400 0.0374 - - -
2.9414 45500 0.0367 - - -
2.9478 45600 0.037 - - -
2.9543 45700 0.0368 - - -
2.9608 45800 0.0367 - - -
2.9672 45900 0.0372 - - -
2.9737 46000 0.0375 - - -
2.9802 46100 0.0368 - - -
2.9866 46200 0.0368 - - -
2.9931 46300 0.0367 - - -
2.9995 46400 0.0366 - - -
3.0 46407 - 0.0357 -3.7998 0.9869
3.0060 46500 0.0372 - - -
3.0125 46600 0.0365 - - -
3.0189 46700 0.0369 - - -
3.0254 46800 0.0368 - - -
3.0319 46900 0.037 - - -
3.0383 47000 0.037 - - -
3.0448 47100 0.0367 - - -
3.0513 47200 0.0364 - - -
3.0577 47300 0.0366 - - -
3.0642 47400 0.0366 - - -
3.0707 47500 0.0371 - - -
3.0771 47600 0.0367 - - -
3.0836 47700 0.0368 - - -
3.0901 47800 0.0366 - - -
3.0965 47900 0.0362 - - -
3.1030 48000 0.0368 - - -
3.1094 48100 0.0366 - - -
3.1159 48200 0.0367 - - -
3.1224 48300 0.0369 - - -
3.1288 48400 0.0366 - - -
3.1353 48500 0.0366 - - -
3.1418 48600 0.0367 - - -
3.1482 48700 0.037 - - -
3.1547 48800 0.0367 - - -
3.1612 48900 0.0362 - - -
3.1676 49000 0.0367 - - -
3.1741 49100 0.0365 - - -
3.1806 49200 0.0363 - - -
3.1870 49300 0.036 - - -
3.1935 49400 0.0366 - - -
3.1999 49500 0.0366 - - -
3.2064 49600 0.0366 - - -
3.2129 49700 0.0361 - - -
3.2193 49800 0.0365 - - -
3.2258 49900 0.0365 - - -
3.2323 50000 0.0361 - - -
3.2387 50100 0.0365 - - -
3.2452 50200 0.0363 - - -
3.2517 50300 0.0362 - - -
3.2581 50400 0.0366 - - -
3.2646 50500 0.0366 - - -
3.2711 50600 0.0367 - - -
3.2775 50700 0.0361 - - -
3.2840 50800 0.0359 - - -
3.2905 50900 0.0363 - - -
3.2969 51000 0.0361 - - -
3.3034 51100 0.0364 - - -
3.3098 51200 0.0363 - - -
3.3163 51300 0.0362 - - -
3.3228 51400 0.0359 - - -
3.3292 51500 0.0368 - - -
3.3357 51600 0.0361 - - -
3.3422 51700 0.0359 - - -
3.3486 51800 0.0362 - - -
3.3551 51900 0.0363 - - -
3.3616 52000 0.0357 - - -
3.3680 52100 0.0358 - - -
3.3745 52200 0.036 - - -
3.3810 52300 0.0365 - - -
3.3874 52400 0.0359 - - -
3.3939 52500 0.0359 - - -
3.4003 52600 0.0362 - - -
3.4068 52700 0.0358 - - -
3.4133 52800 0.036 - - -
3.4197 52900 0.0366 - - -
3.4262 53000 0.036 - - -
3.4327 53100 0.0357 - - -
3.4391 53200 0.036 - - -
3.4456 53300 0.036 - - -
3.4521 53400 0.036 - - -
3.4585 53500 0.0364 - - -
3.4650 53600 0.0359 - - -
3.4715 53700 0.0354 - - -
3.4779 53800 0.0359 - - -
3.4844 53900 0.036 - - -
3.4909 54000 0.0355 - - -
3.4973 54100 0.0358 - - -
3.5038 54200 0.0355 - - -
3.5102 54300 0.036 - - -
3.5167 54400 0.0354 - - -
3.5232 54500 0.0357 - - -
3.5296 54600 0.0356 - - -
3.5361 54700 0.036 - - -
3.5426 54800 0.036 - - -
3.5490 54900 0.0358 - - -
3.5555 55000 0.0356 - - -
3.5620 55100 0.0357 - - -
3.5684 55200 0.0356 - - -
3.5749 55300 0.0358 - - -
3.5814 55400 0.036 - - -
3.5878 55500 0.0356 - - -
3.5943 55600 0.0358 - - -
3.6007 55700 0.0351 - - -
3.6072 55800 0.0352 - - -
3.6137 55900 0.0357 - - -
3.6201 56000 0.0359 - - -
3.6266 56100 0.035 - - -
3.6331 56200 0.0357 - - -
3.6395 56300 0.0354 - - -
3.6460 56400 0.0352 - - -
3.6525 56500 0.0356 - - -
3.6589 56600 0.0356 - - -
3.6654 56700 0.0349 - - -
3.6719 56800 0.0358 - - -
3.6783 56900 0.0355 - - -
3.6848 57000 0.0353 - - -
3.6913 57100 0.0355 - - -
3.6977 57200 0.0353 - - -
3.7042 57300 0.035 - - -
3.7106 57400 0.0351 - - -
3.7171 57500 0.035 - - -
3.7236 57600 0.0353 - - -
3.7300 57700 0.0353 - - -
3.7365 57800 0.0356 - - -
3.7430 57900 0.0356 - - -
3.7494 58000 0.0355 - - -
3.7559 58100 0.0355 - - -
3.7624 58200 0.0354 - - -
3.7688 58300 0.0353 - - -
3.7753 58400 0.0357 - - -
3.7818 58500 0.0353 - - -
3.7882 58600 0.035 - - -
3.7947 58700 0.0355 - - -
3.8012 58800 0.035 - - -
3.8076 58900 0.0355 - - -
3.8141 59000 0.0351 - - -
3.8205 59100 0.0353 - - -
3.8270 59200 0.0349 - - -
3.8335 59300 0.0355 - - -
3.8399 59400 0.0353 - - -
3.8464 59500 0.0357 - - -
3.8529 59600 0.0351 - - -
3.8593 59700 0.0351 - - -
3.8658 59800 0.0352 - - -
3.8723 59900 0.035 - - -
3.8787 60000 0.0353 - - -
3.8852 60100 0.0351 - - -
3.8917 60200 0.0352 - - -
3.8981 60300 0.0351 - - -
3.9046 60400 0.0356 - - -
3.9110 60500 0.0352 - - -
3.9175 60600 0.0347 - - -
3.9240 60700 0.035 - - -
3.9304 60800 0.0352 - - -
3.9369 60900 0.0356 - - -
3.9434 61000 0.0346 - - -
3.9498 61100 0.0352 - - -
3.9563 61200 0.0349 - - -
3.9628 61300 0.0349 - - -
3.9692 61400 0.0354 - - -
3.9757 61500 0.0354 - - -
3.9822 61600 0.0348 - - -
3.9886 61700 0.0349 - - -
3.9951 61800 0.0347 - - -
4.0 61876 - 0.0339 -3.6284 0.9876
4.0016 61900 0.0351 - - -
4.0080 62000 0.035 - - -
4.0145 62100 0.0348 - - -
4.0209 62200 0.0349 - - -
4.0274 62300 0.0352 - - -
4.0339 62400 0.0351 - - -
4.0403 62500 0.0352 - - -
4.0468 62600 0.0347 - - -
4.0533 62700 0.0347 - - -
4.0597 62800 0.0348 - - -
4.0662 62900 0.035 - - -
4.0727 63000 0.035 - - -
4.0791 63100 0.0349 - - -
4.0856 63200 0.035 - - -
4.0921 63300 0.0349 - - -
4.0985 63400 0.0346 - - -
4.1050 63500 0.035 - - -
4.1114 63600 0.0347 - - -
4.1179 63700 0.0351 - - -
4.1244 63800 0.0351 - - -
4.1308 63900 0.035 - - -
4.1373 64000 0.0349 - - -
4.1438 64100 0.0352 - - -
4.1502 64200 0.0351 - - -
4.1567 64300 0.0348 - - -
4.1632 64400 0.0347 - - -
4.1696 64500 0.0352 - - -
4.1761 64600 0.0346 - - -
4.1826 64700 0.0345 - - -
4.1890 64800 0.0346 - - -
4.1955 64900 0.0351 - - -
4.2020 65000 0.0348 - - -
4.2084 65100 0.035 - - -
4.2149 65200 0.0345 - - -
4.2213 65300 0.0349 - - -
4.2278 65400 0.0351 - - -
4.2343 65500 0.0345 - - -
4.2407 65600 0.035 - - -
4.2472 65700 0.0346 - - -
4.2537 65800 0.0347 - - -
4.2601 65900 0.0351 - - -
4.2666 66000 0.0347 - - -
4.2731 66100 0.0354 - - -
4.2795 66200 0.0342 - - -
4.2860 66300 0.0345 - - -
4.2925 66400 0.0349 - - -
4.2989 66500 0.0347 - - -
4.3054 66600 0.0347 - - -
4.3118 66700 0.0348 - - -
4.3183 66800 0.0347 - - -
4.3248 66900 0.0346 - - -
4.3312 67000 0.0353 - - -
4.3377 67100 0.0345 - - -
4.3442 67200 0.0343 - - -
4.3506 67300 0.035 - - -
4.3571 67400 0.0346 - - -
4.3636 67500 0.0343 - - -
4.3700 67600 0.0344 - - -
4.3765 67700 0.0348 - - -
4.3830 67800 0.0348 - - -
4.3894 67900 0.0345 - - -
4.3959 68000 0.0347 - - -
4.4024 68100 0.0345 - - -
4.4088 68200 0.0346 - - -
4.4153 68300 0.0349 - - -
4.4217 68400 0.0349 - - -
4.4282 68500 0.0345 - - -
4.4347 68600 0.0346 - - -
4.4411 68700 0.0345 - - -
4.4476 68800 0.0347 - - -
4.4541 68900 0.0346 - - -
4.4605 69000 0.035 - - -
4.4670 69100 0.0343 - - -
4.4735 69200 0.0346 - - -
4.4799 69300 0.0346 - - -
4.4864 69400 0.0346 - - -
4.4929 69500 0.0342 - - -
4.4993 69600 0.0346 - - -
4.5058 69700 0.0342 - - -
4.5123 69800 0.0348 - - -
4.5187 69900 0.0341 - - -
4.5252 70000 0.0344 - - -
4.5316 70100 0.0345 - - -
4.5381 70200 0.0348 - - -
4.5446 70300 0.0349 - - -
4.5510 70400 0.0344 - - -
4.5575 70500 0.0342 - - -
4.5640 70600 0.0346 - - -
4.5704 70700 0.0342 - - -
4.5769 70800 0.0345 - - -
4.5834 70900 0.0347 - - -
4.5898 71000 0.0345 - - -
4.5963 71100 0.0343 - - -
4.6028 71200 0.0341 - - -
4.6092 71300 0.0341 - - -
4.6157 71400 0.0347 - - -
4.6221 71500 0.0347 - - -
4.6286 71600 0.0339 - - -
4.6351 71700 0.0344 - - -
4.6415 71800 0.0342 - - -
4.6480 71900 0.0342 - - -
4.6545 72000 0.0346 - - -
4.6609 72100 0.0342 - - -
4.6674 72200 0.0341 - - -
4.6739 72300 0.0344 - - -
4.6803 72400 0.0345 - - -
4.6868 72500 0.0345 - - -
4.6933 72600 0.0342 - - -
4.6997 72700 0.0341 - - -
4.7062 72800 0.034 - - -
4.7127 72900 0.0343 - - -
4.7191 73000 0.0337 - - -
4.7256 73100 0.0343 - - -
4.7320 73200 0.0343 - - -
4.7385 73300 0.0346 - - -
4.7450 73400 0.0346 - - -
4.7514 73500 0.0345 - - -
4.7579 73600 0.0343 - - -
4.7644 73700 0.0344 - - -
4.7708 73800 0.0345 - - -
4.7773 73900 0.0347 - - -
4.7838 74000 0.034 - - -
4.7902 74100 0.034 - - -
4.7967 74200 0.0348 - - -
4.8032 74300 0.0338 - - -
4.8096 74400 0.0346 - - -
4.8161 74500 0.0344 - - -
4.8225 74600 0.0342 - - -
4.8290 74700 0.034 - - -
4.8355 74800 0.0346 - - -
4.8419 74900 0.0346 - - -
4.8484 75000 0.0346 - - -
4.8549 75100 0.034 - - -
4.8613 75200 0.0343 - - -
4.8678 75300 0.034 - - -
4.8743 75400 0.0344 - - -
4.8807 75500 0.0344 - - -
4.8872 75600 0.0342 - - -
4.8937 75700 0.0341 - - -
4.9001 75800 0.0345 - - -
4.9066 75900 0.0347 - - -
4.9131 76000 0.0341 - - -
4.9195 76100 0.0339 - - -
4.9260 76200 0.0343 - - -
4.9324 76300 0.0346 - - -
4.9389 76400 0.0344 - - -
4.9454 76500 0.0341 - - -
4.9518 76600 0.034 - - -
4.9583 76700 0.0342 - - -
4.9648 76800 0.0344 - - -
4.9712 76900 0.0344 - - -
4.9777 77000 0.0343 - - -
4.9842 77100 0.0341 - - -
4.9906 77200 0.0341 - - -
4.9971 77300 0.0342 - - -
5.0 77345 - 0.0331 -3.5554 0.9877
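
Only the rows logged at the end of each epoch carry evaluation values; every other row has a training loss and dashes. A small sketch for pulling those evaluation rows out of the whitespace-separated log, assuming the six-column layout of the header above (the function name, the shortened column names, and the abbreviated sample are illustrative):

```python
from typing import Dict, List

COLUMNS = ["epoch", "step", "training_loss", "eval_loss", "negative_mse", "mean_accuracy"]


def parse_eval_rows(log_text: str) -> List[Dict[str, float]]:
    """Return the rows that report evaluation metrics, with numeric values."""
    rows = []
    for line in log_text.strip().splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # skip headers or malformed lines
        record = dict(zip(COLUMNS, fields))
        if record["eval_loss"] == "-":
            continue  # training-only row
        rows.append({
            "epoch": float(record["epoch"]),
            "step": int(record["step"]),
            "eval_loss": float(record["eval_loss"]),
            "negative_mse": float(record["negative_mse"]),
            "mean_accuracy": float(record["mean_accuracy"]),
        })
    return rows


# A few rows copied from the table above.
sample = """
0.9955 15400 0.0485 - - -
1.0 15469 - 0.0479 -4.9594 0.9693
4.9971 77300 0.0342 - - -
5.0 77345 - 0.0331 -3.5554 0.9877
"""
eval_rows = parse_eval_rows(sample)
print([row["mean_accuracy"] for row in eval_rows])  # [0.9693, 0.9877]
```

Applied to the full table, this yields five rows, one per epoch, showing the evaluation loss falling from 0.0479 to 0.0331 while mean accuracy rises from 0.9693 to 0.9877.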

Framework Versions

  • Python: 3.11.9
  • Sentence Transformers: 3.3.1
  • Transformers: 4.46.3
  • PyTorch: 2.4.0
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MSELoss

@inproceedings{reimers-2020-multilingual-sentence-bert,
    title = "Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2020",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2004.09813",
}

Model Repository

  • Model ID: AryoshiW/distilbert-en-id-qa
  • Model size: 135M params (Safetensors, tensor type F32)