SentenceTransformer based on nomic-ai/modernbert-embed-base

This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base on the sujet-financial-rag-en-dataset dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: nomic-ai/modernbert-embed-base
Maximum Sequence Length: 8192 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity
Training Dataset:
- sujet-financial-rag-en-dataset
Language: en

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
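To make the last two modules concrete, here is a minimal pure-Python sketch of what mean-token pooling followed by L2 normalization computes. It is an illustration only, not the library's implementation: the helper names `mean_pool` and `l2_normalize` are hypothetical, and the toy vectors have 2 dimensions instead of 768.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask is 1, so padding is ignored."""
    dim = len(token_vectors[0])
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m == 1]
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit length, as the Normalize() module does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy input: three token vectors (the last one is padding).
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)      # [1.5, 2.0]
embedding = l2_normalize(pooled)      # [0.6, 0.8]
print(embedding)
```

Because every output vector has unit length, the cosine similarity between two embeddings reduces to their dot product.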

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sujet-ai/Fin-ModernBERT-RAG-embed-base")
# Run inference
sentences = [
    'How does the diversification of investments across different currencies impact financial risk?',
    '20/9/2023 4,504 0.00% GBP 305,720 USD (385,212) JPMorgan Chase Bank 20/9/2023 3,544 0.00% EUR 602,840 USD (659,854) State Street Bank & Trust Co. 20/9/2023 435 0.00% JPY 67,590,000 USD (473,571) JPMorgan Chase Bank 20/9/2023 (176) (0.00%) GBP 378,925 USD (483,052) State Street Bank & Trust Co. 20/9/2023 (1,208) (0.00%) GBP 382,825 USD (488,055) BNP Paribas 20/9/2023 (1,251) (0.00%) EUR 480,370 USD (528,752) State Street Bank & Trust Co. 20/9/2023 (2,604) (0.00%) JPY 68,925,000 USD (489,188) State Street Bank & Trust Co. 20/9/2023 (6,443) (0.00%) JPY 43,800,000 USD (319,166) JPMorgan Chase Bank 20/9/2023 (12,395) (0.00%) JPY 91,700,000 USD (657,807) JPMorgan Chase Bank 20/9/2023 (15,547) (0.00%) JPY 639,066,394 USD (4,648,059) JPMorgan Chase Bank 20/9/2023 (172,087) (0.00%) Total OTC Financial Derivative Instruments 545,977 0.00% Total Investments 17,991,067,179 98.73% Fair Value US Dollars ($)% of Total Net Assets Other Assets and Liabilities 232,296,305 1.27% Net Assets 18,223,363,484 100.00%',
    'In addition, the restriction on liens in the GSFC 2008 Indenture applies only to liens that secure debt for borrowed money. For example, liens imposed by operation of law, such as liens to secure statutory obligations for taxes or workers’ compensation benefits, or liens the Company creates to secure obligations to pay legal judgments or surety bonds, would not be covered by this restriction. Modification of the Debt Indenture and Waiver of Covenants There are four types of changes GSFC and the Company can make to the GSFC 2008 Indenture and the debt securities or series of debt securities and related guarantees issued under the GSFC 2008 Indenture. Changes Requiring Each Holder’s Approval First, there are changes that cannot be made without the approval of the holder of each debt security affected by the change under the GSFC 2008 Indenture. Here is a list of those types of changes: • change the stated maturity for any principal or interest payment on a debt security; • reduce the principal amount, the amount payable on acceleration of the stated maturity after a default, the interest rate or the redemption price for a debt security; • permit redemption of a debt security if not previously permitted; • impair any right a holder may have to require repayment of its debt security; • change the currency of any payment on a debt security; • change the place of payment on a debt security; • impair a holder’s right to sue for payment of any amount due on its debt security; • reduce the percentage in principal amount of the debt securities of any one or more affected series, taken • separately or together, as applicable, and whether comprising the same or different series or less than all of the debt securities of a series, the approval of whose holders is needed to change the applicable debt indenture or those debt securities; • reduce the percentage in principal amount of the debt securities of any one or more affected series, taken separately or together, as applicable, and whether comprising the same or different series or less than all of the debt securities of a series, the consent of whose holders is needed to waive GSFC’s compliance with the applicable debt indenture or to waive defaults; and • change the provisions of the applicable debt indenture dealing with modification and waiver in any other respect, except to increase any required percentage referred to above or to add to -59-',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
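For retrieval, a similarity matrix like the one above is usually reduced to a ranked list of passages per query. The sketch below shows that step with small hand-made unit vectors standing in for real `model.encode` outputs, so it runs without downloading the model. `rank_passages` is a hypothetical helper, not part of Sentence Transformers, and it assumes the vectors are already unit-length (which the model's Normalize() module provides).

```python
def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def rank_passages(query_vec, passage_vecs, top_k=3):
    """Return (passage_index, score) pairs for the top_k most similar passages."""
    scores = [dot(query_vec, p) for p in passage_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:top_k]]

# Toy stand-ins for normalized embeddings (real ones have 768 dimensions).
query = [1.0, 0.0]
passages = [
    [0.0, 1.0],  # unrelated to the query
    [0.8, 0.6],  # closest to the query
    [0.6, 0.8],  # somewhat related
]

ranking = rank_passages(query, passages, top_k=2)
print(ranking)
```

With real data, `query` would be one row of the encoded queries and `passages` the encoded document chunks.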

Evaluation

Metrics

Information Retrieval

Dataset: ModernFinBERT-RAG-embed-base
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.3813
cosine_accuracy@3	0.6329
cosine_accuracy@5	0.7124
cosine_accuracy@10	0.7919
cosine_precision@1	0.3813
cosine_precision@3	0.211
cosine_precision@5	0.1425
cosine_precision@10	0.0792
cosine_recall@1	0.3813
cosine_recall@3	0.6329
cosine_recall@5	0.7124
cosine_recall@10	0.7919
cosine_ndcg@10	0.5892
cosine_mrr@10	0.5239
cosine_map@100	0.5298
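In the table above, accuracy@k and recall@k are identical and precision@k equals accuracy@k divided by k (for example 0.6329 / 3 ≈ 0.211). That pattern is what one would expect if every query has exactly one relevant passage, which fits the anchor/positive pair structure of the dataset. The sketch below computes the metrics under that single-positive assumption on toy rankings. It is not the InformationRetrievalEvaluator code, and `retrieval_metrics` is a hypothetical helper.

```python
def retrieval_metrics(queries, k):
    """Average accuracy@k, precision@k, recall@k and MRR@k.

    `queries` is a list of (ranked_ids, relevant_id) pairs, assuming
    exactly one relevant passage per query.
    """
    n = len(queries)
    hits = 0.0
    reciprocal_ranks = 0.0
    for ranked_ids, relevant_id in queries:
        top = ranked_ids[:k]
        if relevant_id in top:
            hits += 1.0
            reciprocal_ranks += 1.0 / (top.index(relevant_id) + 1)
    accuracy = hits / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most one of the k results is relevant
        "recall": accuracy,         # the single relevant passage is found or not
        "mrr": reciprocal_ranks / n,
    }

# Toy rankings: the relevant passage "a" appears at rank 1, 3, 2, and not at all.
toy_queries = [
    (["a", "b", "c"], "a"),
    (["b", "c", "a"], "a"),
    (["c", "a", "b"], "a"),
    (["b", "c", "d"], "a"),
]

metrics_at_3 = retrieval_metrics(toy_queries, k=3)
metrics_at_1 = retrieval_metrics(toy_queries, k=1)
print(metrics_at_3)
print(metrics_at_1)
```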

Training Details

Training Dataset

sujet-financial-rag-en-dataset

Dataset: sujet-financial-rag-en-dataset at ec52315
Size: 104,601 training samples
Columns: anchor and positive
Approximate statistics based on the first 1000 samples:
	anchor	positive
type	string	string
details	min: 13 tokens mean: 24.56 tokens max: 50 tokens	min: 23 tokens mean: 647.39 tokens max: 1165 tokens

Samples:

anchor	positive
`How does the Compensation Committee's role influence the stock awards granted to executive officers?`	PART II Item 8 88 Stock Plans Stock awards entitle the holder to receive shares of Microsoft common stock as the award vests. Stock awards generally vest over a service period of four years or five years. Executive Incentive Plan Under the Executive Incentive Plan, the Compensation Committee approves stock awards to executive officers and certain senior executives. RSUs generally vest ratably over a service period of four years. PSUs generally vest over a performance period of thre e years. The number of shares the PSU holder receives is based on the extent to which the corresponding performance goals have been achieved. Activity for All Stock Plans The fair value of stock awards was estimated on the date of grant using the following assumptions: Year ended June 30, 2023 2022 2021 Dividends per share (quarterly amounts) $ 0.62 – 0.68 $ 0.56 – 0.62 $ 0.51 – 0.56 Interest rates 2.0% – 5.4% 0.03% – 3.6% 0.01% – 1.5% During fiscal year 2023 , the following activity occurred under our stock...
`What is the fair value of the bond issued by CVS Health Corp., and how does it compare to the fair value of the bond issued by Walt Disney Co.?`	445 Vanguard ESG Global Corporate Bond UCITS ETF Principal CouponMaturity DateFair Value US Dollars ($)% of Total Net Assets State Street Corp. $50,000 4.82% 26/1/2034 48,557 0.01% Baxalta, Inc. $50,000 4.00% 23/6/2025 48,515 0.01% Starbucks Corp. $50,000 3.80% 15/8/2025 48,426 0.01% Citigroup, Inc. $50,000 4.60% 9/3/2026 48,387 0.01% Athene Global Funding CAD70,000 2.10% 24/9/2025 48,344 0.01% Bank of America Corp. $50,000 4.25% 22/10/2026 48,257 0.01% PepsiCo, Inc. $50,000 3.60% 18/2/2028 48,191 0.01% Charles Schwab Corp. $50,000 3.85% 21/5/2025 48,183 0.01% JPMorgan Chase & Co. $50,000 4.13% 15/12/2026 48,165 0.01% Charter Communications Operating LLC/Charter Communications Operating Capital $60,000 5.50% 1/4/2063 48,151 0.01% US Bancorp $60,000 2.68% 27/1/2033 48,106 0.01% Chubb INA Holdings, Inc. $50,000 3.35% 3/5/2026 48,074 0.01% Bank of New York Mellon Corp. $50,000 3.00% 24/2/2025 48,071 0.01% Truist Financial Corp. $50,000 4.87% 26/1/2029 48,042 0.01% Truist Financial Corp. $...
`Analyze the impact of currency fluctuations on the unrealized gains and losses reported in the forward currency exchange contracts.`	15,216 141,230 0.01% Samsung Fire & Marine Insurance Co., Ltd. - Preference Shares 1,056 137,365 0.01% Samsung SDI Co., Ltd. - Preference Shares 546 133,014 0.01% NHN Corp. 7,096 132,480 0.01% Hanwha Corp. - Preference Shares 10,137 114,475 0.01% Amorepacific Corp. - Preference Shares 4,230 101,123 0.01% CJ CheilJedang Corp. - Preference Shares 576 59,276 0.00% Hanwha Galleria Corp. 47,521 54,711 0.00% - - 386,394,890 29.25% Total Equities 1,291,387,033 97.75% Total Transferable Securities 1,291,387,033 97.75% Number of Contracts Long/ (Short)Notional Amount Unrealised Gain/(Loss) US Dollar s ($)% of Total Net Assets Financial Derivative Instruments Dealt in on a Regulated Market (0.02%) (30 June 2022: (0.00%)) Futures (0.02%) (30 June 2022: (0.00%)) MSCI Pacific Ex-Japan Index September 2023 283 $20,595,251 (131,521) (0.01%) KOSPI 200 Index September 2023 138 KRW11,933,318,478 (141,212) (0.01%) Total Financial Derivative Instruments Dealt in on a Regulated Market (272,733) (0.02%) OTC...

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
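As a rough illustration of these parameters, MultipleNegativesRankingLoss treats every other positive in the batch as a negative for a given anchor: cosine similarities are multiplied by `scale`, and a cross-entropy loss pushes each anchor toward its own positive. The pure-Python sketch below follows that description on toy 2-dimensional vectors. It is a simplified stand-in for the library's batched PyTorch implementation, and `mnr_loss` is a hypothetical name.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where anchor i should pick positive i from the batch."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all in-batch candidates.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the wrong positive

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

Larger batches supply more in-batch negatives per anchor, which is one reason this loss is often paired with large batch sizes.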

Evaluation Dataset

sujet-financial-rag-en-dataset

Dataset: sujet-financial-rag-en-dataset at ec52315
Size: 1,057 evaluation samples
Columns: anchor and positive
Approximate statistics based on the first 1000 samples:
	anchor	positive
type	string	string
details	min: 13 tokens mean: 24.64 tokens max: 52 tokens	min: 26 tokens mean: 647.51 tokens max: 1081 tokens

Samples:

anchor	positive
`What was the net asset value per share for the EUR Distributing class as of 30 June 2022?`	The accompanying notes form an integral part of the financial statements.559 Vanguard EUR Eurozone Government Bond UCITS ETFStatement of Assets and Liabilities EUR (€) EUR (€) As at 30 June As at 30 June Note 2023 2022 Current Assets Financial Assets at Fair Value Through Profit or Loss: Transferable Securities 3,17 1,719,130,585 1,249,469,080 Financial Derivative Instruments 3,17 — 23,742 Cash 3 11,990,422 14,558,520 Receivables: Interest and Dividends 12,715,254 5,193,434 Capital Shares Issued 27 9,190,562 Investments Sold 6,621,764 499,630 Margin Cash Due from Broker 3 3 56,198 Total Current Assets 1,750,458,055 1,278,991,166 Current Liabilities Financial Liabilities at Fair Value Through Profit or Loss: Financial Derivative Instruments 3,17 — 17,321 Bank Overdraft — 6,668 Payables and Other Liabilities: Capital Shares Redeemed 5,790,847 6,811,068 Investments Purchased 8,942,689 15,381,189 Management Fees Payable 12 99,689 69,769 Total Current Liabilities 14,833,225 22,286,015 Net A...
`What factors could lead the Committee to determine that an employee's actions have resulted in a "material adverse impact" on the broader financial system?`	Definitions Appendix The following capitalized terms are used in this Award Agreement with the following meanings: (a)“409A Deferred Compensation ” means a “deferral of compensation” or “deferred compensation” as those terms are defined in the regulations under Section 409A. (b)“Conflicted Employment ” means your employment at any U.S. Federal, state or local government, any non-U.S. government, any supranational or international organization, any self- regulatory organization, or any agency or instrumentality of any such government or organization, or any other employer (other than an “Accounting Firm” within the meaning of SEC Rule 2-01(f)(2) of Regulation S-X or any successor thereto) determined by the Committee, if, as a result of such employment, your continued holding of any Outstanding Short-Term RSUs would result in an actual or perceived conflict of interest. (c)“Failed to Consider Risk ” means that you participated (or otherwise oversaw or were responsible for, depending on t...
`What financial implications could arise from a decrease in the pool of qualified drivers for a ridesharing platform?`	In addition, changes in certain laws and regulations, including immigration, labor and employment laws or background check requirements, may result in a shift or decrease in the pool of qualified drivers, which may result in increased competition for qualified drivers or higher costs of recruitment, operation and retention. As part of our business operations or research and development efforts, data on the vehicle may be collected and drivers may be uncomfortable or unwilling to drive knowing that data is being collected. Other factors outside of our control, such as concerns about personal health and safety, increases in the price of gasoline, vehicles or insurance, or concerns about the availability of government or other assistance programs if drivers continue to drive on our platform, may also reduce the number of drivers on our platform or their utilization of our platform, or impact our ability to onboard new drivers. If we fail to attract qualified drivers on favorable terms, fa...

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 64
per_device_eval_batch_size: 64
gradient_accumulation_steps: 8
learning_rate: 0.0002
num_train_epochs: 2
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: True
tf32: True
load_best_model_at_end: True
optim: adamw_torch_fused
batch_sampler: no_duplicates
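These settings can be related to the training log with a little arithmetic. The sketch below assumes a single training device (the card does not state the device count) and estimates the total number of optimizer steps. It also reproduces the usual linear-warmup plus cosine-decay learning-rate shape. The function names are hypothetical, and the step totals are estimates rather than values taken from the trainer.

```python
import math

TRAIN_SAMPLES = 104_601
PER_DEVICE_BATCH = 64
GRAD_ACCUM = 8
NUM_DEVICES = 1        # assumption: not stated in the card
EPOCHS = 2
PEAK_LR = 2e-4
WARMUP_RATIO = 0.1

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES   # 512
batches_per_epoch = math.ceil(TRAIN_SAMPLES / PER_DEVICE_BATCH)  # 1635

def epoch_at_step(step):
    """Fraction of training epochs completed after `step` optimizer steps."""
    return step * GRAD_ACCUM / batches_per_epoch

# Estimated schedule length (assumption: partial accumulation groups are dropped).
total_steps = (batches_per_epoch // GRAD_ACCUM) * EPOCHS         # 408
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)             # 41

def lr_at(step):
    """Linear warmup to PEAK_LR, then cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(effective_batch, round(epoch_at_step(200), 4), lr_at(warmup_steps))
```

Under the single-device assumption, `epoch_at_step(200)` rounds to 0.9786, which agrees with the epoch value logged at step 200 in the training log.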

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 64
per_device_eval_batch_size: 64
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 8
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 0.0002
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 2
max_steps: -1
lr_scheduler_type: cosine
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: True
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional

Training Logs

Epoch	Step	Training Loss	Validation Loss	ModernFinBERT-RAG-embed-base_cosine_ndcg@10
0	0	-	-	0.2812
0.0489	10	1.8949	-	-
0.0979	20	1.0738	-	-
0.1468	30	0.9147	-	-
0.1957	40	0.8194	-	-
0.2446	50	0.7847	-	-
0.2936	60	0.7428	-	-
0.3425	70	0.7587	-	-
0.3914	80	0.7769	-	-
0.4404	90	0.7319	-	-
0.4893	100	0.7199	0.7262	0.5395
0.5382	110	0.7085	-	-
0.5872	120	0.6726	-	-
0.6361	130	0.6954	-	-
0.6850	140	0.65	-	-
0.7339	150	0.6207	-	-
0.7829	160	0.6518	-	-
0.8318	170	0.6227	-	-
0.8807	180	0.6285	-	-
0.9297	190	0.6235	-	-
0.9786	200	0.6183	0.6158	0.5546
1.0294	210	0.6036	-	-
1.0783	220	0.5818	-	-
1.1272	230	0.5445	-	-
1.1761	240	0.5115	-	-
1.2251	250	0.4712	-	-
1.2740	260	0.449	-	-
1.3229	270	0.4457	-	-
1.3719	280	0.4763	-	-
1.4208	290	0.449	-	-
1.4697	300	0.4352	0.5674	0.5797
1.5187	310	0.4173	-	-
1.5676	320	0.4198	-	-
1.6165	330	0.3901	-	-
1.6654	340	0.4066	-	-
1.7144	350	0.3802	-	-
1.7633	360	0.3712	-	-
1.8122	370	0.3983	-	-
1.8612	380	0.3886	-	-
1.9101	390	0.4027	-	-
1.959	400	0.398	0.5435	0.5892

The saved checkpoint is the final logged row (step 400), whose cosine_ndcg@10 of 0.5892 matches the evaluation table above.

Framework Versions

Python: 3.10.13
Sentence Transformers: 3.3.1
Transformers: 4.48.0.dev0
PyTorch: 2.5.1+cu124
Accelerate: 1.0.1
Datasets: 3.2.0
Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Safetensors

Model size: 149M params
Tensor type: F32
