dragonkue/BGE-m3-ko

dragonkue committed on Sep 30, 2024

Commit e2d2fcf · verified · 1 Parent(s): 35e3754

Update README.md

Files changed (1)

README.md +1 -1

README.md CHANGED Viewed

@@ -418,7 +418,7 @@ This is a benchmark of Korean embedding models.
 ## Bias, Risks and Limitations
-- Since the evaluation results are different for each domain, it is necessary to compare and evaluate the model in your own domain. In the Miracl benchmark, the evaluation was conducted using the Korean Wikipedia as a corpus, and in this case, the cosine_ndcg@10 score dropped by 0.2 points after learning. However, in the Auto-RAG benchmark, which is a financial domain, the ndcg score increased by 0.9 when it was top 1. This model may be advantageous for use in a specific domain.
+- Since the evaluation results are different for each domain, it is necessary to compare and evaluate the model in your own domain. In the Miracl benchmark, the evaluation was conducted using the Korean Wikipedia as a corpus, and in this case, the cosine_ndcg@10 score dropped by 0.02 points after learning. However, in the Auto-RAG benchmark, which is a financial domain, the ndcg score increased by 0.09 when it was top 1. This model may be advantageous for use in a specific domain.
 - Also, since the miracl benchmark consists of a corpus of relatively short strings, while the Korean Embedding Benchmark consists of a corpus of longer strings, this model may be more advantageous if the length of the corpus you want to use is long.
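
The README text in this change recommends evaluating the model on your own domain and reports differences in nDCG scores. As a rough illustration of what such a comparison measures, here is a minimal sketch of nDCG@k in plain Python. The function names `dcg_at_k` and `ndcg_at_k` are hypothetical helpers, not part of this repository or of sentence-transformers, and the sketch assumes that each ranked list already contains the relevance labels of all judged documents for the query.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k relevance labels.

    `relevances` holds the graded relevance of each retrieved document,
    in the order the model ranked them (best first).
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k=10):
    """nDCG@k: DCG of the given ranking divided by DCG of the ideal ordering.

    Assumption: `ranked_relevances` includes every judged document for the
    query, so sorting it in descending order gives the ideal ranking.
    """
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant documents for this query
    return dcg_at_k(ranked_relevances, k) / ideal


# Toy labels made up for illustration only (1 = relevant, 0 = not relevant).
perfect = ndcg_at_k([1, 0, 0])   # relevant document ranked first
worse = ndcg_at_k([0, 1, 0])     # relevant document ranked second
print(perfect, worse)
```

Averaging this value over the queries of an in-domain test set, once for the base model and once for this model, gives the kind of difference the README describes. Because the score lies between 0 and 1, differences between models show up as small decimal changes.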