---
library_name: transformers
license: mit
datasets:
- hendrydong/preference_700K
base_model:
- microsoft/Phi-3-mini-4k-instruct
---

# phi-instruct-segment Model Card

## Method

The segment reward model assigns rewards to semantically meaningful text segments, delimited dynamically with an entropy-based threshold. It is trained on binary preference labels from human feedback, optimizing a Bradley-Terry loss in which each response's score is the average of its segment rewards.
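
As a rough illustration only (not the authors' released code), the objective can be sketched in PyTorch as below. The entropy threshold, helper names, and tensor shapes are all assumptions made for the example.

```python
# Sketch of: (1) entropy-based dynamic segmentation, (2) a Bradley-Terry
# preference loss whose response score is the AVERAGE of segment rewards.
# All names, shapes, and the threshold value are illustrative assumptions.
import torch
import torch.nn.functional as F

def entropy_segment_starts(logits: torch.Tensor, threshold: float) -> list[int]:
    """Start a new segment after any position whose next-token predictive
    entropy exceeds `threshold`. `logits` has shape (seq_len, vocab)."""
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)  # (seq_len,)
    cuts = (entropy > threshold).nonzero(as_tuple=True)[0]
    return [0] + [int(c) + 1 for c in cuts if int(c) + 1 < logits.size(0)]

def response_score(segment_rewards: torch.Tensor) -> torch.Tensor:
    """Aggregate per-segment rewards into one response-level score (average)."""
    return segment_rewards.mean()

def bradley_terry_loss(chosen_segment_rewards: list[torch.Tensor],
                       rejected_segment_rewards: list[torch.Tensor]) -> torch.Tensor:
    """-log sigmoid(score_chosen - score_rejected), averaged over the batch."""
    s_chosen = torch.stack([response_score(r) for r in chosen_segment_rewards])
    s_rejected = torch.stack([response_score(r) for r in rejected_segment_rewards])
    return -F.logsigmoid(s_chosen - s_rejected).mean()
```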

<div align="center">

![image/png](https://cdn-uploads.huggingface.co/production/uploads/605e8dfd5abeb13e714c4c18/GnDEETLQeFpqx7-enIENw.png)

</div>
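
A hypothetical usage sketch follows. The repo id, the `AutoModelForSequenceClassification` loading path, `num_labels=1`, and `trust_remote_code=True` are assumptions; the checkpoint may ship custom scoring code that differs from this.

```python
# Hypothetical scoring example; the model id and loading class are assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "phi-instruct-segment"  # placeholder: substitute the actual repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id, num_labels=1, trust_remote_code=True
)

messages = [
    {"role": "user", "content": "What is RLHF?"},
    {"role": "assistant", "content": "RLHF fine-tunes a model on human preference data."},
]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
with torch.no_grad():
    reward = model(input_ids=input_ids).logits.squeeze()  # scalar reward
print(float(reward))
```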