---
library_name: transformers
license: mit
datasets:
- hendrydong/preference_700K
base_model:
- microsoft/Phi-3-mini-4k-instruct
pipeline_tag: text-classification
---
# phi-instruct-segment Model Card
- **Paper:** [Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model](https://arxiv.org/abs/2501.02790)
- **Model:** [yyqoni/Phi-3-mini-4k-instruct-segment-rm-700k](https://huggingface.co/yyqoni/Phi-3-mini-4k-instruct-segment-rm-700k)
## Method
The segment reward model assigns rewards to semantically meaningful text segments, which are delimited dynamically using an entropy-based threshold on the policy's token distribution. It is trained on binary preference labels from human feedback, optimizing a Bradley-Terry loss in which each response's segment rewards are aggregated by averaging.
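The sketch below is a minimal illustration of these two ideas, not the authors' training code: segment boundaries are placed where per-token predictive entropy exceeds a threshold, and a Bradley-Terry loss is applied to averaged segment rewards. The entropy source, threshold value, and tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def entropy_cut_points(token_entropies: torch.Tensor,
                       threshold: float = 2.0) -> list[int]:
    """Token positions whose predictive entropy exceeds the threshold
    mark the starts of new segments (illustrative threshold)."""
    return [i for i, h in enumerate(token_entropies.tolist()) if h > threshold]

def bradley_terry_loss(chosen_segment_rewards: torch.Tensor,
                       rejected_segment_rewards: torch.Tensor) -> torch.Tensor:
    """-log sigmoid(r_chosen - r_rejected), where each response's reward
    is the average of its per-segment rewards."""
    r_chosen = chosen_segment_rewards.mean()
    r_rejected = rejected_segment_rewards.mean()
    return -F.logsigmoid(r_chosen - r_rejected)

# Toy example: the chosen response has 3 segments, the rejected has 2.
loss = bradley_terry_loss(torch.tensor([0.8, 1.2, 0.5]),
                          torch.tensor([0.1, -0.3]))
```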
## Architecture
<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/605e8dfd5abeb13e714c4c18/xeGwtrpnx2bWFg5ZOHA7R.png" alt="Segment reward model architecture"/>
</div>
## Training
The phi-instruct-segment model is fine-tuned from **microsoft/Phi-3-mini-4k-instruct** on the **hendrydong/preference_700K** dataset.
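## Usage
A minimal loading sketch, assuming the checkpoint exposes a standard sequence-classification (reward) head through `transformers`; the dynamic entropy-based segmentation and per-segment scoring described in the paper require the authors' released code. The prompt and response below are illustrative.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "yyqoni/Phi-3-mini-4k-instruct-segment-rm-700k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
)

# Format the prompt/response pair with the base model's chat template.
messages = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False)
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    reward = model(**inputs).logits[0]
print(reward)
```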
## Citation
If you find this model or our research useful, please consider citing our paper:
```bibtex
@misc{yin2025segmentingtextlearningrewards,
title={Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model},
author={Yueqin Yin and Shentao Yang and Yujia Xie and Ziyi Yang and Yuting Sun and Hany Awadalla and Weizhu Chen and Mingyuan Zhou},
year={2025},
eprint={2501.02790},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.02790},
}
```