|
--- |
|
library_name: transformers |
|
license: mit |
|
datasets: |
|
- hendrydong/preference_700K |
|
base_model: |
|
- meta-llama/Llama-3.1-8B-Instruct |
|
pipeline_tag: text-classification |
|
--- |
|
|
|
|
|
# meta-llama-3.1-instruct-8b-segment-rm-700k Model Card
|
|
|
- **Paper:** [Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model](https://arxiv.org/abs/2501.02790)
|
|
|
- **Model:** [yyqoni/meta-llama-3.1-instruct-8b-segment-rm-700k](https://huggingface.co/yyqoni/meta-llama-3.1-instruct-8b-segment-rm-700k) |
|
|
|
## Method |
|
|
|
|
|
The segment reward model assigns rewards to semantically meaningful text segments, with segment boundaries determined dynamically by an entropy-based threshold. It is trained on binary human preference labels, optimizing a Bradley-Terry loss in which a response's reward is the average of its segment-level rewards.
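
As a rough illustration, the sketch below (not the authors' code) shows how entropy-based segmentation and the averaged-segment Bradley-Terry objective fit together; the tensor names, shapes, and the `entropy_threshold` helper are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def segment_starts(token_entropies: torch.Tensor, entropy_threshold: float) -> torch.Tensor:
    """Start a new segment at every token whose predictive entropy exceeds the threshold
    (hypothetical helper; the paper's exact segmentation rule may differ in detail)."""
    starts = token_entropies > entropy_threshold
    starts[0] = True  # the first token always opens a segment
    return starts.nonzero(as_tuple=True)[0]

def bradley_terry_loss(chosen_segment_rewards: torch.Tensor,
                       rejected_segment_rewards: torch.Tensor) -> torch.Tensor:
    """Pairwise Bradley-Terry loss where each response's reward is the
    average of its segment-level rewards."""
    r_chosen = chosen_segment_rewards.mean()
    r_rejected = rejected_segment_rewards.mean()
    return -F.logsigmoid(r_chosen - r_rejected)
```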
|
|
|
## Architecture |
|
<div align="center">
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/605e8dfd5abeb13e714c4c18/xeGwtrpnx2bWFg5ZOHA7R.png) |
|
|
|
</div> |
|
|
|
|
|
## Training |
|
|
|
The meta-llama-3.1-instruct-8b-segment-rm-700k model is fine-tuned from **meta-llama/Llama-3.1-8B-Instruct** on the **hendrydong/preference_700K** dataset.
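
## Usage

A minimal sketch for scoring a response with the released checkpoint is shown below. It assumes the model loads as a standard `transformers` sequence-classification head producing a single scalar reward; segment-level scoring as described in the Method section may require the custom code released with the paper.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "yyqoni/meta-llama-3.1-instruct-8b-segment-rm-700k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
]
# Format the conversation with the Llama 3.1 chat template and score it.
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
with torch.no_grad():
    reward = model(input_ids).logits[0].item()  # scalar reward for the response
print(reward)
```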
|
|
|
|
|
|
|
## Citation |
|
|
|
If you find this model or our research useful, please consider citing our paper: |
|
|
|
```bibtex |
|
@misc{yin2025segmentingtextlearningrewards, |
|
title={Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model}, |
|
author={Yueqin Yin and Shentao Yang and Yujia Xie and Ziyi Yang and Yuting Sun and Hany Awadalla and Weizhu Chen and Mingyuan Zhou}, |
|
year={2025}, |
|
eprint={2501.02790}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.CL}, |
|
url={https://arxiv.org/abs/2501.02790}, |
|
} |
|
``` |