
yueqin yin

yyqoni

AI & ML interests

None yet

Recent Activity

updated a collection 8 days ago

DenseRewardRLHF-PPO

updated a model 8 days ago

yyqoni/Phi-3-mini-4k-bandit-ppo-60k

upvoted a paper 9 days ago

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model


Organizations

yyqoni's activity

commented a paper 10 days ago

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model

Paper • 2501.02790 • Published 12 days ago • 9

New activity in nvidia/HelpSteer2 7 months ago

Averaging GT Overall Scores in Bradley-Terry Model with HelpSteer2

#3 opened 7 months ago by

New activity in wandb/mistral-7b-zephyr-dpo 10 months ago

The format of chat template

#2 opened 10 months ago by
