Multimodal Preference Data Synthetic Alignment with Reward Model • Paper • 2412.17417 • Published 26 days ago • 1
Towards a Unified View of Preference Learning for Large Language Models: A Survey • Paper • 2409.02795 • Published Sep 4, 2024 • 72
Preference Tuning LLMs with Direct Preference Optimization Methods • Article • Jan 18, 2024 • 42