Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding Paper • 2312.00081 • Published Nov 30, 2023 • 2
Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning Paper • 2412.03565 • Published Dec 4, 2024 • 11
RelBench: A Benchmark for Deep Learning on Relational Databases Paper • 2407.20060 • Published Jul 29, 2024 • 7
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model Paper • 2407.16982 • Published Jul 24, 2024 • 41
Understanding Reference Policies in Direct Preference Optimization Paper • 2407.13709 • Published Jul 18, 2024 • 16
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model Paper • 2407.07053 • Published Jul 9, 2024 • 43
view article Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15, 2024 • 171
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? Paper • 2407.01284 • Published Jul 1, 2024 • 75
UltraEdit: Instruction-based Fine-Grained Image Editing at Scale Paper • 2407.05282 • Published Jul 7, 2024 • 13
Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning Paper • 2406.12742 • Published Jun 18, 2024 • 14
Improving Visual Commonsense in Language Models via Multiple Image Generation Paper • 2406.13621 • Published Jun 19, 2024 • 13
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models Paper • 2406.13542 • Published Jun 19, 2024 • 16
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs Paper • 2406.14544 • Published Jun 20, 2024 • 35
MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation Paper • 2406.15252 • Published Jun 21, 2024 • 14
On the Transformations across Reward Model, Parameter Update, and In-Context Prompt Paper • 2406.16377 • Published Jun 24, 2024 • 12
Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning Paper • 2406.09170 • Published Jun 13, 2024 • 24