-
Analyzing The Language of Visual Tokens
Paper • 2411.05001 • Published • 23 -
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models
Paper • 2411.14982 • Published • 16 -
Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration
Paper • 2411.17686 • Published • 19
Jaehyun Jun
btjhjeon
AI & ML interests
Multimodal
Recent Activity
updated
a collection
about 10 hours ago
Multimodal LLM
updated
a collection
about 10 hours ago
Multimodal Alignment
upvoted
a
paper
about 11 hours ago
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM
Organizations
Collections
8
-
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
Paper • 2410.17637 • Published • 34 -
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Paper • 2411.10442 • Published • 71 -
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
Paper • 2411.18203 • Published • 33 -
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
Paper • 2411.14432 • Published • 22
models
None public yet
datasets
None public yet