Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models
Base model