LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations Paper • 2412.08580 • Published 27 days ago • 45
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion Paper • 2412.04424 • Published Dec 5, 2024 • 59