MatchAnything: Universal Cross-Modality Image Matching with Large-Scale Pre-Training • Paper • 2501.07556 • Published 5 days ago • 5
FiffDepth: Feed-forward Transformation of Diffusion-Based Generators for Detailed Depth Estimation • Paper • 2412.00671 • Published Dec 1, 2024 • 1
Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis • Paper • 2412.15322 • Published about 1 month ago • 18
DepthMaster: Taming Diffusion Models for Monocular Depth Estimation • Paper • 2501.02576 • Published 13 days ago • 15