Weighted-Reward Preference Optimization for Implicit Model Fusion • Paper • 2412.03187 • Published Dec 4, 2024 • 9
CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis • Paper • 2408.14765 • Published Aug 27, 2024 • 15
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models • Paper • 2410.09732 • Published Oct 13, 2024 • 54
ChartThinker: A Contextual Chain-of-Thought Approach to Optimized Chart Summarization • Paper • 2403.11236 • Published Mar 17, 2024 • 1
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance • Paper • 2407.06937 • Published Jul 9, 2024 • 1
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding • Paper • 2405.08748 • Published May 14, 2024 • 19
DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation • Paper • 2403.08857 • Published Mar 13, 2024 • 3
CapDet: Unifying Dense Captioning and Open-World Detection Pretraining • Paper • 2303.02489 • Published Mar 4, 2023
TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph • Paper • 2205.14307 • Published May 28, 2022