Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models Paper • 2501.01423 • Published 2 days ago • 31
Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published 16 days ago • 16
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published 23 days ago • 81
Putting the Object Back into Video Object Segmentation Paper • 2310.12982 • Published Oct 19, 2023