SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration
Abstract
Video restoration poses non-trivial challenges in maintaining fidelity while recovering temporally consistent details from unknown degradations in the wild. Despite recent advances, diffusion-based restoration methods often face limitations in generation capability and sampling efficiency. In this work, we present SeedVR, a diffusion transformer designed to handle real-world video restoration with arbitrary length and resolution. The core design of SeedVR lies in shifted window attention, which facilitates effective restoration on long video sequences. SeedVR further supports variable-sized windows near the boundary of both spatial and temporal dimensions, overcoming the resolution constraints of traditional window attention. Equipped with contemporary practices, including a causal video autoencoder, mixed image and video training, and progressive training, SeedVR achieves highly competitive performance on both synthetic and real-world benchmarks, as well as on AI-generated videos. Extensive experiments demonstrate SeedVR's superiority over existing methods for generic video restoration.
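The abstract does not give implementation details, so the following is only a rough sketch of the general idea of window partitioning with variable-sized boundary windows, not the authors' code. It assumes a video latent of shape (T, H, W), a fixed window size per axis, and an optional per-axis shift. The names `window_bounds` and `partition_3d` are hypothetical. Instead of padding the input to a multiple of the window size, the windows at the boundary are simply allowed to be smaller.

```python
from itertools import product


def window_bounds(length, window, shift=0):
    """Split the range [0, length) into consecutive (start, end) windows.

    Interior windows have size `window`. With a non-zero `shift`, the first
    window is shortened to `shift` elements, and the last window takes
    whatever remains, so no padding is required.
    """
    if length <= 0 or window <= 0:
        raise ValueError("length and window must be positive")
    shift %= window
    edges = [0]
    pos = shift if shift > 0 else window
    while pos < length:
        edges.append(pos)
        pos += window
    edges.append(length)
    return list(zip(edges[:-1], edges[1:]))


def partition_3d(shape, window, shift=(0, 0, 0)):
    """Return every 3D window as ((t0, t1), (h0, h1), (w0, w1)).

    `shape`, `window`, and `shift` are (T, H, W) triples. Attention would
    then be computed independently inside each returned window (assumption:
    this mirrors the usual window-attention layout).
    """
    axes = [window_bounds(n, w, s) for n, w, s in zip(shape, window, shift)]
    return list(product(*axes))


# Regular layer: windows aligned to the origin.
regular = partition_3d((5, 10, 10), (4, 4, 4))
# Shifted layer: window grid offset by half a window on every axis.
shifted = partition_3d((5, 10, 10), (4, 4, 4), shift=(2, 2, 2))
```

Alternating the regular and shifted layouts between layers lets information cross window borders, while the variable-sized edge windows mean the input length and resolution need not be multiples of the window size.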
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis (2024)
- Consistent Human Image and Video Generation with Spatially Conditioned Diffusion (2024)
- RAP-SR: RestorAtion Prior Enhancement in Diffusion Models for Realistic Image Super-Resolution (2024)
- MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control (2024)
- Optical-Flow Guided Prompt Optimization for Coherent Video Generation (2024)
- Reversing the Damage: A QP-Aware Transformer-Diffusion Approach for 8K Video Restoration under Codec Compression (2024)
- Large Motion Video Autoencoding with Cross-modal Video VAE (2024)