DepthMaster: Taming Diffusion Models for Monocular Depth Estimation
Ziyang Song*,
Zerong Wang*,
Bo Li,
Hao Zhang,
Ruijie Zhu,
Li Liu,
Peng-Tao Jiang†,
Tianzhu Zhang†,
*Equal Contribution, †Corresponding Author
University of Science and Technology of China, vivo Mobile Communication Co., Ltd.
arXiv 2025
![teaser](assets/framework.png)
> We present DepthMaster, a tamed single-step diffusion model that customizes generative features in diffusion models to suit the discriminative depth estimation task. We introduce a Feature Alignment module to mitigate overfitting to texture and a Fourier Enhancement module to refine fine-grained details. DepthMaster exhibits state-of-the-art zero-shot performance and superior detail preservation, surpassing other diffusion-based methods across various datasets.
## 🎓 Citation
Please cite our paper:
```bibtex
@article{song2025depthmaster,
title={DepthMaster: Taming Diffusion Models for Monocular Depth Estimation},
author={Song, Ziyang and Wang, Zerong and Li, Bo and Zhang, Hao and Zhu, Ruijie and Liu, Li and Jiang, Peng-Tao and Zhang, Tianzhu},
journal={arXiv preprint arXiv:2501.02576},
year={2025}
}
```
## Acknowledgements
The code is based on [Marigold](https://github.com/prs-eth/Marigold).
## 🎫 License
This work is licensed under the Apache License, Version 2.0 (as defined in the [LICENSE](LICENSE.txt)).
By downloading and using the code and model, you agree to the terms in the [LICENSE](LICENSE.txt).
[![License](https://img.shields.io/badge/License-Apache--2.0-929292)](https://www.apache.org/licenses/LICENSE-2.0)