Perfecting Depth: Uncertainty-Aware Enhancement of Metric Depth
Jinyoung Jun¹, Lei Chu², Jiahao Li², Yan Lu², Chang-Su Kim¹
¹Korea University, ²Microsoft Research Asia
 
Paper teaser
Sensor depth enhancement result on DIODE-Indoor. Without hand-crafted priors for noise or artifacts, our method effectively removes unreliable pixels and improves overall depth quality. Top-right inset indicates zero values in initial depth.
Abstract
We propose a novel two-stage framework for sensor depth enhancement, called Perfecting Depth. This framework leverages the stochastic nature of diffusion models to automatically detect unreliable depth regions while preserving geometric cues. In the first stage (stochastic estimation), the method identifies unreliable measurements and infers geometric structure by leveraging a training–inference domain gap. In the second stage (deterministic refinement), it enforces structural consistency and pixel-level accuracy using the uncertainty map derived from the first stage. By combining stochastic uncertainty modeling with deterministic refinement, our method yields dense, artifact-free depth maps with improved reliability. Experimental results demonstrate its effectiveness across diverse real-world scenarios. Furthermore, theoretical analysis, various experiments, and qualitative visualizations validate its robustness and scalability. Our framework sets a new baseline for sensor depth enhancement, with potential applications in autonomous driving, robotics, and immersive technologies.
  Paper [PDF]
Code and data [Coming Soon]
Citation
@misc{jun2025perfectingdepthuncertaintyawareenhancement,
      title={Perfecting Depth: Uncertainty-Aware Enhancement of Metric Depth}, 
      author={Jinyoung Jun and Lei Chu and Jiahao Li and Yan Lu and Chang-Su Kim},
      year={2025},
      eprint={2506.04612},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.04612}, 
}
 
Algorithm pipeline
Our framework consists of a two-stage pipeline. (1) Stochastic estimation stage: We train a diffusion model on clean data but run inference on raw, real-world data. This deliberate training-inference gap lets the model estimate per-pixel uncertainty and recover geometric cues, identifying unreliable pixels without handcrafted priors. (2) Deterministic refinement stage: A refinement network then focuses on the uncertain regions, correcting them while preserving valid sensor measurements. By combining stochastic sampling for global uncertainty estimation with precise local refinement for targeted corrections, our pipeline generates dense, high-quality depth maps that are well suited to real-world tasks demanding accurate metric depth.
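The two-stage idea can be illustrated with a toy sketch. This is not the released implementation (the code is listed as coming soon), and every name below is hypothetical. It assumes that per-pixel uncertainty is taken as the standard deviation across several stochastic samples, replaces the diffusion model with a simple noise-based stand-in, and replaces the refinement network with a mask-based blend that keeps reliable sensor values and fills unreliable ones from the sample mean.

```python
import numpy as np


def toy_stochastic_sample(raw_depth, rng):
    """Hypothetical stand-in for one diffusion sample (NOT the paper's model).

    Valid pixels are reproduced with small noise; zero-valued (missing)
    pixels receive widely varying guesses around an arbitrary depth of 2.0.
    """
    valid = raw_depth > 0
    base = np.where(valid, raw_depth, 2.0)
    noise_scale = np.where(valid, 0.01, 0.5)
    return base + noise_scale * rng.standard_normal(raw_depth.shape)


def estimate_uncertainty(raw_depth, num_samples=8, seed=0):
    """Stage 1 (assumed form): draw several samples, return per-pixel mean and std."""
    rng = np.random.default_rng(seed)
    samples = np.stack(
        [toy_stochastic_sample(raw_depth, rng) for _ in range(num_samples)]
    )
    return samples.mean(axis=0), samples.std(axis=0)


def refine(raw_depth, mean_depth, uncertainty, threshold=0.1):
    """Stage 2 stand-in: keep reliable sensor values, replace unreliable ones.

    The paper uses a learned refinement network; this blend only shows how an
    uncertainty map can gate which pixels are corrected. The threshold is an
    arbitrary illustrative value.
    """
    unreliable = (uncertainty > threshold) | (raw_depth <= 0)
    refined = np.where(unreliable, mean_depth, raw_depth)
    return refined, unreliable


# Toy 2x2 depth map in metres; the zero marks a missing sensor reading.
raw = np.array([[1.0, 0.0], [2.0, 3.0]])
mean_depth, uncertainty = estimate_uncertainty(raw)
refined, unreliable = refine(raw, mean_depth, uncertainty)
```

In this sketch the valid readings pass through unchanged, while the missing pixel is flagged as unreliable and filled from the sample mean. A real system would replace both stand-ins with trained networks.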
 
Results
(a) Sensor depth enhancement result on DIODE-Indoor. (b) Qualitative comparison of fine-tuning results on NYUv2. For each depth map in (b), the corresponding error map is shown below it; brighter pixels indicate larger errors.
 
Fine-tuning performance of relative depth estimators using DIODE-Indoor.
 
More sensor depth enhancement results of the proposed framework on DIODE-Indoor.
 
Ethics statement
This work is conducted as part of a research project. While we plan to share the code and findings to promote transparency and reproducibility in research, we currently have no plans to incorporate this work into a commercial product. In all aspects of this research, we are committed to adhering to Microsoft AI principles, including fairness, transparency, and accountability.
 
 
 
©Lei Chu. Last update: 3 Jun, 2025.