Survey on Monocular Metric Depth Estimation

2025-01-21

Jiuling Zhang

Abstract

Monocular Depth Estimation (MDE) is fundamental to computer vision, enabling spatial understanding, 3D reconstruction, and autonomous driving. Deep learning-based MDE predicts relative depth from a single image, but the lack of metric scale introduces inconsistencies, limiting applicability in tasks such as visual SLAM, 3D reconstruction, and novel view synthesis. Monocular Metric Depth Estimation (MMDE) overcomes this limitation by enabling precise scene-scale inference, which improves depth consistency, enhances stability in sequential tasks, and streamlines integration into practical systems. This paper systematically reviews the evolution of depth estimation, from traditional geometric methods to deep learning breakthroughs, emphasizing scale-agnostic approaches in zero-shot generalization, which is crucial for advancing MMDE. Recent progress in zero-shot MMDE is examined, focusing on challenges such as model generalization and boundary detail loss. To address these issues, researchers have explored unlabeled data augmentation, image patching, architectural optimization, and generative techniques. This review analyzes these developments, assessing their impact and limitations. Key findings are synthesized, unresolved challenges are outlined, and future research directions are proposed. By providing a clear technical roadmap and insight into emerging trends, this work aims to drive innovation and expand the real-world applications of MMDE.
