SOTAVerified

Depth Estimation

Depth Estimation is the task of estimating, for each pixel, the distance from the camera to the scene point it depicts. Depth is extracted from either monocular (single-view) or stereo (multiple views of a scene) images. Traditional methods use multi-view geometry to relate the images to one another. Newer methods estimate depth directly by minimizing a regression loss, or by learning to generate a novel view from a sequence. The most popular benchmarks are KITTI and NYUv2. Models are typically evaluated with a root-mean-square (RMS) error metric.

Source: DIODE: A Dense Indoor and Outdoor DEpth Dataset
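
As an illustration of the multi-view geometry mentioned above, here is a minimal sketch of the standard triangulation relation for a rectified stereo pair, depth = focal length × baseline / disparity. It assumes a pinhole camera, a focal length in pixels, and a baseline in metres; the function name `disparity_to_depth` and the example values are hypothetical and not taken from any listed paper.

```python
def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Convert per-pixel disparities (pixels) to depths (metres).

    Assumes a rectified stereo pair and a pinhole camera model.
    Pixels with non-positive disparity have no valid depth and map to None.
    """
    depths = []
    for d in disparity_px:
        if d > 0:
            # Similar triangles: Z = f * B / d
            depths.append(focal_length_px * baseline_m / d)
        else:
            depths.append(None)
    return depths


# Hypothetical camera: 700 px focal length, 0.5 m baseline.
example_depths = disparity_to_depth(
    [50.0, 25.0, 0.0], focal_length_px=700.0, baseline_m=0.5
)
```

Halving the disparity doubles the estimated depth, which is why small disparity errors matter most for distant objects.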

Papers

Showing 1-50 of 2454 papers

Title | Status | Hype
π^3: Scalable Permutation-Equivariant Visual Geometry Learning | | 0
S^2M^2: Scalable Stereo Matching Model for Reliable Depth Estimation | | 0
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios | | 0
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation | Code | 0
MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network | Code | 1
Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation | | 0
Cameras as Relative Positional Encoding | | 0
ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way | | 0
LighthouseGS: Indoor Structure-aware 3D Gaussian Splatting for Panorama-Style Mobile Captures | | 0
Beyond Appearance: Geometric Cues for Robust Video Instance Segmentation | | 0
VOTE: Vision-Language-Action Optimization with Trajectory Ensemble Voting | Code | 1
From Pixels to Damage Severity: Estimating Earthquake Impacts Using Semantic Segmentation of Social Media Images | | 0
RobuSTereo: Robust Zero-Shot Stereo Matching under Adverse Weather | | 0
Underwater Monocular Metric Depth Estimation: Real-World Benchmarks and Synthetic Fine-Tuning | | 0
RoboScape: Physics-informed Embodied World Model | Code | 0
ThermalDiffusion: Visual-to-Thermal Image-to-Image Translation for Autonomous Navigation | | 0
THIRDEYE: Cue-Aware Monocular Depth Estimation via Brain-Inspired Multi-Stage Fusion | | 0
StereoDiff: Stereo-Diffusion Synergy for Video Depth Estimation | | 0
Look to Locate: Vision-Based Multisensory Navigation with 3-D Digital Maps for GNSS-Challenged Environments | | 0
BulletGen: Improving 4D Reconstruction with Bullet-Time Generation | | 0
Monocular One-Shot Metric-Depth Alignment for RGB-Based Robot Grasping | | 0
DreamCube: 3D Panorama Generation via Multi-plane Synchronization | | 0
EndoMUST: Monocular Depth Estimation for Robotic Endoscopy via End-to-end Multi-step Self-supervised Training | Code | 1
RaCalNet: Radar Calibration Network for Sparse-Supervised Metric Depth Estimation | | 0
DiFuse-Net: RGB and Dual-Pixel Depth Estimation using Window Bi-directional Parallax Attention and Cross-modal Transfer Learning | | 0
TR2M: Transferring Monocular Relative Depth to Metric Depth with Language Descriptions and Scale-Oriented Contrast | Code | 1
Test3R: Learning to Reconstruct 3D at Test Time | Code | 2
Self-Supervised Enhancement for Depth from a Lightweight ToF Sensor with Monocular Images | Code | 1
Leveraging 6DoF Pose Foundation Models For Mapping Marine Sediment Burial | Code | 0
DCIRNet: Depth Completion with Iterative Refinement for Dexterous Grasping of Transparent and Reflective Objects | | 0
EgoM2P: Egocentric Multimodal Multitask Pretraining | | 0
Flow-Anything: Learning Real-World Optical Flow Estimation from Large-Scale Single-view Images | | 0
Hidden in plain sight: VLMs overlook their visual representations | | 0
Jamais Vu: Exposing the Generalization Gap in Supervised Semantic Correspondence | | 0
Dark Channel-Assisted Depth-from-Defocus from a Single Image | | 0
Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration | | 0
Aerial Multi-View Stereo via Adaptive Depth Range Inference and Normal Cues | | 0
Structure-Aware Radar-Camera Depth Estimation | | 0
Toward Better SSIM Loss for Unsupervised Monocular Depth Estimation | | 0
Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation | | 0
Attacking Attention of Foundation Models Disrupts Downstream Tasks | Code | 0
Harnessing Foundation Models for Robust and Generalizable 6-DOF Bronchoscopy Localization | | 0
Ultrafast High-Flux Single-Photon LiDAR Simulator via Neural Mapping | | 0
Bridging Geometric and Semantic Foundation Models for Generalized Monocular Depth Estimation | | 0
GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion | | 0
Spatial RoboGrasp: Generalized Robotic Grasping Control Policy | | 0
SpikeStereoNet: A Brain-Inspired Framework for Stereo Depth Estimation from Spike Streams | | 0
From Single Images to Motion Policies via Video-Generation Environment Representations | | 0
EvidenceMoE: A Physics-Guided Mixture-of-Experts with Evidential Critics for Advancing Fluorescence Light Detection and Ranging in Scattering Media | | 0
BadDepth: Backdoor Attacks Against Monocular Depth Estimation in the Physical World | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | OmniDepth | RMSE | 0.62 | | Unverified
2 | SphereDepth | RMSE | 0.45 | | Unverified
3 | Jin et al. | RMSE | 0.42 | | Unverified
4 | BiFuse with fusion | RMSE | 0.41 | | Unverified
5 | HoHoNet (ResNet-101) | RMSE | 0.38 | | Unverified
6 | PanoDepth | RMSE | 0.37 | | Unverified
7 | BiFuse++ | RMSE | 0.37 | | Unverified
8 | UniFuse with fusion | RMSE | 0.37 | | Unverified
9 | DisConv | RMSE | 0.37 | | Unverified
10 | SliceNet | RMSE | 0.37 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | A2J | mAP | 8.61 | | Unverified
2 | PAD-Net | RMS | 0.79 | | Unverified
3 | MS-CRF | RMS | 0.59 | | Unverified
4 | DORN | RMS | 0.51 | | Unverified
5 | Freeform | RMS | 0.43 | | Unverified
6 | Optimized, freeform | RMS | 0.43 | | Unverified
7 | VNL | RMS | 0.42 | | Unverified
8 | BTS | RMS | 0.41 | | Unverified
9 | TransDepth (AGD+ ViT) | RMS | 0.37 | | Unverified
10 | AdaBins | RMS | 0.36 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | T2Net | Abs Rel | 0.35 | | Unverified
2 | MIDAS | Abs Rel | 0.31 | | Unverified
3 | Bhattacharjee et al. | Abs Rel | 0.25 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | T2Net | Abs Rel | 0.49 | | Unverified
2 | MIDAS | Abs Rel | 0.42 | | Unverified
3 | Bhattacharjee et al. | Abs Rel | 0.38 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LeReS | absolute relative error | 0.1 | | Unverified
2 | DELTAS | absolute relative error | 0.09 | | Unverified
3 | Distill Any Depth | absolute relative error | 0.04 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SDC-Depth | RMSE | 6.92 | | Unverified
2 | SwinMTL | RMSE | 6.35 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | AIP-Brown | Delta < 1.25 | 0.36 | | Unverified
2 | LeRes | Delta < 1.25 | 0.23 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | H-Net (Ours) | Absolute relative error (AbsRel) | 0.09 | | Unverified
2 | H-Net (Ours) Full Eigen | Absolute relative error (AbsRel) | 0.08 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GLPDepth | Delta < 1.25 | 0.43 | | Unverified
2 | SRDINET (Model A) | Delta < 1.25 | 0.4 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Atlas (finetuned) | RMSE | 0.17 | | Unverified
2 | Atlas (plain) | RMSE | 0.17 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LFattNet | BadPix(0.01) | 17.23 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LightDepth | Number of parameters (M) | 42.6 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | UniFuse | Abs Rel | 0.11 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | X-TC (Cross-Task Consistency) | L1 error | 1.63 | | Unverified
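
For readers unfamiliar with the metrics in the tables above, the following is a minimal sketch of how RMSE, Abs Rel, and Delta < 1.25 are commonly computed from predicted and ground-truth depths. It is not the evaluation code of any listed benchmark: the function name `depth_metrics`, the rule that only pixels with positive ground truth and positive prediction are scored, and the example values are all assumptions made for illustration.

```python
import math


def depth_metrics(pred, gt, threshold=1.25):
    """Compute common depth-estimation metrics over flat lists of depths.

    Assumption: only pixels where both ground truth and prediction are
    positive are scored (individual benchmarks define their own masks).
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]
    if not pairs:
        raise ValueError("no valid pixels to evaluate")
    n = len(pairs)

    # Root mean squared error, in the same unit as the depths.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)

    # Absolute relative error: mean of |pred - gt| / gt.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n

    # Threshold accuracy: fraction of pixels with max(pred/gt, gt/pred) < threshold.
    delta = sum(1 for p, g in pairs if max(p / g, g / p) < threshold) / n

    return {"rmse": rmse, "abs_rel": abs_rel, "delta": delta}


# Hypothetical depths in metres; the last ground-truth value is invalid (0).
metrics = depth_metrics([2.0, 4.5, 10.0, 3.0], [2.0, 5.0, 8.0, 0.0])
```

RMSE and Abs Rel are errors, so lower is better, while Delta < 1.25 is an accuracy, so higher is better.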