DepthMatch: Semi-Supervised RGB-D Scene Parsing through Depth-Guided Regularization

2025-05-26

Jianxin Huang, Jiahang Li, Sergey Vityazev, Alexander Dvorkovich, Rui Fan

Abstract

RGB-D scene parsing methods effectively capture both semantic and geometric features of the environment, demonstrating great potential under challenging conditions such as extreme weather and low lighting. However, existing RGB-D scene parsing methods predominantly rely on supervised training strategies, which require a large amount of manually annotated pixel-level labels that are both time-consuming and costly to produce. To overcome these limitations, we introduce DepthMatch, a semi-supervised learning framework specifically designed for RGB-D scene parsing. To make full use of unlabeled data, we propose a complementary patch mix-up augmentation that explores the latent relationships between texture and spatial features in RGB-D image pairs. We also design a lightweight spatial prior injector to replace traditional complex fusion modules, improving the efficiency of heterogeneous feature fusion. Furthermore, we introduce a depth-guided boundary loss to enhance the model's boundary prediction capabilities. Experimental results demonstrate that DepthMatch is applicable to both indoor and outdoor scenes, achieving state-of-the-art results on the NYUv2 dataset and ranking first on the KITTI Semantics benchmark.
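
The abstract names a complementary patch mix-up augmentation but does not spell out the mixing rule. The sketch below is one hypothetical reading, not the authors' implementation: a random patch-wise binary mask mixes two RGB-D samples, and its complement produces a second mixed view, with RGB and depth always cut by the same mask so the two modalities stay spatially aligned. The function names, the patch size, and the 50% selection probability are all assumptions made for illustration.

```python
import random


def patch_mask(height, width, patch, rng):
    """Boolean mask (list of rows) that is constant inside each patch x patch cell."""
    rows = (height + patch - 1) // patch
    cols = (width + patch - 1) // patch
    # Assumption: each cell is selected independently with probability 0.5.
    cells = [[rng.random() < 0.5 for _ in range(cols)] for _ in range(rows)]
    return [[cells[y // patch][x // patch] for x in range(width)]
            for y in range(height)]


def mix(a, b, mask):
    """Take the pixel from `a` where the mask is True, from `b` elsewhere."""
    return [[pa if m else pb for pa, pb, m in zip(row_a, row_b, row_m)]
            for row_a, row_b, row_m in zip(a, b, mask)]


def complementary_patch_mix(rgb_a, depth_a, rgb_b, depth_b, patch=2, seed=0):
    """Hypothetical sketch: build two mixed RGB-D views from complementary masks."""
    rng = random.Random(seed)
    height, width = len(rgb_a), len(rgb_a[0])
    mask = patch_mask(height, width, patch, rng)
    inverse = [[not m for m in row] for row in mask]
    # The same mask cuts RGB and depth, so each view stays pixel-aligned.
    view_1 = (mix(rgb_a, rgb_b, mask), mix(depth_a, depth_b, mask))
    view_2 = (mix(rgb_a, rgb_b, inverse), mix(depth_a, depth_b, inverse))
    return view_1, view_2, mask
```

Images are plain nested lists here to keep the sketch dependency-free; in a real training pipeline the same masking would normally be applied to tensors, and the pseudo-labels of the two samples would be mixed with the same masks.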

Tasks

Scene Parsing, Semantic Segmentation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
NYU-Depth V2	DepthMatch (DINOv2-S)	Mean IoU	61.4	—	Unverified
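
For readers unfamiliar with the metric in the table, Mean IoU averages the per-class intersection-over-union between predicted and ground-truth labels; the claimed 61.4 is presumably that average expressed as a percentage. The following is a minimal sketch of the standard definition, not the benchmark's evaluation script: the function name is hypothetical, and the `ignore_index` value of 255 is an assumed convention for unlabeled pixels.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    `pred` and `target` are equal-length flat sequences of integer class ids.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:  # assumed convention: skip unlabeled pixels
            continue
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            # A mismatch adds to the union of both classes involved.
            union[t] += 1
            if 0 <= p < num_classes:
                union[p] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Benchmark evaluations usually accumulate the intersection and union counts over the whole test set before dividing, rather than averaging per-image scores, so the two can differ.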
