MonoScene: Monocular 3D Semantic Scene Completion

2021-12-01 · CVPR 2022 · Code available

Anh-Quan Cao, Raoul de Charette


Code

github.com/cv-rits/MonoScene
Official · In paper · PyTorch · ★ 800
github.com/astra-vision/monoscene
PyTorch · ★ 799

Abstract

MonoScene proposes a 3D Semantic Scene Completion (SSC) framework in which the dense geometry and semantics of a scene are inferred from a single monocular RGB image. Unlike the SSC literature, which relies on 2.5D or 3D input, we solve the complex problem of 2D-to-3D scene reconstruction while jointly inferring its semantics. Our framework relies on successive 2D and 3D UNets bridged by a novel 2D-3D feature projection inspired by optics, and introduces a 3D context relation prior to enforce spatio-semantic consistency. Along with architectural contributions, we introduce novel global scene and local frustum losses. Experiments show that we outperform the literature on all metrics and datasets while hallucinating plausible scenery even beyond the camera field of view. Our code and trained models are available at https://github.com/cv-rits/MonoScene.
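To make the idea of bridging 2D and 3D features concrete, here is a minimal sketch of one plausible reading of such a projection: each 3D voxel center is projected into the image with a pinhole camera model and takes the 2D feature found at that pixel. This is an illustration under stated assumptions, not the authors' implementation. The function names `project_voxels` and `lift_features`, the intrinsics `fx, fy, cx, cy`, the nearest-neighbour sampling, and the zero fill value are all hypothetical choices made for this example.

```python
def project_voxels(centers, fx, fy, cx, cy, width, height):
    """Map 3D voxel centers (camera frame, z forward) to pixel coordinates.

    Returns one (u, v) tuple per voxel, or None when the voxel lies behind
    the camera or projects outside the image. Assumes a pinhole camera.
    """
    pixels = []
    for x, y, z in centers:
        if z <= 0:
            pixels.append(None)  # behind the camera
            continue
        u = fx * x / z + cx
        v = fy * y / z + cy
        inside = 0 <= u < width and 0 <= v < height
        pixels.append((u, v) if inside else None)
    return pixels


def lift_features(feature_map, pixels, fill=0.0):
    """Give each voxel the 2D feature at its projected pixel (nearest neighbour)."""
    lifted = []
    for p in pixels:
        if p is None:
            lifted.append(fill)  # no image evidence for this voxel
        else:
            u, v = p
            lifted.append(feature_map[int(v)][int(u)])
    return lifted


# Toy 2x4 single-channel feature map (rows are image rows).
feature_map = [[0.1, 0.2, 0.3, 0.4],
               [0.5, 0.6, 0.7, 0.8]]
centers = [(0, 0, 1), (0, 0, 2), (-5, 0, 1), (0, 0, -1)]
pixels = project_voxels(centers, fx=1, fy=1, cx=2, cy=1, width=4, height=2)
lifted = lift_features(feature_map, pixels)
print(lifted)  # [0.7, 0.7, 0.0, 0.0]
```

In this toy example the first two voxels lie on the same line of sight and therefore receive the same image feature, while voxels outside the field of view receive only the fill value. In a setup like this, any content predicted for those voxels would have to come from the 3D network's learned context rather than from direct image evidence.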

Tasks

3D Reconstruction, 3D Scene Reconstruction, 3D Semantic Scene Completion, 3D Semantic Scene Completion from a single RGB image, Single-View 3D Reconstruction

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
KITTI-360	MonoScene	mIoU	12.31	—	Unverified
NYUv2	MonoScene (RGB input only)	mIoU	26.94	—	Unverified
SemanticKITTI	MonoScene (RGB input only)	mIoU	11.08	—	Unverified
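For readers unfamiliar with the metric in the table, the following is a minimal sketch of how mean intersection-over-union (mIoU) can be computed over flattened voxel labels. It assumes the reported values are percentages and uses a hypothetical `ignore_label` of 255 for unlabeled voxels. Each benchmark defines its own class list and evaluation protocol, so this is illustrative rather than the official evaluation code.

```python
from collections import Counter


def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over classes that occur in the prediction or the ground truth."""
    intersection = Counter()
    pred_count = Counter()
    target_count = Counter()
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # unlabeled voxels do not contribute
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Six voxels, three classes; the last voxel is unlabeled and ignored.
pred = [0, 1, 1, 2, 2, 2]
target = [0, 1, 2, 2, 2, 255]
print(round(100 * mean_iou(pred, target, num_classes=3), 2))  # 72.22
```

Here the per-class IoUs are 1, 1/2, and 2/3, so the mean is 13/18, or about 72.22 when expressed as a percentage.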
