OneFormer3D: One Transformer for Unified Point Cloud Segmentation

2023-11-24 · CVPR 2024 · Code Available

Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich

Code

github.com/oneformer3d/oneformer3d
PyTorch · ★ 593

Abstract

Semantic, instance, and panoptic segmentation of 3D point clouds have been addressed using task-specific models of distinct design. Thereby, the similarity of all segmentation tasks and the implicit relationship between them have not been utilized effectively. This paper presents a unified, simple, and effective model addressing all these tasks jointly. The model, named OneFormer3D, performs instance and semantic segmentation consistently, using a group of learnable kernels, where each kernel is responsible for generating a mask for either an instance or a semantic category. These kernels are trained with a transformer-based decoder with unified instance and semantic queries passed as an input. Such a design enables training a model end-to-end in a single run, so that it achieves top performance on all three segmentation tasks simultaneously. Specifically, our OneFormer3D ranks 1st and sets a new state-of-the-art (+2.1 mAP50) in the ScanNet test leaderboard. We also demonstrate the state-of-the-art results in semantic, instance, and panoptic segmentation of ScanNet (+21 PQ), ScanNet200 (+3.8 mAP50), and S3DIS (+0.8 mIoU) datasets.
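The kernel idea in the abstract can be illustrated with a toy sketch. It assumes, as is common in kernel-based mask prediction but not stated in the abstract, that each kernel is a vector scored against per-point features by a dot product followed by a sigmoid and a 0.5 threshold. The function name `predict_masks`, the feature values, and the kernel values are all hypothetical; this is not the authors' implementation, in which the kernels come out of a transformer decoder.

```python
import math

def predict_masks(kernels, point_features):
    """Return one binary mask (list of bools over points) per kernel.

    Assumption: mask logit = dot(kernel, point feature), then sigmoid > 0.5.
    """
    masks = []
    for kernel in kernels:
        mask = []
        for feat in point_features:
            logit = sum(k * f for k, f in zip(kernel, feat))
            prob = 1.0 / (1.0 + math.exp(-logit))
            mask.append(prob > 0.5)
        masks.append(mask)
    return masks

# Toy point cloud: 4 points with made-up 2-D features.
point_features = [[2.0, 0.0], [1.5, -0.5], [0.0, 2.0], [-1.0, 1.5]]

# One shared set of kernels: the first two play the role of instance
# kernels, the last one the role of a semantic-category kernel.
kernels = [[1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]]

masks = predict_masks(kernels, point_features)
for name, mask in zip(["instance A", "instance B", "semantic class"], masks):
    print(name, mask)
```

In this toy setup the two instance kernels split the points into two objects, while the semantic kernel covers all points of the shared category, which mirrors how one group of kernels can serve both tasks.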

Tasks

3D Instance Segmentation, 3D Object Detection, 3D Semantic Segmentation, Decoder, Panoptic Segmentation, Point Cloud Segmentation, Segmentation, Semantic Segmentation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
S3DIS	OneFormer3D	AP@50	75.8	—	Unverified
ScanNetV2	OneFormer3D	mAP @ 50	80.1	—	Unverified
