SOTAVerified

Panoptic Segmentation

Panoptic segmentation is a computer vision task that combines semantic segmentation and instance segmentation to provide a comprehensive understanding of a scene. The goal is to partition the image into semantically meaningful regions while also detecting and distinguishing individual object instances within those regions. Every pixel is assigned a semantic label; pixels belonging to "things" classes (countable objects with distinct instances, such as cars and people) additionally receive unique instance IDs, while "stuff" classes (amorphous regions such as sky and road) carry only a semantic label. (Image credit: Detectron2)
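The per-pixel labeling described above is commonly stored as a single panoptic map by packing the semantic label and the instance ID into one integer per pixel. A minimal sketch, assuming NumPy and the widely used label-divisor convention (the divisor value of 1000 here is illustrative, not prescribed by this page):

```python
import numpy as np

LABEL_DIVISOR = 1000  # illustrative; large enough to exceed the max instance count

def encode_panoptic(semantic: np.ndarray, instance: np.ndarray) -> np.ndarray:
    """Pack a semantic map and an instance map into one panoptic map.

    "Stuff" pixels (instance ID 0) keep only their semantic label;
    "things" pixels additionally carry a per-object instance ID.
    """
    return semantic.astype(np.int64) * LABEL_DIVISOR + instance.astype(np.int64)

def decode_panoptic(panoptic: np.ndarray):
    """Recover (semantic, instance) maps from a packed panoptic map."""
    return panoptic // LABEL_DIVISOR, panoptic % LABEL_DIVISOR

# Toy 2x2 image: class 1 is "stuff" (e.g. sky), class 2 is a "thing" (e.g. car).
semantic = np.array([[1, 1], [2, 2]])
instance = np.array([[0, 0], [1, 2]])  # two distinct cars
pan = encode_panoptic(semantic, instance)
sem, inst = decode_panoptic(pan)
assert (sem == semantic).all() and (inst == instance).all()
```

The packed form makes it cheap to look up both "which class" and "which object" for any pixel with two integer operations.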

Papers

Showing 1–50 of 462 papers (page 1 of 10)

| Title | Status | Hype |
|---|---|---|
| OMG-Seg: Is One Model Good Enough For All Segmentation? | Code | 5 |
| Faster Segment Anything: Towards Lightweight SAM for Mobile Applications | Code | 5 |
| Detectron2 Object Detection & Manipulating Images using Cartoonization | Code | 4 |
| Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering | Code | 4 |
| SegGPT: Segmenting Everything In Context | Code | 4 |
| Panoptic Feature Pyramid Networks | Code | 4 |
| Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | Code | 4 |
| Visual Attention Network | Code | 4 |
| 4D Panoptic Scene Graph Generation | Code | 3 |
| A Simple Framework for Open-Vocabulary Segmentation and Detection | Code | 3 |
| Generalized Decoding for Pixel, Image, and Language | Code | 3 |
| ResNeSt: Split-Attention Networks | Code | 3 |
| Vision Transformer Adapter for Dense Predictions | Code | 3 |
| Tracking Anything with Decoupled Video Segmentation | Code | 3 |
| OneFormer: One Transformer to Rule Universal Image Segmentation | Code | 3 |
| PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | Code | 3 |
| RAP-SAM: Towards Real-Time All-Purpose Segment Anything | Code | 3 |
| Aligning and Prompting Everything All at Once for Universal Visual Perception | Code | 2 |
| DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution | Code | 2 |
| Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration | Code | 2 |
| PVO: Panoptic Visual Odometry | Code | 2 |
| PosSAM: Panoptic Open-vocabulary Segment Anything | Code | 2 |
| PEM: Prototype-based Efficient MaskFormer for Image Segmentation | Code | 2 |
| Per-Pixel Classification is Not All You Need for Semantic Segmentation | Code | 2 |
| Scalable SoftGroup for 3D Instance Segmentation on Point Clouds | Code | 2 |
| Open-World Entity Segmentation | Code | 2 |
| CellViT: Vision Transformers for Precise Cell Segmentation and Classification | Code | 2 |
| SAD: Segment Any RGBD | Code | 2 |
| Context-Aware Video Instance Segmentation | Code | 2 |
| Scene-Centric Unsupervised Panoptic Segmentation | Code | 2 |
| A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting | Code | 2 |
| Mask2Former for Video Instance Segmentation | Code | 2 |
| OneFormer3D: One Transformer for Unified Point Cloud Segmentation | Code | 2 |
| Panoptic Lifting for 3D Scene Understanding with Neural Fields | Code | 2 |
| HyperSeg: Towards Universal Visual Segmentation with Large Language Model | Code | 2 |
| Better Call SAL: Towards Learning to Segment Anything in Lidar | Code | 2 |
| Image Segmentation in Foundation Model Era: A Survey | Code | 2 |
| Hierarchical Multi-Scale Attention for Semantic Segmentation | Code | 2 |
| Hierarchical Open-vocabulary Universal Image Segmentation | Code | 2 |
| Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model | Code | 2 |
| BatchFormerV2: Exploring Sample Relationships for Dense Representation Learning | Code | 2 |
| Masked-attention Mask Transformer for Universal Image Segmentation | Code | 2 |
| Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation | Code | 2 |
| A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future | Code | 2 |
| Dilated Neighborhood Attention Transformer | Code | 2 |
| Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models | Code | 2 |
| 1st Place Solution for PSG competition with ECCV'22 SenseHuman Workshop | Code | 2 |
| ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning | Code | 2 |
| CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction | Code | 2 |
| Focal Modulation Networks | Code | 2 |

Benchmark Results

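The metric reported throughout the tables below is Panoptic Quality (PQ). Predicted and ground-truth segments of the same class are matched when their IoU exceeds 0.5; PQ is then the mean IoU over matched pairs (segmentation quality, SQ) multiplied by an F1-style recognition quality (RQ) over true positives, false positives, and false negatives. A minimal sketch of the formula, given a precomputed matching:

```python
def panoptic_quality(matched_ious, num_fp, num_fn):
    """PQ = SQ * RQ for one class.

    matched_ious : IoU values of matched (TP) segment pairs, each > 0.5
    num_fp       : unmatched predicted segments (false positives)
    num_fn       : unmatched ground-truth segments (false negatives)
    """
    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        return 0.0  # class absent from both prediction and ground truth
    sq = sum(matched_ious) / tp if tp else 0.0  # segmentation quality
    rq = tp / denom                             # recognition quality
    return sq * rq

# Toy example: two matches with IoUs 0.8 and 0.6, one FP, one FN.
print(round(panoptic_quality([0.8, 0.6], 1, 1), 3))  # 0.467
```

Leaderboard numbers average this per-class PQ over all classes; some tables report things-only (PQth) or stuff-only (PQst) variants.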
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Mask DINO (single-scale) | PQ | 59.5 | | Unverified |
| 2 | kMaX-DeepLab (single-scale) | PQ | 58.5 | | Unverified |
| 3 | Mask2Former (Swin-L) | PQ | 58.3 | | Unverified |
| 4 | Panoptic SegFormer (Swin-L) | PQ | 56.2 | | Unverified |
| 5 | Panoptic SegFormer (PVTv2-B5) | PQ | 55.8 | | Unverified |
| 6 | CMT-DeepLab (single-scale) | PQ | 55.7 | | Unverified |
| 7 | K-Net (Swin-L) | PQ | 55.2 | | Unverified |
| 8 | MaskConver (ResNet50, single-scale) | PQ | 53.6 | | Unverified |
| 9 | MaskFormer (Swin-L) | PQ | 53.3 | | Unverified |
| 10 | Panoptic FCN* (Swin-L) | PQ | 52.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | HyperSeg (Swin-B) | PQ | 61.2 | | Unverified |
| 2 | OneFormer (InternImage-H, single-scale) | PQ | 60 | | Unverified |
| 3 | UMG-CLIP-E/14 | PQ | 59.5 | | Unverified |
| 4 | OpenSeeD (SwinL, single-scale) | PQ | 59.5 | | Unverified |
| 5 | Mask DINO (SwinL, single-scale) | PQ | 59.4 | | Unverified |
| 6 | EoMT (DINOv2-g, single-scale, 1280x1280) | PQ | 59.2 | | Unverified |
| 7 | UMG-CLIP-L/14 | PQ | 58.9 | | Unverified |
| 8 | Panoptic FCN* (Swin-L, single-scale) | PQth | 58.5 | | Unverified |
| 9 | DiNAT-L (single-scale, Mask2Former) | PQ | 58.5 | | Unverified |
| 10 | ViT-Adapter-L (single-scale, BEiTv2 pretrain, Mask2Former) | PQ | 58.4 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OneFormer (DiNAT-L, single-scale) | PQ | 46.7 | | Unverified |
| 2 | OneFormer (ConvNeXt-L, single-scale) | PQ | 46.4 | | Unverified |
| 3 | Panoptic FCN* (Swin-L, single-scale) | PQ | 45.7 | | Unverified |
| 4 | Panoptic-DeepLab (SWideRNet-(1, 1, 4.5), multi-scale) | PQ | 44.8 | | Unverified |
| 5 | Panoptic FCN* (ResNet-50-FPN) | PQst | 42.3 | | Unverified |
| 6 | Mask2Former + Intra-Batch Supervision (ResNet-50) | PQ | 42.2 | | Unverified |
| 7 | Axial-DeepLab-L (multi-scale) | PQ | 41.1 | | Unverified |
| 8 | EfficientPS | PQ | 40.6 | | Unverified |
| 9 | Panoptic-DeepLab (X71) | PQ | 40.5 | | Unverified |
| 10 | AdaptIS (ResNeXt-101) | PQ | 40.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OneFormer (ConvNeXt-L, single-scale, Mapillary Vistas-pretrained) | PQ | 68 | | Unverified |
| 2 | Panoptic-DeepLab (SWideRNet [1, 1, 4.5], Mapillary, multi-scale) | PQ | 67.8 | | Unverified |
| 3 | EfficientPS | PQ | 67.1 | | Unverified |
| 4 | Axial-DeepLab-XL (Mapillary Vistas, multi-scale) | PQ | 66.6 | | Unverified |
| 5 | kMaX-DeepLab (single-scale) | PQ | 66.2 | | Unverified |
| 6 | Panoptic-DeepLab | PQ | 65.5 | | Unverified |
| 7 | EfficientPS (Cityscapes-fine) | PQ | 62.9 | | Unverified |
| 8 | COPS (ResNet-50) | PQ | 60 | | Unverified |
| 9 | SOGNet (ResNet-50) | PQ | 60 | | Unverified |
| 10 | Dynamically Instantiated Network | PQ | 55.4 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Mask2Former (Swin-B) | PQ | 41.7 | | Unverified |
| 2 | Panoptic FPN (ResNet-50) | PQ | 40.1 | | Unverified |
| 3 | Mask2Former (Swin-T) | PQ | 39.2 | | Unverified |
| 4 | Panoptic FPN (ResNet-101) | PQ | 38.7 | | Unverified |
| 5 | Mask2Former (ResNet-50) | PQ | 37.6 | | Unverified |
| 6 | Mask2Former (ResNet-101) | PQ | 37.2 | | Unverified |
| 7 | Panoptic-DeepLab (ResNet-50) | PQ | 34.7 | | Unverified |
| 8 | MaX-DeepLab | PQ | 31.9 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SuperCluster | PQ | 50.1 | | Unverified |
| 2 | PointGroup (Xiang 2023) | PQ | 42.3 | | Unverified |
| 3 | KPConv (Xiang 2023) | PQ | 41.8 | | Unverified |
| 4 | MinkowskiNet (Xiang 2023) | PQ | 39.2 | | Unverified |
| 5 | PointNet++ (Xiang 2023) | PQ | 24.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OneFormer3D | PQ | 71.2 | | Unverified |
| 2 | PanopticNDT (10cm) | PQ | 59.19 | | Unverified |
| 3 | SuperCluster | PQ | 58.7 | | Unverified |
| 4 | PanopticFusion (with CRF) | PQ | 33.5 | | Unverified |
| 5 | SceneGraphFusion (NN mapping) | PQ | 31.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | EfficientPS | PQ | 51.1 | | Unverified |
| 2 | Seamless | PQ | 48.5 | | Unverified |
| 3 | UPSNet | PQ | 47.1 | | Unverified |
| 4 | Panoptic FPN | PQ | 46.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | EfficientPS | PQ | 43.7 | | Unverified |
| 2 | Seamless | PQ | 42.2 | | Unverified |
| 3 | UPSNet | PQ | 39.9 | | Unverified |
| 4 | Panoptic FPN | PQ | 39.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LKCell | PQ | 50.8 | | Unverified |
| 2 | CellViT-SAM-H | PQ | 50.62 | | Unverified |
| 3 | TSFD | PQ | 50.4 | | Unverified |
| 4 | NuLite-H | PQ | 49.81 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OneFormer3D | PQ | 71.2 | | Unverified |
| 2 | SuperCluster | PQ | 58.7 | | Unverified |
| 3 | PanopticFusion | PQ | 33.5 | | Unverified |
| 4 | SceneGraphFusion | PQ | 31.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Exchanger + Mask2Former | PQ | 52.6 | | Unverified |
| 2 | Exchanger + Unet + PaPs | PQ | 47.8 | | Unverified |
| 3 | U-TAE + PaPs | PQ | 40.4 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VAN-B6* | PQ | 58.2 | | Unverified |
| 2 | PFPN (ideal number of groups) | PQ | 42.15 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CAFuser (Swin-T) | PQ | 59.7 | | Unverified |
| 2 | MUSES (Mask2Former w/ 4x Swin-T) | PQ | 53.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | EMSANet (2x ResNet-34 NBt1D, PanopticNDT version, finetuned) | PQ | 51.15 | | Unverified |
| 2 | EMSANet | PQ | 47.38 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | P3Former | PQ | 0.65 | | Unverified |
| 2 | DS-Net | PQ | 0.56 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MasQCLIP | PQ | 23.3 | | Unverified |