Panoptic Segmentation
Panoptic segmentation is a computer vision task that combines semantic segmentation and instance segmentation into a single, unified labeling of a scene. Every pixel in an image is assigned a semantic label, and pixels belonging to "thing" classes (countable objects such as cars and people) are additionally assigned a unique instance ID, so that individual objects of the same class can be told apart. Pixels belonging to "stuff" classes (amorphous regions such as sky or road) receive only a semantic label.
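The labeling described above can be sketched as merging a semantic map and an instance map into one panoptic map. This is a minimal illustration, not the format of any particular dataset: the class ids, the set `THING_CLASSES`, and the `class * 1000 + instance` encoding controlled by `LABEL_DIVISOR` are all assumptions chosen for the example.

```python
# Hypothetical class ids: 0 = road (stuff), 1 = car (thing), 2 = person (thing).
THING_CLASSES = {1, 2}
# Assumed encoding for this sketch: panoptic_id = class_id * LABEL_DIVISOR + instance_id.
LABEL_DIVISOR = 1000


def to_panoptic(semantic, instance):
    """Merge per-pixel semantic and instance maps (nested lists) into one panoptic map."""
    panoptic = []
    for sem_row, inst_row in zip(semantic, instance):
        row = []
        for cls, inst in zip(sem_row, inst_row):
            # Stuff pixels carry no instance identity, so their instance part is 0.
            inst_id = inst if cls in THING_CLASSES else 0
            row.append(cls * LABEL_DIVISOR + inst_id)
        panoptic.append(row)
    return panoptic


# A 2x4 toy image: two cars (instances 1 and 2) and one person on a road.
semantic = [[0, 0, 1, 1],
            [0, 1, 1, 2]]
instance = [[0, 0, 1, 2],
            [0, 1, 2, 1]]

panoptic = to_panoptic(semantic, instance)
for row in panoptic:
    print(row)
```

Each pixel ends up with a single integer from which both the class (`value // LABEL_DIVISOR`) and the instance (`value % LABEL_DIVISOR`) can be recovered, which is one common way to store the two labels together.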
Datasets: COCO test-dev, Cityscapes val, COCO minival, ADE20K val, Mapillary val, Cityscapes test, LaRS, S3DIS Area5, ScanNetV2, Indian Driving Dataset, KITTI Panoptic Segmentation, PanNuke
Benchmark Results
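The tables below report PQ (Panoptic Quality), with PQth and PQst denoting PQ restricted to thing and stuff classes respectively. As a reading aid, here is a minimal sketch of the standard per-class definition: a predicted and a ground-truth segment match when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by |TP| + 0.5 |FP| + 0.5 |FN|. The function name and the representation of segments as sets of pixel coordinates are assumptions for this example, and it omits details of real evaluation tools such as void regions and crowd annotations.

```python
def panoptic_quality(pred_segments, gt_segments):
    """PQ for a single class; each segment is a set of (row, col) pixel coordinates."""
    matched_gt = set()
    iou_sum = 0.0
    true_pos = 0
    for pred in pred_segments:
        for j, gt in enumerate(gt_segments):
            if j in matched_gt:
                continue
            union = len(pred | gt)
            iou = len(pred & gt) / union if union else 0.0
            # With non-overlapping segments, IoU > 0.5 makes the match unique.
            if iou > 0.5:
                matched_gt.add(j)
                iou_sum += iou
                true_pos += 1
                break
    false_pos = len(pred_segments) - true_pos   # unmatched predictions
    false_neg = len(gt_segments) - true_pos     # unmatched ground-truth segments
    denom = true_pos + 0.5 * false_pos + 0.5 * false_neg
    return iou_sum / denom if denom else 0.0


# Toy example: one good match (IoU 0.75), one spurious prediction, one missed segment.
gt = [{(0, 0), (0, 1), (0, 2), (0, 3)}, {(1, 0), (1, 1)}]
pred = [{(0, 0), (0, 1), (0, 2)}, {(2, 0)}]

pq = panoptic_quality(pred, gt)
print(pq)  # 0.75 / (1 + 0.5 + 0.5) = 0.375
```

The overall score is the average of this quantity over classes. Leaderboards typically show it scaled to a 0 to 100 range, which appears to be the convention in the tables here.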
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Mask DINO (single-scale) | PQ | 59.5 | — | Unverified |
| 2 | kMaX-DeepLab (single-scale) | PQ | 58.5 | — | Unverified |
| 3 | Mask2Former (Swin-L) | PQ | 58.3 | — | Unverified |
| 4 | Panoptic SegFormer (Swin-L) | PQ | 56.2 | — | Unverified |
| 5 | Panoptic SegFormer (PVTv2-B5) | PQ | 55.8 | — | Unverified |
| 6 | CMT-DeepLab (single-scale) | PQ | 55.7 | — | Unverified |
| 7 | K-Net (Swin-L) | PQ | 55.2 | — | Unverified |
| 8 | MaskConver (ResNet50, single-scale) | PQ | 53.6 | — | Unverified |
| 9 | MaskFormer (Swin-L) | PQ | 53.3 | — | Unverified |
| 10 | Panoptic FCN* (Swin-L) | PQ | 52.7 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ViT-P (OneFormer, InternImage-H) | PQ | 70.8 | — | Unverified |
| 2 | Panoptic FCN* (Swin-L, Cityscapes-fine) | PQst | 70.6 | — | Unverified |
| 3 | OneFormer (ConvNeXt-L, single-scale, 512x1024, Mapillary Vistas-pretrained) | PQ | 70.1 | — | Unverified |
| 4 | Panoptic-DeepLab (SWideRNet [1, 1, 4.5], Mapillary Vistas, multi-scale) | PQ | 69.6 | — | Unverified |
| 5 | OneFormer (ConvNeXt-L, single-scale) | PQ | 68.51 | — | Unverified |
| 6 | Panoptic-DeepLab (SWideRNet [1, 1, 4.5], Mapillary Vistas, single-scale) | PQ | 68.5 | — | Unverified |
| 7 | Axial-DeepLab-XL (Mapillary Vistas, multi-scale) | PQ | 68.5 | — | Unverified |
| 8 | kMaX-DeepLab (single-scale) | PQ | 68.4 | — | Unverified |
| 9 | OneFormer (ConvNeXt-XL, single-scale) | PQ | 68.4 | — | Unverified |
| 10 | AFF-Base (single-scale, point-based Mask2Former) | PQ | 67.7 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | HyperSeg (Swin-B) | PQ | 61.2 | — | Unverified |
| 2 | OneFormer (InternImage-H, single-scale) | PQ | 60 | — | Unverified |
| 3 | UMG-CLIP-E/14 | PQ | 59.5 | — | Unverified |
| 4 | OpenSeeD (Swin-L, single-scale) | PQ | 59.5 | — | Unverified |
| 5 | Mask DINO (Swin-L, single-scale) | PQ | 59.4 | — | Unverified |
| 6 | EoMT (DINOv2-g, single-scale, 1280x1280) | PQ | 59.2 | — | Unverified |
| 7 | UMG-CLIP-L/14 | PQ | 58.9 | — | Unverified |
| 8 | Panoptic FCN* (Swin-L, single-scale) | PQth | 58.5 | — | Unverified |
| 9 | DiNAT-L (single-scale, Mask2Former) | PQ | 58.5 | — | Unverified |
| 10 | ViT-Adapter-L (single-scale, BEiTv2 pretrain, Mask2Former) | PQ | 58.4 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OneFormer (InternImage-H, emb_dim=256, single-scale, 896x896) | PQ | 54.5 | — | Unverified |
| 2 | ViT-P (OneFormer, DiNAT-L, single-scale, 1280x1280, COCO_pretrain) | PQ | 54 | — | Unverified |
| 3 | OpenSeeD (Swin-L, single-scale, 1280x1280) | PQ | 53.7 | — | Unverified |
| 4 | OneFormer (DiNAT-L, single-scale, 1280x1280, COCO-Pretrain) | PQ | 53.4 | — | Unverified |
| 5 | EoMT (DINOv2-g, single-scale, 1280x1280, COCO pre-trained) | PQ | 52.8 | — | Unverified |
| 6 | X-Decoder (DaViT-d5, Deform, single-scale, 1280x1280) | PQ | 52.4 | — | Unverified |
| 7 | ViT-P (OneFormer, DiNAT-L, single-scale, 1280x1280) | PQ | 51.9 | — | Unverified |
| 8 | OneFormer (DiNAT-L, single-scale, 1280x1280) | PQ | 51.5 | — | Unverified |
| 9 | OneFormer (Swin-L, single-scale, 1280x1280) | PQ | 51.4 | — | Unverified |
| 10 | kMaX-DeepLab (ConvNeXt-L, single-scale, 1281x1281) | PQ | 50.9 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OneFormer (DiNAT-L, single-scale) | PQ | 46.7 | — | Unverified |
| 2 | OneFormer (ConvNeXt-L, single-scale) | PQ | 46.4 | — | Unverified |
| 3 | Panoptic FCN* (Swin-L, single-scale) | PQ | 45.7 | — | Unverified |
| 4 | Panoptic-DeepLab (SWideRNet-(1, 1, 4.5), multi-scale) | PQ | 44.8 | — | Unverified |
| 5 | Panoptic FCN* (ResNet-50-FPN) | PQst | 42.3 | — | Unverified |
| 6 | Mask2Former + Intra-Batch Supervision (ResNet-50) | PQ | 42.2 | — | Unverified |
| 7 | Axial-DeepLab-L (multi-scale) | PQ | 41.1 | — | Unverified |
| 8 | EfficientPS | PQ | 40.6 | — | Unverified |
| 9 | Panoptic-DeepLab (X71) | PQ | 40.5 | — | Unverified |
| 10 | AdaptIS (ResNeXt-101) | PQ | 40.3 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OneFormer (ConvNeXt-L, single-scale, Mapillary Vistas-Pretrained) | PQ | 68 | — | Unverified |
| 2 | Panoptic-DeepLab (SWideRNet [1, 1, 4.5], Mapillary, multi-scale) | PQ | 67.8 | — | Unverified |
| 3 | EfficientPS | PQ | 67.1 | — | Unverified |
| 4 | Axial-DeepLab-XL (Mapillary Vistas, multi-scale) | PQ | 66.6 | — | Unverified |
| 5 | kMaX-DeepLab (single-scale) | PQ | 66.2 | — | Unverified |
| 6 | Panoptic-DeepLab | PQ | 65.5 | — | Unverified |
| 7 | EfficientPS (Cityscapes-fine) | PQ | 62.9 | — | Unverified |
| 8 | COPS (ResNet-50) | PQ | 60 | — | Unverified |
| 9 | SOGNet (ResNet-50) | PQ | 60 | — | Unverified |
| 10 | Dynamically Instantiated Network | PQ | 55.4 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Mask2Former (Swin-B) | PQ | 41.7 | — | Unverified |
| 2 | Panoptic FPN (ResNet-50) | PQ | 40.1 | — | Unverified |
| 3 | Mask2Former (Swin-T) | PQ | 39.2 | — | Unverified |
| 4 | Panoptic FPN (ResNet-101) | PQ | 38.7 | — | Unverified |
| 5 | Mask2Former (ResNet-50) | PQ | 37.6 | — | Unverified |
| 6 | Mask2Former (ResNet-101) | PQ | 37.2 | — | Unverified |
| 7 | Panoptic-DeepLab (ResNet-50) | PQ | 34.7 | — | Unverified |
| 8 | MaX-DeepLab | PQ | 31.9 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SuperCluster | PQ | 50.1 | — | Unverified |
| 2 | PointGroup (Xiang 2023) | PQ | 42.3 | — | Unverified |
| 3 | KPConv (Xiang 2023) | PQ | 41.8 | — | Unverified |
| 4 | MinkowskiNet (Xiang 2023) | PQ | 39.2 | — | Unverified |
| 5 | PointNet++ (Xiang 2023) | PQ | 24.6 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OneFormer3D | PQ | 71.2 | — | Unverified |
| 2 | PanopticNDT (10cm) | PQ | 59.19 | — | Unverified |
| 3 | SuperCluster | PQ | 58.7 | — | Unverified |
| 4 | PanopticFusion (with CRF) | PQ | 33.5 | — | Unverified |
| 5 | SceneGraphFusion (NN mapping) | PQ | 31.5 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | EfficientPS | PQ | 51.1 | — | Unverified |
| 2 | Seamless | PQ | 48.5 | — | Unverified |
| 3 | UPSNet | PQ | 47.1 | — | Unverified |
| 4 | Panoptic FPN | PQ | 46.7 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | EfficientPS | PQ | 43.7 | — | Unverified |
| 2 | Seamless | PQ | 42.2 | — | Unverified |
| 3 | UPSNet | PQ | 39.9 | — | Unverified |
| 4 | Panoptic FPN | PQ | 39.3 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LKCell | PQ | 50.8 | — | Unverified |
| 2 | CellViT-SAM-H | PQ | 50.62 | — | Unverified |
| 3 | TSFD | PQ | 50.4 | — | Unverified |
| 4 | NuLite-H | PQ | 49.81 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OneFormer3D | PQ | 71.2 | — | Unverified |
| 2 | SuperCluster | PQ | 58.7 | — | Unverified |
| 3 | PanopticFusion | PQ | 33.5 | — | Unverified |
| 4 | SceneGraphFusion | PQ | 31.5 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Exchanger+Mask2Former | PQ | 52.6 | — | Unverified |
| 2 | Exchanger+Unet+PaPs | PQ | 47.8 | — | Unverified |
| 3 | U-TAE + PaPs | PQ | 40.4 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VAN-B6* | PQ | 58.2 | — | Unverified |
| 2 | PFPN (ideal number of groups) | PQ | 42.15 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CAFuser (Swin-T) | PQ | 59.7 | — | Unverified |
| 2 | MUSES (Mask2Former w/ 4xSwin-T) | PQ | 53.6 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | EMSANet (2x ResNet-34 NBt1D, PanopticNDT version, finetuned) | PQ | 51.15 | — | Unverified |
| 2 | EMSANet | PQ | 47.38 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MasQCLIP | PQ | 23.3 | — | Unverified |