SOTAVerified

Instance Segmentation

Instance segmentation is a computer vision task that identifies and separates the individual objects in an image: it detects the boundary of each object and assigns each object its own label, so that two objects of the same class remain distinct. The goal is a pixel-wise segmentation map of the image in which each object pixel is assigned to a specific object instance.
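As a rough illustration of what such a pixel-wise map looks like, the sketch below uses a made-up 4x6 grid in which 0 stands for background and each positive integer stands for one object instance. The grid values and the helper names `split_instances` and `bounding_box` are hypothetical and chosen only for this example; they do not come from any particular model or library.

```python
# Toy 4x6 instance map: 0 = background, each positive integer = one instance.
# The values are invented purely for illustration.
instance_map = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [3, 3, 0, 0, 0, 0],
]


def split_instances(instance_map):
    """Return {instance_id: binary mask} for every non-background id."""
    ids = sorted({v for row in instance_map for v in row if v != 0})
    return {
        i: [[1 if v == i else 0 for v in row] for row in instance_map]
        for i in ids
    }


def bounding_box(mask):
    """Tight box (row_min, col_min, row_max, col_max) around a binary mask."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)


masks = split_instances(instance_map)                    # one binary mask per object
boxes = {i: bounding_box(m) for i, m in masks.items()}   # object extents
areas = {i: sum(map(sum, m)) for i, m in masks.items()}  # pixel counts per object
```

A single integer map like this cannot represent overlapping objects, so in practice predictions are often kept as a list of per-instance binary masks, which is the form `split_instances` produces here.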

Image Credit: Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers, CVPR'21

Papers

Showing 501–525 of 2262 papers

| Title | Status | Hype |
|---|---|---|
| D2Det: Towards High Quality Object Detection and Instance Segmentation | Code | 1 |
| GradAug: A New Regularization Method for Deep Neural Networks | Code | 1 |
| AISFormer: Amodal Instance Segmentation with Transformer | Code | 1 |
| DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision | Code | 1 |
| Beyond Semantic to Instance Segmentation: Weakly-Supervised Instance Segmentation via Semantic Knowledge Transfer and Self-Refinement | Code | 1 |
| Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation | Code | 1 |
| Exploring Classification Equilibrium in Long-Tailed Object Detection | Code | 1 |
| Panoptic Vision-Language Feature Fields | Code | 1 |
| PartDistillation: Learning Parts From Instance Segmentation | Code | 1 |
| Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning | Code | 1 |
| Evolving Normalization-Activation Layers | Code | 1 |
| Explain Any Concept: Segment Anything Meets Concept-Based Explanation | Code | 1 |
| Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts | Code | 1 |
| Equalization Loss v2: A New Gradient Balance Approach for Long-tailed Object Detection | Code | 1 |
| Amodal Intra-class Instance Segmentation: Synthetic Datasets and Benchmark | Code | 1 |
| Evaluation Study on SAM 2 for Class-agnostic Instance-level Segmentation | Code | 1 |
| Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation | Code | 1 |
| 3D Instances as 1D Kernels | Code | 1 |
| EPSNet: Efficient Panoptic Segmentation Network with Cross-layer Attention Fusion | Code | 1 |
| EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention | Code | 1 |
| End-to-End Semi-Supervised Object Detection with Soft Teacher | Code | 1 |
| End-to-End Referring Video Object Segmentation with Multimodal Transformers | Code | 1 |
| End-to-End Video Instance Segmentation with Transformers | Code | 1 |
| Evaluation of Segment Anything Model 2: The Role of SAM2 in the Underwater Environment | Code | 1 |
| Benchmarking Self-Supervised Learning on Diverse Pathology Datasets | Code | 1 |

Benchmark Results

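| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | InternImage-H | AP50 | 80.8 | | Unverified |
| 2 | ResNeSt-200 (multi-scale) | AP50 | 70.2 | | Unverified |
| 3 | CenterMask + VoVNetV2-99 (multi-scale) | AP50 | 66.2 | | Unverified |
| 4 | CenterMask + VoVNetV2-57 (single-scale) | AP50 | 60.8 | | Unverified |
| 5 | Co-DETR | mask AP | 57.1 | | Unverified |
| 6 | CBNetV2 (EVA02, single-scale) | mask AP | 56.1 | | Unverified |
| 7 | ISDA (ResNet-50) | APL | 55.7 | | Unverified |
| 8 | EVA | mask AP | 55.5 | | Unverified |
| 9 | FD-SwinV2-G | mask AP | 55.4 | | Unverified |
| 10 | Mask Frozen-DETR | mask AP | 55.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | InternImage-B | GFLOPs | 501 | | Unverified |
| 2 | Co-DETR | mask AP | 56.6 | | Unverified |
| 3 | ViT-CoMer-L (Mask RCNN, DINOv2) | mask AP | 55.9 | | Unverified |
| 4 | InternImage-H | mask AP | 55.4 | | Unverified |
| 5 | EVA | mask AP | 55 | | Unverified |
| 6 | Mask Frozen-DETR | mask AP | 54.9 | | Unverified |
| 7 | Mask DINO (SwinL, multi-scale) | mask AP | 54.5 | | Unverified |
| 8 | GLEE-Pro | mask AP | 54.2 | | Unverified |
| 9 | ViT-Adapter-L (HTC++, BEiTv2, O365, multi-scale) | mask AP | 54.2 | | Unverified |
| 10 | SwinV2-G (HTC++) | mask AP | 53.7 | | Unverified |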