SOTAVerified

Instance Segmentation

Instance Segmentation is a computer vision task that involves identifying and separating individual objects within an image, including detecting the boundaries of each object and assigning a unique label to each object. The goal of instance segmentation is to produce a pixel-wise segmentation map of the image, where each pixel is assigned to a specific object instance.
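As a minimal sketch of what such a pixel-wise map can look like, the example below merges per-object binary masks into a single grid in which 0 means background and each positive integer identifies one object instance. The function names `masks_to_instance_map` and `rect_mask`, the 4x4 toy image, and the rule that later masks overwrite earlier ones where they overlap are all assumptions made for illustration; they do not come from any particular model or library.

```python
def masks_to_instance_map(masks):
    """Merge per-instance binary masks into one instance-ID map.

    masks: list of H x W grids of booleans, one grid per object instance.
    Returns an H x W grid of ints: 0 is background, k is the k-th instance.
    Assumption for this sketch: where masks overlap, the later mask wins.
    """
    height = len(masks[0])
    width = len(masks[0][0])
    instance_map = [[0] * width for _ in range(height)]
    for instance_id, mask in enumerate(masks, start=1):
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    instance_map[y][x] = instance_id
    return instance_map


def rect_mask(height, width, top, left, bottom, right):
    """Hypothetical helper: a binary mask that is True inside a rectangle."""
    return [
        [top <= y < bottom and left <= x < right for x in range(width)]
        for y in range(height)
    ]


# Two toy objects in a 4x4 image. They could belong to the same class;
# instance segmentation still gives each one its own ID.
masks = [
    rect_mask(4, 4, 0, 0, 2, 2),  # instance 1: top-left 2x2 block
    rect_mask(4, 4, 2, 1, 4, 4),  # instance 2: bottom-right 2x3 block
]
instance_map = masks_to_instance_map(masks)
for row in instance_map:
    print(row)
```

Real systems usually keep a class label and a confidence score alongside each mask, and often store overlapping masks separately instead of flattening them into one map.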

Image Credit: Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers, CVPR'21

Papers

Showing 101-150 of 2262 papers

Title | Status | Hype
Mask-Free Video Instance Segmentation | Code | 2
FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation | Code | 2
CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention | Code | 2
DiffusionInst: Diffusion Model for Instance Segmentation | Code | 2
Box2Mask: Box-supervised Instance Segmentation via Level-set Evolution | Code | 2
PLA: Language-Driven Open-Vocabulary 3D Scene Understanding | Code | 2
MogaNet: Multi-order Gated Aggregation Network | Code | 2
What the DAAM: Interpreting Stable Diffusion Using Cross Attention | Code | 2
Mask3D: Mask Transformer for 3D Semantic Instance Segmentation | Code | 2
Dilated Neighborhood Attention Transformer | Code | 2
Scalable SoftGroup for 3D Instance Segmentation on Point Clouds | Code | 2
FEC: Fast Euclidean Clustering for Point Cloud Segmentation | Code | 2
Occlusion-Aware Instance Segmentation via BiLayer Network Architectures | Code | 2
MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training | Code | 2
In Defense of Online Models for Video Instance Segmentation | Code | 2
Box-supervised Instance Segmentation with Level Set Evolution | Code | 2
Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning | Code | 2
Global Context Vision Transformers | Code | 2
What Are Expected Queries in End-to-End Object Detection? | Code | 2
Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation | Code | 2
Masked Generative Distillation | Code | 2
Temporally Efficient Vision Transformer for Video Instance Segmentation | Code | 2
VSA: Learning Varied-Size Window Attention in Vision Transformers | Code | 2
DaViT: Dual Attention Vision Transformers | Code | 2
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection | Code | 2
Exploring Plain Vision Transformer Backbones for Object Detection | Code | 2
Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation | Code | 2
Sparse Instance Activation for Real-Time Instance Segmentation | Code | 2
E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation | Code | 2
SoftGroup for 3D Instance Segmentation on Point Clouds | Code | 2
FreeSOLO: Learning to Segment Objects without Annotations | Code | 2
Context Autoencoder for Self-Supervised Representation Learning | Code | 2
Mask2Former for Video Instance Segmentation | Code | 2
Masked-attention Mask Transformer for Universal Image Segmentation | Code | 2
Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations | Code | 2
Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks | Code | 2
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows | Code | 2
LambdaNetworks: Modeling Long-Range Interactions Without Attention | Code | 2
Bottleneck Transformers for Visual Recognition | Code | 2
Simplifying Object Segmentation with PixelLib Library | Code | 2
Global Context Networks | Code | 2
YolactEdge: Real-time Instance Segmentation on the Edge | Code | 2
DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution | Code | 2
SOLOv2: Dynamic and Fast Instance Segmentation | Code | 2
Deep Snake for Real-Time Instance Segmentation | Code | 2
BlenderProc | Code | 2
ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks | Code | 2
Video Instance Segmentation | Code | 2
GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond | Code | 2
Multi-Task Learning as Multi-Objective Optimization | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | InternImage-H | AP50 | 80.8 | | Unverified
2 | ResNeSt-200 (multi-scale) | AP50 | 70.2 | | Unverified
3 | CenterMask + VoVNetV2-99 (multi-scale) | AP50 | 66.2 | | Unverified
4 | CenterMask + VoVNetV2-57 (single-scale) | AP50 | 60.8 | | Unverified
5 | Co-DETR | mask AP | 57.1 | | Unverified
6 | CBNetV2 (EVA02, single-scale) | mask AP | 56.1 | | Unverified
7 | ISDA (ResNet-50) | APL | 55.7 | | Unverified
8 | EVA | mask AP | 55.5 | | Unverified
9 | FD-SwinV2-G | mask AP | 55.4 | | Unverified
10 | Mask Frozen-DETR | mask AP | 55.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | InternImage-B | GFLOPs | 501 | | Unverified
2 | Co-DETR | mask AP | 56.6 | | Unverified
3 | ViT-CoMer-L (Mask RCNN, DINOv2) | mask AP | 55.9 | | Unverified
4 | InternImage-H | mask AP | 55.4 | | Unverified
5 | EVA | mask AP | 55 | | Unverified
6 | Mask Frozen-DETR | mask AP | 54.9 | | Unverified
7 | Mask DINO (SwinL, multi-scale) | mask AP | 54.5 | | Unverified
8 | ViT-Adapter-L (HTC++, BEiTv2, O365, multi-scale) | mask AP | 54.2 | | Unverified
9 | GLEE-Pro | mask AP | 54.2 | | Unverified
10 | SwinV2-G (HTC++) | mask AP | 53.7 | | Unverified