SOTAVerified

Instance Segmentation

Instance segmentation is a computer vision task that identifies and separates individual objects within an image: it detects the boundary of each object and assigns each object a unique label. The goal is a pixel-wise segmentation map in which each pixel is assigned to a specific object instance.
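As a rough illustration of the pixel-wise map described above, the sketch below stores a tiny image's segmentation as a grid of integer instance ids and splits it into one binary mask per object. The grid values, the convention that 0 means background, and the helper names `instance_masks` and `mask_area` are assumptions made for this example, not part of any particular library or benchmark.

```python
# Hypothetical 4x6 instance map: 0 = background (an assumed convention),
# 1 and 2 = two separate object instances.
instance_map = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
]


def instance_masks(label_map):
    """Split an integer instance map into one binary mask per instance id."""
    ids = sorted({v for row in label_map for v in row if v != 0})
    return {i: [[int(v == i) for v in row] for row in label_map] for i in ids}


def mask_area(mask):
    """Count the pixels that belong to a binary mask."""
    return sum(sum(row) for row in mask)


masks = instance_masks(instance_map)
areas = {i: mask_area(m) for i, m in masks.items()}
print(areas)
```

Each pixel belongs to at most one instance here, which is what distinguishes this representation from a per-class semantic map: two objects of the same class would still receive different ids.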

Image Credit: Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers, CVPR'21

Papers

Showing 451-500 of 2262 papers

| Title | Status | Hype |
|---|---|---|
| 3D Instances as 1D Kernels | Code | 1 |
| Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments | Code | 1 |
| Vision-based Uneven BEV Representation Learning with Polar Rasterization and Surface Estimation | Code | 1 |
| OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers | Code | 1 |
| Segmenting Moving Objects via an Object-Centric Layered Representation | Code | 1 |
| RevBiFPN: The Fully Reversible Bidirectional Feature Pyramid Network | Code | 1 |
| Learn Fast, Segment Well: Fast Object Segmentation Learning on the iCub Robot | Code | 1 |
| Patch-level Representation Learning for Self-supervised Vision Transformers | Code | 1 |
| RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks | Code | 1 |
| VITA: Video Instance Segmentation via Object Token Association | Code | 1 |
| SHRED: 3D Shape Region Decomposition with Learned Local Operations | Code | 1 |
| Metrics reloaded: Recommendations for image analysis validation | Code | 1 |
| Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization | Code | 1 |
| Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction | Code | 1 |
| Self-Supervised Visual Representation Learning with Semantic Grouping | Code | 1 |
| UniInst: Unique Representation for End-to-End Instance Segmentation | Code | 1 |
| Human Instance Matting via Mutual Guidance and Multi-Instance Refinement | Code | 1 |
| HCFormer: Unified Image Segmentation with Hierarchical Clustering | Code | 1 |
| Masked Image Modeling with Denoising Contrast | Code | 1 |
| Plane Geometry Diagram Parsing | Code | 1 |
| GRIT: General Robust Image Task Benchmark | Code | 1 |
| PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions | Code | 1 |
| Modeling Missing Annotations for Incremental Learning in Object Detection | Code | 1 |
| Interactive Object Segmentation in 3D Point Clouds | Code | 1 |
| Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity | Code | 1 |
| Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation | Code | 1 |
| Tooth Instance Segmentation on Panoramic Dental Radiographs Using U-Nets and Morphological Processing | Code | 1 |
| mc-BEiT: Multi-choice Discretization for Image BERT Pre-training | Code | 1 |
| Eigencontours: Novel Contour Descriptors Based on Low-Rank Approximation | Code | 1 |
| CHEX: CHannel EXploration for CNN Model Compression | Code | 1 |
| SepViT: Separable Vision Transformer | Code | 1 |
| Noisy Boundaries: Lemon or Lemonade for Semi-supervised Instance Segmentation? | Code | 1 |
| Video Instance Segmentation via Multi-scale Spatio-temporal Split Attention Transformer | Code | 1 |
| PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers | Code | 1 |
| Test-time Adaptation with Slot-Centric Models | Code | 1 |
| ContrastMask: Contrastive Learning to Segment Every Thing | Code | 1 |
| Active Token Mixer | Code | 1 |
| RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation | Code | 1 |
| Instance Segmentation for Autonomous Log Grasping in Forestry Operations | Code | 1 |
| DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training | Code | 1 |
| The Winning Solution to the iFLYTEK Challenge 2021 Cultivated Land Extraction from High-Resolution Remote Sensing Image | Code | 1 |
| SODAR: Segmenting Objects by Dynamically Aggregating Neighboring Mask Representations | Code | 1 |
| Weakly Supervised Nuclei Segmentation via Instance Learning | Code | 1 |
| DocSegTr: An Instance-Level End-to-End Document Image Segmentation Transformer | Code | 1 |
| Relieving Long-tailed Instance Segmentation via Pairwise Class Balance | Code | 1 |
| WALT: Watch and Learn 2D Amodal Representation From Time-Lapse Imagery | Code | 1 |
| MSeg: A Composite Dataset for Multi-domain Semantic Segmentation | Code | 1 |
| ELSA: Enhanced Local Self-Attention for Vision Transformer | Code | 1 |
| SOIT: Segmenting Objects with Instance-Aware Transformers | Code | 1 |
| MPViT: Multi-Path Vision Transformer for Dense Prediction | Code | 1 |

Benchmark Results

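| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | InternImage-H | AP50 | 80.8 | | Unverified |
| 2 | ResNeSt-200 (multi-scale) | AP50 | 70.2 | | Unverified |
| 3 | CenterMask + VoVNetV2-99 (multi-scale) | AP50 | 66.2 | | Unverified |
| 4 | CenterMask + VoVNetV2-57 (single-scale) | AP50 | 60.8 | | Unverified |
| 5 | Co-DETR | mask AP | 57.1 | | Unverified |
| 6 | CBNetV2 (EVA02, single-scale) | mask AP | 56.1 | | Unverified |
| 7 | ISDA (ResNet-50) | APL | 55.7 | | Unverified |
| 8 | EVA | mask AP | 55.5 | | Unverified |
| 9 | FD-SwinV2-G | mask AP | 55.4 | | Unverified |
| 10 | Mask Frozen-DETR | mask AP | 55.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | InternImage-B | GFLOPs | 501 | | Unverified |
| 2 | Co-DETR | mask AP | 56.6 | | Unverified |
| 3 | ViT-CoMer-L (Mask RCNN, DINOv2) | mask AP | 55.9 | | Unverified |
| 4 | InternImage-H | mask AP | 55.4 | | Unverified |
| 5 | EVA | mask AP | 55 | | Unverified |
| 6 | Mask Frozen-DETR | mask AP | 54.9 | | Unverified |
| 7 | Mask DINO (SwinL, multi-scale) | mask AP | 54.5 | | Unverified |
| 8 | GLEE-Pro | mask AP | 54.2 | | Unverified |
| 9 | ViT-Adapter-L (HTC++, BEiTv2, O365, multi-scale) | mask AP | 54.2 | | Unverified |
| 10 | SwinV2-G (HTC++) | mask AP | 53.7 | | Unverified |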