SOTAVerified

Instance Segmentation

Instance Segmentation is a computer vision task that identifies and separates individual objects within an image: it detects the boundary of each object and assigns each object a unique label. The goal is a pixel-wise segmentation map in which every object pixel is assigned to a specific object instance, so that two objects of the same class receive different labels.
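As a rough illustration of the output format described above, the sketch below stores a segmentation as an integer label map and splits it into one boolean mask per object. The array values, the convention that 0 means background, and the helper name `split_instances` are assumptions made for this example; they are not taken from any particular paper or library listed here.

```python
import numpy as np

# Hypothetical 4x5 instance map: 0 is background, 1 and 2 are two separate objects.
instance_map = np.array([
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 2, 2],
])


def split_instances(label_map):
    """Return {instance_id: boolean mask} for every non-background id."""
    ids = np.unique(label_map)
    return {int(i): label_map == i for i in ids if i != 0}


masks = split_instances(instance_map)
for instance_id, mask in masks.items():
    # Print each instance id with the number of pixels it covers.
    print(instance_id, int(mask.sum()))
```

A single label map like this cannot represent overlapping objects; methods that predict overlapping masks would typically keep one mask per instance instead.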

Image Credit: Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers, CVPR'21

Papers

Showing 401-450 of 2262 papers

| Title | Status | Hype |
| --- | --- | --- |
| AdaContour: Adaptive Contour Descriptor with Hierarchical Representation | Code | 0 |
| Practical Guidelines for Cell Segmentation Models Under Optical Aberrations in Microscopy | | 0 |
| Let-It-Flow: Simultaneous Optimization of 3D Flow and Object Clustering | Code | 1 |
| ViM-UNet: Vision Mamba for Biomedical Segmentation | Code | 2 |
| Automated National Urban Map Extraction | | 0 |
| Panoptic Perception: A Novel Task and Fine-grained Dataset for Universal Remote Sensing Image Interpretation | | 0 |
| Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation | | 0 |
| OW-VISCapTor: Abstractors for Open-World Video Instance Segmentation and Captioning | | 0 |
| CORP: A Multi-Modal Dataset for Campus-Oriented Roadside Perception Tasks | | 0 |
| Segment Any 3D Object with Language | | 0 |
| Instance-Aware Group Quantization for Vision Transformers | | 0 |
| SUGAR: Pre-training 3D Visual Representations for Robotics | | 0 |
| Teeth-SEG: An Efficient Instance Segmentation Framework for Orthodontic Treatment based on Anthropic Prior Knowledge | | 0 |
| What is Point Supervision Worth in Video Instance Segmentation? | | 0 |
| ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning | Code | 2 |
| Efficient 3D Instance Mapping and Localization with Neural Fields | | 0 |
| DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | Code | 2 |
| Annolid: Annotate, Segment, and Track Anything You Need | Code | 0 |
| Heracles: A Hybrid SSM-Transformer Model for High-Resolution Image and Time-Series Analysis | Code | 1 |
| PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Code | 3 |
| GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation | | 0 |
| AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans | Code | 1 |
| Language-Based Depth Hints for Monocular Depth Estimation | | 0 |
| Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting | | 0 |
| ParFormer: A Vision Transformer with Parallel Mixer and Sparse Channel Attention Patch Embedding | | 0 |
| BSNet: Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation | Code | 1 |
| MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Code | 3 |
| CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation | Code | 1 |
| EffiPerception: an Efficient Framework for Various Perception Tasks | | 0 |
| Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery | Code | 2 |
| Augment Before Copy-Paste: Data and Memory Efficiency-Oriented Instance Segmentation Framework for Sport-scenes | | 0 |
| Circle Representation for Medical Instance Object Segmentation | Code | 0 |
| Better (pseudo-)labels for semi-supervised instance segmentation | | 0 |
| ShapeFormer: Shape Prior Visible-to-Amodal Transformer-based Amodal Instance Segmentation | Code | 0 |
| MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation | | 0 |
| Segment Any Object Model (SAOM): Real-to-Simulation Fine-Tuning Strategy for Multi-Class Multi-Instance Segmentation | | 0 |
| Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects | | 0 |
| When Semantic Segmentation Meets Frequency Aliasing | Code | 1 |
| WeakSurg: Weakly supervised surgical instrument segmentation using temporal equivariance and semantic continuity | | 0 |
| StainFuser: Controlling Diffusion for Faster Neural Style Transfer in Multi-Gigapixel Histology Images | Code | 1 |
| Task-Specific Adaptation of Segmentation Foundation Model via Prompt Learning | | 0 |
| ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions | Code | 3 |
| Segmentation Guided Sparse Transformer for Under-Display Camera Image Restoration | | 0 |
| SAM-PD: How Far Can SAM Take Us in Tracking and Segmenting Anything in Videos by Prompt Denoising | Code | 0 |
| CenterDisks: Real-time instance segmentation with disk covering | Code | 0 |
| MCA: Moment Channel Attention Networks | Code | 0 |
| RISeg: Robot Interactive Object Segmentation via Body Frame-Invariant Features | | 0 |
| End-to-End Human Instance Matting | Code | 1 |
| Self-Supervised Representation Learning with Meta Comprehensive Regularization | | 0 |
| Boosting Box-supervised Instance Segmentation with Pseudo Depth | | 0 |

Benchmark Results

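| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | InternImage-H | AP50 | 80.8 | | Unverified |
| 2 | ResNeSt-200 (multi-scale) | AP50 | 70.2 | | Unverified |
| 3 | CenterMask + VoVNetV2-99 (multi-scale) | AP50 | 66.2 | | Unverified |
| 4 | CenterMask + VoVNetV2-57 (single-scale) | AP50 | 60.8 | | Unverified |
| 5 | Co-DETR | mask AP | 57.1 | | Unverified |
| 6 | CBNetV2 (EVA02, single-scale) | mask AP | 56.1 | | Unverified |
| 7 | ISDA (ResNet-50) | APL | 55.7 | | Unverified |
| 8 | EVA | mask AP | 55.5 | | Unverified |
| 9 | FD-SwinV2-G | mask AP | 55.4 | | Unverified |
| 10 | Mask Frozen-DETR | mask AP | 55.3 | | Unverified |
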
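| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | InternImage-B | GFLOPs | 501 | | Unverified |
| 2 | Co-DETR | mask AP | 56.6 | | Unverified |
| 3 | ViT-CoMer-L (Mask RCNN, DINOv2) | mask AP | 55.9 | | Unverified |
| 4 | InternImage-H | mask AP | 55.4 | | Unverified |
| 5 | EVA | mask AP | 55 | | Unverified |
| 6 | Mask Frozen-DETR | mask AP | 54.9 | | Unverified |
| 7 | Mask DINO (SwinL, multi-scale) | mask AP | 54.5 | | Unverified |
| 8 | ViT-Adapter-L (HTC++, BEiTv2, O365, multi-scale) | mask AP | 54.2 | | Unverified |
| 9 | GLEE-Pro | mask AP | 54.2 | | Unverified |
| 10 | SwinV2-G (HTC++) | mask AP | 53.7 | | Unverified |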
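
The mask AP and AP50 figures above are built on mask intersection-over-union (IoU). By the usual convention, AP50 counts a predicted mask as correct when its IoU with a ground-truth mask is at least 0.5. The page does not state the evaluation protocol behind these numbers, so the sketch below shows only the IoU step under that assumption; the helper `mask_iou` and the toy masks are hypothetical.

```python
import numpy as np


def mask_iou(pred, gt):
    """Intersection-over-union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks are empty; treat the overlap as zero (an assumption).
        return 0.0
    return float(np.logical_and(pred, gt).sum() / union)


# Toy example: the ground truth covers 4 pixels, the prediction covers 6.
gt = np.zeros((4, 4), dtype=bool)
gt[0:2, 0:2] = True
pred = np.zeros((4, 4), dtype=bool)
pred[0:2, 0:3] = True

iou = mask_iou(pred, gt)      # intersection 4, union 6
is_match_at_50 = iou >= 0.5   # would count as a match at the 0.5 threshold
print(round(iou, 3), is_match_at_50)
```

A full AP computation also ranks predictions by confidence and integrates the precision-recall curve, which this sketch leaves out.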