SOTAVerified

Instance Segmentation

Instance segmentation is a computer vision task that identifies and separates the individual objects in an image: it detects the boundary of each object and assigns each object a unique label. The goal is a pixel-wise segmentation map of the image in which each object pixel is assigned to a specific object instance.
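As a minimal sketch of the pixel-wise map described above, the snippet below merges per-object binary masks into a single instance map. The helper names (`rect_mask`, `masks_to_instance_map`), the toy 4x4 image, and the convention that 0 means background are assumptions made for illustration; they are not taken from any paper listed on this page.

```python
def rect_mask(height, width, top, left, bottom, right):
    """Hypothetical helper: a binary mask that is True inside a rectangle."""
    return [
        [top <= y < bottom and left <= x < right for x in range(width)]
        for y in range(height)
    ]


def masks_to_instance_map(masks):
    """Merge per-object binary masks into one integer instance map.

    Label 0 is background and the i-th mask (1-based) gets label i.
    Where masks overlap, the later mask wins; that tie-break is an
    arbitrary choice for this sketch.
    """
    height, width = len(masks[0]), len(masks[0][0])
    instance_map = [[0] * width for _ in range(height)]
    for label, mask in enumerate(masks, start=1):
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    instance_map[y][x] = label
    return instance_map


# Two toy objects in a 4x4 image.
mask_a = rect_mask(4, 4, 0, 0, 2, 2)
mask_b = rect_mask(4, 4, 2, 1, 4, 4)
instance_map = masks_to_instance_map([mask_a, mask_b])
for row in instance_map:
    print(row)
```

Each nonzero value identifies one object instance, so two objects of the same class still receive different labels.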

Image Credit: Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers, CVPR'21

Papers

Showing 51–100 of 2262 papers

Title | Status | Hype
Segment Anything for Histopathology | Code | 2
iFormer: Integrating ConvNet and Transformer for Mobile Application | Code | 2
RelationField: Relate Anything in Radiance Fields | Code | 2
MaskTerial: A Foundation Model for Automated 2D Material Flake Detection | Code | 2
DreamColour: Controllable Video Colour Editing without Training | Code | 2
TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba | Code | 2
DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Code | 2
Fields of The World: A Machine Learning Benchmark Dataset For Global Agricultural Field Boundary Segmentation | Code | 2
One missing piece in Vision and Language: A Survey on Comics Understanding | Code | 2
Image Segmentation in Foundation Model Era: A Survey | Code | 2
CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications | Code | 2
PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects | Code | 2
GroupMamba: Efficient Group-Based Visual State Space Model | Code | 2
Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes | Code | 2
Adaptive Parametric Activation | Code | 2
Training-free CryoET Tomogram Segmentation | Code | 2
Context-Aware Video Instance Segmentation | Code | 2
Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset | Code | 2
Generative Active Learning for Long-tailed Instance Segmentation | Code | 2
Adapting Pre-Trained Vision Models for Novel Instance Detection and Segmentation | Code | 2
DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data | Code | 2
GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs | Code | 2
PTQ4SAM: Post-Training Quantization for Segment Anything | Code | 2
ViM-UNet: Vision Mamba for Biomedical Segmentation | Code | 2
ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning | Code | 2
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | Code | 2
Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery | Code | 2
FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything | Code | 2
SPINEPS -- Automatic Whole Spine Segmentation of T2-weighted MR images using a Two-Phase Approach to Multi-class Semantic and Instance Segmentation | Code | 2
FM-Fusion: Instance-aware Semantic Mapping Boosted by Vision-Language Foundation Models | Code | 2
SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design | Code | 2
Rethinking Patch Dependence for Masked Autoencoders | Code | 2
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting | Code | 2
OBSeg: Accurate and Fast Instance Segmentation Framework Using Segmentation Foundation Models with Oriented Bounding Box Prompts | Code | 2
PartSTAD: 2D-to-3D Part Segmentation Task Adaptation | Code | 2
ODIN: A Single Model for 2D and 3D Segmentation | Code | 2
Unsupervised Universal Image Segmentation | Code | 2
SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process | Code | 2
SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation | Code | 2
Adapter is All You Need for Tuning Visual Tasks | Code | 2
RMT: Retentive Networks Meet Vision Transformers | Code | 2
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention | Code | 2
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation | Code | 2
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models | Code | 2
RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model | Code | 2
CellViT: Vision Transformers for Precise Cell Segmentation and Classification | Code | 2
OpenMask3D: Open-Vocabulary 3D Instance Segmentation | Code | 2
Does Image Anonymization Impact Computer Vision Training? | Code | 2
SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model | Code | 2
RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | InternImage-H | AP50 | 80.8 |  | Unverified
2 | ResNeSt-200 (multi-scale) | AP50 | 70.2 |  | Unverified
3 | CenterMask + VoVNetV2-99 (multi-scale) | AP50 | 66.2 |  | Unverified
4 | CenterMask + VoVNetV2-57 (single-scale) | AP50 | 60.8 |  | Unverified
5 | Co-DETR | mask AP | 57.1 |  | Unverified
6 | CBNetV2 (EVA02, single-scale) | mask AP | 56.1 |  | Unverified
7 | ISDA (ResNet-50) | APL | 55.7 |  | Unverified
8 | EVA | mask AP | 55.5 |  | Unverified
9 | FD-SwinV2-G | mask AP | 55.4 |  | Unverified
10 | Mask Frozen-DETR | mask AP | 55.3 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | InternImage-B | GFLOPs | 501 |  | Unverified
2 | Co-DETR | mask AP | 56.6 |  | Unverified
3 | ViT-CoMer-L (Mask RCNN, DINOv2) | mask AP | 55.9 |  | Unverified
4 | InternImage-H | mask AP | 55.4 |  | Unverified
5 | EVA | mask AP | 55 |  | Unverified
6 | Mask Frozen-DETR | mask AP | 54.9 |  | Unverified
7 | Mask DINO (SwinL, multi-scale) | mask AP | 54.5 |  | Unverified
8 | ViT-Adapter-L (HTC++, BEiTv2, O365, multi-scale) | mask AP | 54.2 |  | Unverified
9 | GLEE-Pro | mask AP | 54.2 |  | Unverified
10 | SwinV2-G (HTC++) | mask AP | 53.7 |  | Unverified
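
The mask AP and AP50 metrics in these tables are built on the overlap between a predicted mask and a ground-truth mask, and AP50 is commonly read as average precision at a mask IoU threshold of 0.5. The sketch below shows only that overlap test, on toy data. The function names and example masks are assumptions for illustration, and the ranking by confidence and precision-recall integration that a full AP computation needs are deliberately left out.

```python
def rect_mask(height, width, top, left, bottom, right):
    """Hypothetical helper: a binary mask that is True inside a rectangle."""
    return [
        [top <= y < bottom and left <= x < right for x in range(width)]
        for y in range(height)
    ]


def mask_iou(mask_a, mask_b):
    """Intersection over union of two equally sized binary masks."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += int(a and b)
            union += int(a or b)
    return intersection / union if union else 0.0


def matches_at_iou(pred_mask, gt_mask, threshold=0.5):
    """True if the prediction overlaps the ground truth at the given IoU.

    threshold=0.5 corresponds to the matching criterion behind AP50.
    """
    return mask_iou(pred_mask, gt_mask) >= threshold


gt = rect_mask(4, 4, 0, 0, 2, 2)          # 4 ground-truth pixels
pred_good = rect_mask(4, 4, 0, 0, 2, 3)   # covers gt plus one extra column
pred_poor = rect_mask(4, 4, 0, 1, 2, 4)   # shifted right, shares one column

iou_good = mask_iou(pred_good, gt)  # 4 shared pixels / 6 in the union
iou_poor = mask_iou(pred_poor, gt)  # 2 shared pixels / 8 in the union
print(iou_good, matches_at_iou(pred_good, gt))
print(iou_poor, matches_at_iou(pred_poor, gt))
```

Under these assumptions the first prediction would count as a match at the 0.5 threshold and the second would not.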