SOTAVerified

Instance Segmentation

Instance segmentation is a computer vision task that identifies and separates the individual objects in an image: it delineates the boundary of each object and assigns each object a unique label. The goal is to produce a pixel-wise segmentation map of the image in which each object pixel is assigned to a specific object instance.
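To make the output format concrete, here is a minimal sketch. It assumes the segmentation map is stored as a 2D grid of integers in which 0 marks background and each positive integer marks one object instance. The helper name `split_instances` and the toy grid are illustrative assumptions, not the format of any particular benchmark.

```python
def split_instances(instance_map):
    """Split a 2D instance-ID grid into one binary mask per instance.

    Assumption: 0 marks background; every positive integer marks one object.
    """
    # Collect the distinct non-background instance IDs.
    ids = sorted({value for row in instance_map for value in row if value != 0})
    # Build a binary mask (1 = pixel belongs to the instance) for each ID.
    return {
        i: [[1 if value == i else 0 for value in row] for row in instance_map]
        for i in ids
    }


# Toy 3x4 "image" containing two object instances (IDs 1 and 2).
instance_map = [
    [0, 1, 1, 0],
    [0, 1, 2, 2],
    [0, 0, 2, 2],
]
masks = split_instances(instance_map)
```

Each key of `masks` is one instance ID and each value is that instance's binary mask, so two objects of the same class still receive separate masks.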

Image Credit: Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers, CVPR'21

Papers

Showing 1051-1100 of 2262 papers

Title | Status | Hype
Slice-100K: A Multimodal Dataset for Extrusion-based 3D Printing | - | 0
ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers | - | 0
Robot Instance Segmentation with Few Annotations for Grasping | Code | 0
PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction | - | 0
PM-VIS+: High-Performance Video Instance Segmentation without Video Annotation | Code | 0
3D Feature Distillation with Object-Centric Priors | - | 0
CoDA: Interactive Segmentation and Morphological Analysis of Dendroid Structures Exemplified on Stony Cold-Water Corals | Code | 0
XAMI -- A Benchmark Dataset for Artefact Detection in XMM-Newton Optical Images | Code | 0
Optimization of Autonomous Driving Image Detection Based on RFAConv and Triplet Attention | - | 0
Depth-Guided Semi-Supervised Instance Segmentation | - | 0
Semi-supervised classification of dental conditions in panoramic radiographs using large language model and instance segmentation: A real-world dataset evaluation | - | 0
GMT: Guided Mask Transformer for Leaf Instance Segmentation | Code | 0
Fine-grained Background Representation for Weakly Supervised Semantic Segmentation | Code | 0
TraceNet: Segment one thing efficiently | - | 0
2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation | - | 0
3D Instance Segmentation Using Deep Learning on RGB-D Indoor Data | - | 0
Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines | - | 0
Benchmarking Label Noise in Instance Segmentation: Spatial Noise Matters | Code | 0
MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception | Code | 0
2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation | - | 0
PanoSSC: Exploring Monocular Panoptic 3D Scene Reconstruction for Autonomous Driving | - | 0
UVIS: Unsupervised Video Instance Segmentation | - | 0
Dual Thinking and Logical Processing -- Are Multi-modal Large Language Models Closing the Gap with Human Vision ? | Code | 0
RS-DFM: A Remote Sensing Distributed Foundation Model for Diverse Downstream Tasks | - | 0
1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation | - | 0
Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment | - | 0
MultiPly: Reconstruction of Multiple People from Monocular Video in the Wild | - | 0
Layout Agnostic Scene Text Image Synthesis with Diffusion Models | - | 0
MP-PolarMask: A Faster and Finer Instance Segmentation for Concave Images | - | 0
An expert-driven data generation pipeline for histological images | Code | 0
From Seedling to Harvest: The GrowingSoy Dataset for Weed Detection in Soy Crops via Instance Segmentation | Code | 0
Extreme Point Supervised Instance Segmentation | - | 0
OpenDAS: Open-Vocabulary Domain Adaptation for 2D and 3D Segmentation | - | 0
Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models | - | 0
BAISeg: Boundary Assisted Weakly Supervised Instance Segmentation | Code | 0
Understanding the Effect of using Semantically Meaningful Tokens for Visual Representation Learning | - | 0
Video Prediction Models as General Visual Encoders | - | 0
Efficient Temporal Action Segmentation via Boundary-aware Query Voting | Code | 0
Autonomous Quilt Spreading for Caregiving Robots | - | 0
Harmony: A Joint Self-Supervised and Weakly-Supervised Framework for Learning General Purpose Visual Representations | Code | 0
Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation | Code | 0
Vision Transformer with Sparse Scan Prior | Code | 0
Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens | - | 0
Improving the Explain-Any-Concept by Introducing Nonlinearity to the Trainable Surrogate Model | - | 0
Unifying 3D Vision-Language Understanding via Promptable Queries | - | 0
UDA4Inst: Unsupervised Domain Adaptation for Instance Segmentation | - | 0
PLUTO: Pathology-Universal Transformer | - | 0
PotatoGANs: Utilizing Generative Adversarial Networks, Instance Segmentation, and Explainable AI for Enhanced Potato Disease Identification and Classification | Code | 0
Global Motion Understanding in Large-Scale Video Object Segmentation | - | 0
CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | InternImage-H | AP50 | 80.8 | - | Unverified
2 | ResNeSt-200 (multi-scale) | AP50 | 70.2 | - | Unverified
3 | CenterMask + VoVNetV2-99 (multi-scale) | AP50 | 66.2 | - | Unverified
4 | CenterMask + VoVNetV2-57 (single-scale) | AP50 | 60.8 | - | Unverified
5 | Co-DETR | mask AP | 57.1 | - | Unverified
6 | CBNetV2 (EVA02, single-scale) | mask AP | 56.1 | - | Unverified
7 | ISDA (ResNet-50) | APL | 55.7 | - | Unverified
8 | EVA | mask AP | 55.5 | - | Unverified
9 | FD-SwinV2-G | mask AP | 55.4 | - | Unverified
10 | Mask Frozen-DETR | mask AP | 55.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | InternImage-B | GFLOPs | 501 | - | Unverified
2 | Co-DETR | mask AP | 56.6 | - | Unverified
3 | ViT-CoMer-L (Mask RCNN, DINOv2) | mask AP | 55.9 | - | Unverified
4 | InternImage-H | mask AP | 55.4 | - | Unverified
5 | EVA | mask AP | 55 | - | Unverified
6 | Mask Frozen-DETR | mask AP | 54.9 | - | Unverified
7 | MasK DINO (SwinL, multi-scale) | mask AP | 54.5 | - | Unverified
8 | GLEE-Pro | mask AP | 54.2 | - | Unverified
9 | ViT-Adapter-L (HTC++, BEiTv2, O365, multi-scale) | mask AP | 54.2 | - | Unverified
10 | SwinV2-G (HTC++) | mask AP | 53.7 | - | Unverified
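
Several entries in the tables above report AP50 or mask AP. Both metrics build on the intersection over union (IoU) between a predicted mask and a ground-truth mask. Under the common COCO-style convention, AP50 counts a prediction as a match when its mask IoU is at least 0.5. The sketch below shows only that IoU test on toy binary masks. The function name `mask_iou`, the toy masks, and the assumption that these leaderboards follow the COCO-style convention are illustrative, and the rest of the AP computation (ranking predictions by score, integrating the precision-recall curve) is omitted.

```python
def mask_iou(pred, gt):
    """Intersection over union of two equally sized binary masks (lists of 0/1 rows)."""
    pairs = [(p, g) for pred_row, gt_row in zip(pred, gt)
             for p, g in zip(pred_row, gt_row)]
    intersection = sum(p & g for p, g in pairs)
    union = sum(p | g for p, g in pairs)
    # Two empty masks have no overlap to measure; report 0.0 by convention here.
    return intersection / union if union else 0.0


IOU_THRESHOLD = 0.5  # assumed COCO-style threshold behind "AP50"

ground_truth = [[0, 1, 1],
                [0, 1, 1]]
loose_pred = [[1, 1, 0],
              [1, 1, 0]]   # overlaps the object in only 2 pixels
tight_pred = [[0, 1, 1],
              [0, 1, 0]]   # misses a single object pixel

loose_match = mask_iou(loose_pred, ground_truth) >= IOU_THRESHOLD
tight_match = mask_iou(tight_pred, ground_truth) >= IOU_THRESHOLD
```

In this toy case only `tight_pred` clears the threshold, so only it would count as a true positive at the AP50 operating point.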