SOTAVerified

Video Instance Segmentation

The goal of video instance segmentation is the simultaneous detection, segmentation, and tracking of instances in videos. In other words, it extends the image instance segmentation problem to the video domain for the first time.
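To make the task definition concrete, here is a minimal sketch of how one instance tracked through a video might be represented, and how two such tracks could be compared. It is not taken from any listed paper. The `InstanceTrack` layout and the `track_iou` helper are hypothetical names chosen for illustration. The comparison assumes a video-level IoU in which intersection and union are accumulated over all frames, the usual idea behind mask AP for this task.

```python
from dataclasses import dataclass
from typing import List, Optional

# Binary mask for a single frame: rows of 0/1 values (illustrative format).
Mask = List[List[int]]


@dataclass
class InstanceTrack:
    """One instance across a whole video (hypothetical layout)."""
    category: str
    score: float
    # One entry per frame; None where the instance is not visible.
    masks: List[Optional[Mask]]


def track_iou(a: InstanceTrack, b: InstanceTrack) -> float:
    """Video-level IoU: intersection and union are summed over all frames."""
    if len(a.masks) != len(b.masks):
        raise ValueError("tracks must cover the same number of frames")
    inter = 0
    union = 0
    for mask_a, mask_b in zip(a.masks, b.masks):
        if mask_a is None and mask_b is None:
            continue  # instance absent in both tracks for this frame
        if mask_a is None or mask_b is None:
            # Only one track has a mask here: it adds to the union only.
            present = mask_a if mask_a is not None else mask_b
            union += sum(sum(row) for row in present)
            continue
        for row_a, row_b in zip(mask_a, mask_b):
            for pa, pb in zip(row_a, row_b):
                inter += 1 if (pa and pb) else 0
                union += 1 if (pa or pb) else 0
    return inter / union if union else 0.0


# Usage: a 2-frame video with 2x2 masks; the prediction misses frame 2.
pred = InstanceTrack("person", 0.9, [[[1, 1], [0, 0]], None])
gt = InstanceTrack("person", 1.0, [[[1, 0], [0, 0]], [[1, 0], [0, 0]]])
iou = track_iou(pred, gt)
```

Because a missing frame contributes to the union but not to the intersection, a track that segments well but loses the instance partway through is penalized, which reflects the tracking part of the task.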

To facilitate research on this new task, a large-scale benchmark called YouTube-VIS was built. It consists of 2,883 high-resolution YouTube videos, a 40-category label set, and 131k high-quality instance masks.

Papers

Showing 1-10 of 148 papers

Title | Status | Hype
Beyond Appearance: Geometric Cues for Robust Video Instance Segmentation | - | 0
A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects | - | 0
SAM2Auto: Auto Annotation Using FLASH | - | 0
ThinkVideo: High-Quality Reasoning Video Segmentation with Chain of Thoughts | Code | 0
FlowCut: Unsupervised Video Instance Segmentation via Temporal Mask Matching | - | 0
MoSAM: Motion-Guided Segment Anything Model with Spatial-Temporal Memory Selection | - | 0
RipVIS: Rip Currents Video Instance Segmentation Benchmark for Beach Monitoring and Safety | - | 0
A Temporal Modeling Framework for Video Pre-Training on Video Instance Segmentation | - | 0
Minimizing Labeled, Maximizing Unlabeled: An Image-Driven Approach for Video Instance Segmentation | - | 0
Decoupled Motion Expression Video Segmentation | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Temporal ROI Align | mask AP | 38 | - | Unverified