Video Instance Segmentation

The goal of video instance segmentation is the simultaneous detection, segmentation, and tracking of object instances in videos. This task was the first to extend the image instance segmentation problem to the video domain.
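To make the three parts of the task concrete, here is a minimal sketch of what one predicted instance could look like: a category and score (detection), one mask per frame (segmentation), all attached to a single identity across the video (tracking). The class name `InstanceTrack`, its fields, and the set-of-pixels mask encoding are illustrative assumptions, not a format defined by this page or by any benchmark.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Pixel = Tuple[int, int]  # (row, col) of a foreground pixel
Mask = Set[Pixel]        # foreground pixels of one instance in one frame


@dataclass
class InstanceTrack:
    """Hypothetical container for one instance followed through a video."""
    category: str                # detection: what the instance is
    score: float                 # detection: confidence for the whole track
    masks: List[Optional[Mask]]  # segmentation: one entry per frame, None if absent

    def visible_frames(self) -> List[int]:
        # Tracking: the frame indices in which this same instance appears.
        return [t for t, m in enumerate(self.masks) if m is not None]


# A three-frame toy video in which the instance is hidden in the middle frame.
track = InstanceTrack(
    category="person",
    score=0.9,
    masks=[{(0, 0), (0, 1)}, None, {(0, 1), (1, 1)}],
)
print(track.visible_frames())  # [0, 2]
```

Keeping one list entry per frame, with `None` for frames where the instance is not visible, keeps the identity stable through occlusions.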

To facilitate research on this new task, a large-scale benchmark called YouTube-VIS was built. It consists of 2,883 high-resolution YouTube videos, a 40-category label set, and 131k high-quality instance masks.

Papers

Showing 1–10 of 148 papers

Title	Date	Tasks	Status
Beyond Appearance: Geometric Cues for Robust Video Instance Segmentation	Jul 8, 2025	Depth Estimation, Depth Prediction	Unverified
A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects	Jun 16, 2025	Benchmarking, Instance Segmentation	Unverified
SAM2Auto: Auto Annotation Using FLASH	Jun 9, 2025	Instance Segmentation, Object	Unverified
ThinkVideo: High-Quality Reasoning Video Segmentation with Chain of Thoughts	May 24, 2025	Image Segmentation, Instance Segmentation	Code Available
FlowCut: Unsupervised Video Instance Segmentation via Temporal Mask Matching	May 19, 2025	Instance Segmentation, Segmentation	Unverified
MoSAM: Motion-Guided Segment Anything Model with Spatial-Temporal Memory Selection	Apr 30, 2025	Instance Segmentation, Interactive Segmentation	Unverified
RipVIS: Rip Currents Video Instance Segmentation Benchmark for Beach Monitoring and Safety	Apr 1, 2025	Instance Segmentation, Segmentation	Unverified
A Temporal Modeling Framework for Video Pre-Training on Video Instance Segmentation	Mar 22, 2025	Instance Segmentation, Semantic Segmentation	Unverified
Minimizing Labeled, Maximizing Unlabeled: An Image-Driven Approach for Video Instance Segmentation	Jan 1, 2025	Instance Segmentation, Semantic Segmentation	Unverified
Decoupled Motion Expression Video Segmentation	Jan 1, 2025	Instance Segmentation, Referring Video Object Segmentation	Unverified

Datasets: OVIS validation, YouTube-VIS validation, YouTube-VIS 2021, YouTube-VIS 2022 Validation, BDD100K val, HQ-YTVIS, YouTube-VIS, YouTube-VIS (trained with no video masks)

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	CAVIS (ViT-L, Online)	mask AP	68.9	—	Unverified
2	DVIS++ (ViT-L, Online)	mask AP	67.7	—	Unverified
3	DVIS	mask AP	64.9	—	Unverified
4	Tube-Link	mask AP	64.6	—	Unverified
5	MinVIS (Swin-L)	mask AP	61.6	—	Unverified
6	Mask2Former (Swin-L)	mask AP	60.4	—	Unverified
7	UniVS (Swin-L)	mask AP	60	—	Unverified
8	MDQE (Swin-L)	mask AP	59.9	—	Unverified
9	SeqFormer (Swin-L)	mask AP	59.3	—	Unverified
10	DeVIS (Swin-L)	mask AP	57.1	—	Unverified
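
The mask AP values in this table depend on matching predicted and ground-truth instances over a whole video, not a single frame. One common way to do this, assumed here, is a spatio-temporal IoU: sum the per-frame intersections and the per-frame unions over all frames, then divide. The sketch below is illustrative only. The function name `video_mask_iou` and the set-of-pixels mask encoding are assumptions, and it is not the official evaluation code of any benchmark listed on this page.

```python
from typing import List, Optional, Set, Tuple

Mask = Set[Tuple[int, int]]  # foreground pixels (row, col) in one frame


def video_mask_iou(pred: List[Optional[Mask]], gt: List[Optional[Mask]]) -> float:
    """Spatio-temporal IoU of two mask sequences of equal length.

    A frame entry of None means the instance is absent in that frame.
    """
    if len(pred) != len(gt):
        raise ValueError("both tracks must cover the same number of frames")
    inter = 0
    union = 0
    for p, g in zip(pred, gt):
        p = p or set()  # treat an absent instance as an empty mask
        g = g or set()
        inter += len(p & g)
        union += len(p | g)
    # Two tracks that are empty in every frame have no overlap to measure.
    return inter / union if union else 0.0


# Toy example: a perfect match in frame 0, a miss in frame 1,
# and a spurious prediction in frame 2.
gt = [{(0, 0), (0, 1)}, {(0, 1), (1, 1)}, None]
pred = [{(0, 0), (0, 1)}, None, {(2, 2)}]
print(video_mask_iou(pred, gt))  # 0.4
```

Because intersections and unions are accumulated before dividing, a track that segments every frame well but loses the instance partway through is penalized for the missed frames, so tracking errors lower the score as well as segmentation errors.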