SOTAVerified

Video Instance Segmentation

The goal of video instance segmentation is the simultaneous detection, segmentation, and tracking of instances in videos. In other words, it extends the image instance segmentation problem to the video domain for the first time.

To facilitate research on this new task, a large-scale benchmark called YouTube-VIS was built. It consists of 2,883 high-resolution YouTube videos, a 40-category label set, and 131k high-quality instance masks.
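Because an instance in this task is a sequence of per-frame masks that share one identity, a prediction is usually compared with the ground truth over the whole video rather than frame by frame. The sketch below shows one common formulation, an IoU accumulated over all frames. The function names and the nested-list mask representation are assumptions made for illustration; this is not the benchmark's official evaluation code.

```python
from typing import List, Optional, Set, Tuple

# Assumed representation: one binary mask per frame (1 = instance pixel),
# or None when the instance is absent in that frame.
Mask = List[List[int]]


def _pixels(mask: Optional[Mask]) -> Set[Tuple[int, int]]:
    """Return the set of (row, col) positions covered by a mask."""
    if mask is None:
        return set()
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def spatio_temporal_iou(pred: List[Optional[Mask]], gt: List[Optional[Mask]]) -> float:
    """IoU of two mask sequences, with intersection and union summed over frames."""
    if len(pred) != len(gt):
        raise ValueError("both sequences must cover the same frames")
    intersection = 0
    union = 0
    for p, g in zip(pred, gt):
        p_px, g_px = _pixels(p), _pixels(g)
        intersection += len(p_px & g_px)
        union += len(p_px | g_px)
    # Two sequences that are empty in every frame have no overlap to measure.
    return intersection / union if union else 0.0


# Three-frame toy example: the prediction over-segments frame 1,
# matches frame 2, and loses the instance in frame 3.
pred = [[[1, 1], [0, 0]], [[1, 0], [0, 0]], None]
gt = [[[1, 0], [0, 0]], [[1, 0], [0, 0]], [[0, 1], [0, 0]]]
print(spatio_temporal_iou(pred, gt))  # 0.5
```

A missed frame lowers the score just as a poor mask does, which is how tracking quality enters a mask-based metric under this formulation.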

Papers

Showing 1-50 of 148 papers

Title | Status | Hype
Beyond Appearance: Geometric Cues for Robust Video Instance Segmentation | | 0
A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects | | 0
SAM2Auto: Auto Annotation Using FLASH | | 0
ThinkVideo: High-Quality Reasoning Video Segmentation with Chain of Thoughts | Code | 0
FlowCut: Unsupervised Video Instance Segmentation via Temporal Mask Matching | | 0
MoSAM: Motion-Guided Segment Anything Model with Spatial-Temporal Memory Selection | | 0
RipVIS: Rip Currents Video Instance Segmentation Benchmark for Beach Monitoring and Safety | | 0
A Temporal Modeling Framework for Video Pre-Training on Video Instance Segmentation | | 0
Minimizing Labeled, Maximizing Unlabeled: An Image-Driven Approach for Video Instance Segmentation | | 0
Decoupled Motion Expression Video Segmentation | | 0
Towards Real-Time Open-Vocabulary Video Instance Segmentation | Code | 0
A2VIS: Amodal-Aware Approach to Video Instance Segmentation | | 0
SyncVIS: Synchronized Video Instance Segmentation | Code | 1
Self-supervised Video Instance Segmentation Can Boost Geographic Entity Alignment in Historical Maps | | 0
SDI-Paste: Synthetic Dynamic Instance Copy-Paste for Video Instance Segmentation | | 0
Foundation Models for Amodal Video Instance Segmentation in Automated Driving | Code | 0
Are Deep Learning Models Robust to Partial Object Occlusion in Visual Recognition Tasks? | | 0
MouseSIS: A Frames-and-Events Dataset for Space-Time Instance Segmentation of Mice | Code | 1
Improving Weakly-supervised Video Instance Segmentation by Leveraging Spatio-temporal Consistency | Code | 1
LSVOS Challenge 3rd Place Report: SAM2 and Cutie based VOS | | 0
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation | Code | 1
Context-Aware Video Instance Segmentation | Code | 2
PM-VIS+: High-Performance Video Instance Segmentation without Video Annotation | Code | 0
2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation | | 0
UVIS: Unsupervised Video Instance Segmentation | | 0
1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation | | 0
PM-VIS: High-Performance Box-Supervised Video Instance Segmentation | | 0
OW-VISCapTor: Abstractors for Open-World Video Instance Segmentation and Captioning | | 0
What is Point Supervision Worth in Video Instance Segmentation? | | 0
DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries | Code | 2
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | Code | 7
CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation | Code | 1
UniVS: Unified and Universal Video Segmentation with Prompts as Queries | Code | 3
TDViT: Temporal Dilated Video Transformer for Dense Video Tasks | Code | 1
Spatio-temporal Prompting Network for Robust Video Feature Extraction | Code | 1
Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation | Code | 1
DVIS++: Improved Decoupled Framework for Universal Video Segmentation | Code | 1
General Object Foundation Model for Images and Videos at Scale | Code | 3
TMT-VIS: Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation | Code | 1
VISAGE: Video Instance Segmentation with Appearance-Guided Enhancement | Code | 1
Video Instance Matting | Code | 1
CML-MOTS: Collaborative Multi-task Learning for Multi-Object Tracking and Segmentation | | 0
Deep Learning Techniques for Video Instance Segmentation: A Survey | | 0
TCOVIS: Temporally Consistent Online Video Instance Segmentation | Code | 0
NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation | | 0
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation | Code | 3
1st Place Solution for the 5th LSVOS Challenge: Video Instance Segmentation | Code | 1
1st Place Solution for CVPR2023 BURST Long Tail and Open World Challenges | | 0
CTVIS: Consistent Training for Online Video Instance Segmentation | Code | 1
Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CAVIS (ViT-L, Offline) | mask AP | 57.1 | | Unverified
2 | DVIS-DAQ (ViT-L, Offline) | mask AP | 57.1 | | Unverified
3 | DVIS++ (ViT-L, Offline) | mask AP | 53.4 | | Unverified
4 | GLEE-Pro | mask AP | 50.4 | | Unverified
5 | DVIS (Swin-L, Offline) | mask AP | 49.9 | | Unverified
6 | DVIS++ (ViT-L, Online) | mask AP | 49.6 | | Unverified
7 | UNINEXT (ViT-H, Online) | mask AP | 49 | | Unverified
8 | DVIS (Swin-L, Online) | mask AP | 47.1 | | Unverified
9 | CTVIS (Swin-L) | mask AP | 46.9 | | Unverified
10 | RefineVIS (Swin-L, offline) | mask AP | 46 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | CAVIS (ViT-L, Online) | mask AP | 68.9 | | Unverified
2 | DVIS++ (ViT-L, Online) | mask AP | 67.7 | | Unverified
3 | DVIS | mask AP | 64.9 | | Unverified
4 | Tube-Link | mask AP | 64.6 | | Unverified
5 | MinVIS (Swin-L) | mask AP | 61.6 | | Unverified
6 | Mask2Former (Swin-L) | mask AP | 60.4 | | Unverified
7 | UniVS (Swin-L) | mask AP | 60 | | Unverified
8 | MDQE (Swin-L) | mask AP | 59.9 | | Unverified
9 | SeqFormer (Swin-L) | mask AP | 59.3 | | Unverified
10 | DeVIS (Swin-L) | mask AP | 57.1 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | CAVIS (ViT-L, Offline) | mask AP | 65.3 | | Unverified
2 | DVIS-DAQ (ViT-L, Offline) | mask AP | 64.5 | | Unverified
3 | DVIS++ (ViT-L, Offline) | mask AP | 63.9 | | Unverified
4 | DVIS++ (ViT-L, Online) | mask AP | 62.3 | | Unverified
5 | RefineVIS (Swin-L, online) | mask AP | 61.4 | | Unverified
6 | GRAtt-VIS (Swin-L) | mask AP | 60.3 | | Unverified
7 | TarViS (Swin-L) | mask AP | 60.2 | | Unverified
8 | DVIS (Swin-L) | mask AP | 60.1 | | Unverified
9 | GenVIS (Swin-L) | mask AP | 60.1 | | Unverified
10 | NOVIS (Swin-L) | mask AP | 59.8 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | DVIS++ (ViT-L) | mAP_L | 50.9 | | Unverified
2 | CAVIS (ViT-L) | mAP_L | 48.6 | | Unverified
3 | CTVIS (Swin-L) | mAP_L | 46.4 | | Unverified
4 | DVIS (Swin-L) | mAP_L | 45.9 | | Unverified
5 | CTVIS (ResNet-50) | mAP_L | 39.4 | | Unverified
6 | InstanceFormer (Swin) | mAP_L | 26.3 | | Unverified
7 | InstanceFormer (ResNet-50) | mAP_L | 24.8 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | PCAN | mMOTSA | 27.4 | | Unverified
2 | QDTrack-mots-fix | mMOTSA | 23.5 | | Unverified
3 | QDTrack-mots | mMOTSA | 22.5 | | Unverified
4 | MaskTrackRCNN | mMOTSA | 12.3 | | Unverified
5 | STEm-Seg | mMOTSA | 12.2 | | Unverified
6 | SortIoU | mMOTSA | 10.3 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | VMT (Swin-L) | Tube-Boundary AP | 44.8 | | Unverified
2 | SeqFormer (Swin-L) | Tube-Boundary AP | 43.3 | | Unverified
3 | VMT (R101) | Tube-Boundary AP | 32.5 | | Unverified
4 | VMT (R50) | Tube-Boundary AP | 30.7 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | Temporal ROI Align | mask AP | 38 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | MaskFreeVIS | AP | 55.3 | | Unverified