SOTAVerified

Video Object Segmentation

Video object segmentation is a binary labeling problem aiming to separate foreground object(s) from the background region of a video.
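As a rough illustration of the binary labeling view, each frame can be treated as a grid of foreground/background labels, and a predicted mask can be compared with a ground-truth mask using the Jaccard index (intersection over union), one of the metrics reported in the benchmark results. The sketch below is a minimal example under stated assumptions, not the evaluation code of any benchmark: the names `Mask`, `jaccard`, and `mean_jaccard` are hypothetical, masks are plain nested lists, and scoring two empty masks as 1.0 is an assumed convention.

```python
from typing import List

# One frame: H rows x W columns, True = foreground, False = background.
Mask = List[List[bool]]


def jaccard(pred: Mask, truth: Mask) -> float:
    """Intersection over union of two binary masks of equal size."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += int(p and t)
            union += int(p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_jaccard(pred_video: List[Mask], truth_video: List[Mask]) -> float:
    """Average the per-frame Jaccard index over all frames of a video."""
    scores = [jaccard(p, t) for p, t in zip(pred_video, truth_video)]
    return sum(scores) / len(scores)


# Toy two-frame "video" with 2x2 frames.
pred_video = [
    [[True, True], [False, False]],
    [[False, False], [False, True]],
]
truth_video = [
    [[True, False], [False, False]],
    [[False, False], [False, True]],
]
score = mean_jaccard(pred_video, truth_video)
```

In this toy case the first frame overlaps on one of two labeled pixels and the second frame matches exactly, so the per-frame scores are averaged into a single number for the video. Benchmarks may aggregate per object or per sequence and may combine this with a boundary F-measure, which is not shown here.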

For leaderboards, please refer to the individual subtasks.

Papers

Showing 1-50 of 551 papers

| Title | Status | Hype |
| --- | --- | --- |
| SAM 2: Segment Anything in Images and Videos | Code | 11 |
| Efficient Track Anything | Code | 7 |
| The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation | Code | 5 |
| Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Code | 5 |
| 4th PVUW MeViS 3rd Place Report: Sa2VA | Code | 5 |
| OMG-Seg: Is One Model Good Enough For All Segmentation? | Code | 5 |
| SiamMask: A Framework for Fast Online Object Tracking and Segmentation | Code | 4 |
| SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree | Code | 4 |
| SegGPT: Segmenting Everything In Context | Code | 4 |
| PVUW 2024 Challenge on Complex Video Understanding: Methods and Results | Code | 4 |
| XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model | Code | 3 |
| VISA: Reasoning Video Object Segmentation via Large Language Models | Code | 3 |
| UniVS: Unified and Universal Video Segmentation with Prompts as Queries | Code | 3 |
| Tracking Anything with Decoupled Video Segmentation | Code | 3 |
| Segment Anything Meets Point Tracking | Code | 3 |
| Putting the Object Back into Video Object Segmentation | Code | 3 |
| SMITE: Segment Me In TimE | Code | 3 |
| PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | Code | 3 |
| SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation | Code | 3 |
| General Object Foundation Model for Images and Videos at Scale | Code | 3 |
| Moving Object Segmentation: All You Need Is SAM (and Flow) | Code | 3 |
| Personalize Segment Anything Model with One Shot | Code | 3 |
| XMem++: Production-level Video Segmentation From Few Annotated Frames | Code | 2 |
| Efficient Video Object Segmentation via Modulated Cross-Attention Memory | Code | 2 |
| Vivim: a Video Vision Mamba for Medical Video Segmentation | Code | 2 |
| VideoMolmo: Spatio-Temporal Grounding Meets Pointing | Code | 2 |
| Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation | Code | 2 |
| Video Object Segmentation in Panoptic Wild Scenes | Code | 2 |
| VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | Code | 2 |
| UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces | Code | 2 |
| Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation | Code | 2 |
| Tracking Anything in High Quality | Code | 2 |
| Video Polyp Segmentation: A Deep Learning Perspective | Code | 2 |
| Scalable Video Object Segmentation with Identification Mechanism | Code | 2 |
| Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration | Code | 2 |
| One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos | Code | 2 |
| LVOS: A Benchmark for Large-scale Long-term Video Object Segmentation | Code | 2 |
| Decoupling Features in Hierarchical Propagation for Video Object Segmentation | Code | 2 |
| MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | Code | 2 |
| In Defense of Online Models for Video Instance Segmentation | Code | 2 |
| IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos | Code | 2 |
| Language as Queries for Referring Video Object Segmentation | Code | 2 |
| MOSE: A New Dataset for Video Object Segmentation in Complex Scenes | Code | 2 |
| GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Code | 2 |
| Fast Online Object Tracking and Segmentation: A Unifying Approach | Code | 2 |
| Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | Code | 2 |
| HyperSeg: Towards Universal Visual Segmentation with Large Language Model | Code | 2 |
| A Deeper Dive Into What Deep Spatiotemporal Networks Encode: Quantifying Static vs. Dynamic Information | Code | 1 |
| Accelerating Video Object Segmentation with Compressed Video | Code | 1 |
| Emerging Properties in Self-Supervised Vision Transformers | Code | 1 |

Benchmark Results

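| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | AOC-MF (val) | F-Score | 94.7 | | Unverified |
| 2 | ISVOS (BL30K, MS) | J&F | 93.4 | | Unverified |
| 3 | XMem (BL30K, MS) | J&F | 93.3 | | Unverified |
| 4 | BATMAN (val) | J&F | 92.5 | | Unverified |
| 5 | STCN (val) | J&F | 91.6 | | Unverified |
| 6 | XMem | J&F | 91.5 | | Unverified |
| 7 | MobileVOS (val) | J&F | 91.4 | | Unverified |
| 8 | AOT (val) | J&F | 91.1 | | Unverified |
| 9 | LCM (val) | J&F | 90.7 | | Unverified |
| 10 | RPCMVOS (val) | J&F | 90.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | XMem (BL30K, MS) | Mean Jaccard & F-Measure | 89.5 | | Unverified |
| 2 | LCM | F-measure | 86.5 | | Unverified |
| 3 | XMem | Mean Jaccard & F-Measure | 86.2 | | Unverified |
| 4 | BATMAN | Mean Jaccard & F-Measure | 86.2 | | Unverified |
| 5 | STCN | Mean Jaccard & F-Measure | 85.4 | | Unverified |
| 6 | AOT | Mean Jaccard & F-Measure | 84.9 | | Unverified |
| 7 | STM | F-measure | 84.3 | | Unverified |
| 8 | TransVOS | Mean Jaccard & F-Measure | 83.9 | | Unverified |
| 9 | RPCMVOS | Mean Jaccard & F-Measure | 83.7 | | Unverified |
| 10 | RMN | Mean Jaccard & F-Measure | 83.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | XMem (BL30K, MS) | Mean Jaccard & F-Measure | 86.9 | | Unverified |
| 2 | AOT | Mean Jaccard & F-Measure | 84.1 | | Unverified |
| 3 | RPCMVOS | Mean Jaccard & F-Measure | 84 | | Unverified |
| 4 | STCN | Mean Jaccard & F-Measure | 83 | | Unverified |
| 5 | CFBI+ | Mean Jaccard & F-Measure | 82.8 | | Unverified |
| 6 | RMN | Jaccard (Seen) | 82.1 | | Unverified |
| 7 | LCM | Mean Jaccard & F-Measure | 82 | | Unverified |
| 8 | TransVOS | Mean Jaccard & F-Measure | 81.8 | | Unverified |
| 9 | SST | Mean Jaccard & F-Measure | 81.7 | | Unverified |
| 10 | LWL | Mean Jaccard & F-Measure | 81.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | XMem (BL30K, MS) | Mean Jaccard & F-Measure | 83.7 | | Unverified |
| 2 | XMem | Mean Jaccard & F-Measure | 81 | | Unverified |
| 3 | BATMAN | Jaccard | 78.4 | | Unverified |
| 4 | AOT | Jaccard | 75.9 | | Unverified |
| 5 | RPCMVOS | Jaccard | 75.8 | | Unverified |
| 6 | LCM | Jaccard | 74.4 | | Unverified |
| 7 | KMN | Jaccard | 74.1 | | Unverified |
| 8 | TransVOS | Jaccard | 73 | | Unverified |
| 9 | STCN | Jaccard | 72.7 | | Unverified |
| 10 | RMN | Jaccard | 71.9 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | XMem (BL30K, MS) | Mean Jaccard & F-Measure | 86.8 | | Unverified |
| 2 | XMem | Mean Jaccard & F-Measure | 85.5 | | Unverified |
| 3 | BATMAN | Mean Jaccard & F-Measure | 85 | | Unverified |
| 4 | AOT | Mean Jaccard & F-Measure | 84.1 | | Unverified |
| 5 | RPCMVOS | Mean Jaccard & F-Measure | 83.9 | | Unverified |
| 6 | MobileVOS | Mean Jaccard & F-Measure | 83.3 | | Unverified |
| 7 | STCN | Mean Jaccard & F-Measure | 82.7 | | Unverified |
| 8 | CFBI+ | Mean Jaccard & F-Measure | 82.6 | | Unverified |
| 9 | SST | Mean Jaccard & F-Measure | 81.8 | | Unverified |
| 10 | CFBI | Mean Jaccard & F-Measure | 81 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | AOC-MF (val) | Jaccard (Mean) | 81.7 | | Unverified |
| 2 | ViTAE-T-Stage | Jaccard (Mean) | 79.4 | | Unverified |
| 3 | DINO (ViT-B/8, ImageNet retrain) | J&F | 71.4 | | Unverified |
| 4 | VOSwL (Mask+Language) | mIoU | 59 | | Unverified |
| 5 | UniTrack | mIoU | 58.4 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ReVOS | Average IOU | 75.6 | | Unverified |
| 2 | Cutie-base | Average IOU | 74.6 | | Unverified |
| 3 | XMem | Average IOU | 70.4 | | Unverified |
| 4 | SAM 2 | Average IOU | 69.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DFNet | F-Score | 82.3 | | Unverified |
| 2 | ours | Jaccard (Mean) | 76.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Ours | Average | 74.9 | | Unverified |
| 2 | FEELVOS | mIoU | 0.82 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | LOCATE | mIoU | 68.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Cutie | J&F | 68.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | LOCATE | mIoU | 79.9 | | Unverified |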