SOTAVerified

Video Object Segmentation

Video object segmentation is a binary labeling problem aiming to separate foreground object(s) from the background region of a video.

For leaderboards please refer to the different subtasks.

Papers

Showing 2650 of 551 papers

TitleStatusHype
Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object SegmentationCode2
VideoMolmo: Spatio-Temporal Grounding Meets PointingCode2
UniRef++: Segment Every Reference Object in Spatial and Temporal SpacesCode2
VLT: Vision-Language Transformer and Query Generation for Referring SegmentationCode2
Efficient Video Object Segmentation via Modulated Cross-Attention MemoryCode2
Tracking Anything in High QualityCode2
Video Object Segmentation in Panoptic Wild ScenesCode2
Scalable Video Object Segmentation with Identification MechanismCode2
One Token to Seg Them All: Language Instructed Reasoning Segmentation in VideosCode2
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System CollaborationCode2
Language as Queries for Referring Video Object SegmentationCode2
In Defense of Online Models for Video Instance SegmentationCode2
LVOS: A Benchmark for Large-scale Long-term Video Object SegmentationCode2
HyperSeg: Towards Universal Visual Segmentation with Large Language ModelCode2
GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video SegmentationCode2
IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet VideosCode2
MeViS: A Large-scale Benchmark for Video Segmentation with Motion ExpressionsCode2
Fast Online Object Tracking and Segmentation: A Unifying ApproachCode2
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object SegmentationCode2
Decoupling Features in Hierarchical Propagation for Video Object SegmentationCode2
Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object SegmentationCode2
MOSE: A New Dataset for Video Object Segmentation in Complex ScenesCode2
A Deeper Dive Into What Deep Spatiotemporal Networks Encode: Quantifying Static vs. Dynamic InformationCode1
Contrastive Transformation for Self-supervised Correspondence LearningCode1
Emerging Properties in Self-Supervised Vision TransformersCode1
Show:102550
← PrevPage 2 of 23Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1AOC-MF (val)F-Score94.7Unverified
2ISVOS (BL30K, MS)J&F93.4Unverified
3XMem (BL30K, MS)J&F93.3Unverified
4BATMAN (val)J&F92.5Unverified
5STCN (val)J&F91.6Unverified
6XMemJ&F91.5Unverified
7MobileVOS (val)J&F91.4Unverified
8AOT (val)J&F91.1Unverified
9LCM (val)J&F90.7Unverified
10RPCMVOS (val)J&F90.6Unverified
#ModelMetricClaimedVerifiedStatus
1XMem (BLK30K, MS)Mean Jaccard & F-Measure89.5Unverified
2LCMF-measure86.5Unverified
3XMemMean Jaccard & F-Measure86.2Unverified
4BATMANMean Jaccard & F-Measure86.2Unverified
5STCNMean Jaccard & F-Measure85.4Unverified
6AOTMean Jaccard & F-Measure84.9Unverified
7STMF-measure84.3Unverified
8TransVOSMean Jaccard & F-Measure83.9Unverified
9RPCMVOSMean Jaccard & F-Measure83.7Unverified
10RMNMean Jaccard & F-Measure83.5Unverified
#ModelMetricClaimedVerifiedStatus
1XMem (BL30K, MS)Mean Jaccard & F-Measure86.9Unverified
2AOTMean Jaccard & F-Measure84.1Unverified
3RPCMVOSMean Jaccard & F-Measure84Unverified
4STCNMean Jaccard & F-Measure83Unverified
5CFBI+Mean Jaccard & F-Measure82.8Unverified
6RMNJaccard (Seen)82.1Unverified
7LCMMean Jaccard & F-Measure82Unverified
8TransVOSMean Jaccard & F-Measure81.8Unverified
9SSTMean Jaccard & F-Measure81.7Unverified
10LWLMean Jaccard & F-Measure81.5Unverified
#ModelMetricClaimedVerifiedStatus
1XMem (BL30K, MS)Mean Jaccard & F-Measure83.7Unverified
2XMemMean Jaccard & F-Measure81Unverified
3BATMANJaccard78.4Unverified
4AOTJaccard75.9Unverified
5RPCMVOSJaccard75.8Unverified
6LCMJaccard74.4Unverified
7KMNJaccard74.1Unverified
8TransVOSJaccard73Unverified
9STCNJaccard72.7Unverified
10RMNJaccard71.9Unverified
#ModelMetricClaimedVerifiedStatus
1XMem (BL30K,MS)Mean Jaccard & F-Measure86.8Unverified
2XMemMean Jaccard & F-Measure85.5Unverified
3BATMANMean Jaccard & F-Measure85Unverified
4AOTMean Jaccard & F-Measure84.1Unverified
5RPCMVOSMean Jaccard & F-Measure83.9Unverified
6MobileVOSMean Jaccard & F-Measure83.3Unverified
7STCNMean Jaccard & F-Measure82.7Unverified
8CFBI+Mean Jaccard & F-Measure82.6Unverified
9SSTMean Jaccard & F-Measure81.8Unverified
10CFBIMean Jaccard & F-Measure81Unverified
#ModelMetricClaimedVerifiedStatus
1AOC-MF (val)Jaccard (Mean)81.7Unverified
2ViTAE-T-StageJaccard (Mean)79.4Unverified
3DINO (ViT-B/8, ImageNet retrain)J&F71.4Unverified
4VOSwL (Mask+Language)mIoU59Unverified
5UniTrackmIoU58.4Unverified
#ModelMetricClaimedVerifiedStatus
1ReVOSAverage IOU75.6Unverified
2Cutie-baseAverage IOU74.6Unverified
3XMemAverage IOU70.4Unverified
4SAM 2Average IOU69.5Unverified
#ModelMetricClaimedVerifiedStatus
1DFNetF-Score82.3Unverified
2oursJaccard (Mean)76.7Unverified
#ModelMetricClaimedVerifiedStatus
1OursAverage74.9Unverified
2FEELVOSmIoU0.82Unverified
#ModelMetricClaimedVerifiedStatus
1LOCATEmIoU68.8Unverified
#ModelMetricClaimedVerifiedStatus
1CutieJ&F68.3Unverified
#ModelMetricClaimedVerifiedStatus
1LOCATEmIoU79.9Unverified