SOTAVerified

Referring Video Object Segmentation

Referring video object segmentation aims to segment an object in a video given a language expression that describes it. Unlike conventional video object segmentation, the task relies on a different type of supervision, the language expression, to identify and segment the referred object in the video.

Papers

Showing 1-50 of 74 papers

Title | Status | Hype
4th PVUW MeViS 3rd Place Report: Sa2VA | Code | 5
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Code | 5
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation | Code | 5
LISA: Reasoning Segmentation via Large Language Model | Code | 4
Tracking Anything with Decoupled Video Segmentation | Code | 3
Universal Instance Perception as Object Discovery and Retrieval | Code | 3
UniVS: Unified and Universal Video Segmentation with Prompts as Queries | Code | 3
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation | Code | 3
VISA: Reasoning Video Object Segmentation via Large Language Models | Code | 3
General Object Foundation Model for Images and Videos at Scale | Code | 3
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation | Code | 2
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | Code | 2
GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Code | 2
HyperSeg: Towards Universal Visual Segmentation with Large Language Model | Code | 2
Language as Queries for Referring Video Object Segmentation | Code | 2
MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | Code | 2
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos | Code | 2
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation | Code | 2
UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces | Code | 2
VideoMolmo: Spatio-Temporal Grounding Meets Pointing | Code | 2
VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | Code | 2
Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus | Code | 1
URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark | Code | 1
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation | Code | 1
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation | Code | 1
Referring Video Object Segmentation via Language-aligned Track Selection | Code | 1
1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation | Code | 1
RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation | Code | 1
Local-Global Context Aware Transformer for Language-Guided Video Segmentation | Code | 1
Tracking with Human-Intent Reasoning | Code | 1
LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation | Code | 1
ActionVOS: Actions as Prompts for Video Object Segmentation | Code | 1
1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation | Code | 1
Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation | Code | 1
SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation | Code | 1
Spectrum-guided Multi-granularity Referring Video Object Segmentation | Code | 1
MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation | Code | 1
Multi-Attention Network for Compressed Video Referring Object Segmentation | Code | 1
Temporally Consistent Referring Video Object Segmentation with Hybrid Memory | Code | 1
End-to-End Referring Video Object Segmentation with Multimodal Transformers | Code | 1
1st Place Solution for YouTubeVOS Challenge 2022: Referring Video Object Segmentation | Code | 1
OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation | Code | 1
GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation | | 0
UNINEXT-Cutie: The 1st Solution for LSVOS Challenge RVOS Track | | 0
InterRVOS: Interaction-aware Referring Video Object Segmentation | | 0
Learning Referring Video Object Segmentation from Weak Annotation | | 0
Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation | | 0
LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation | | 0
2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation | | 0
Multi-Level Representation Learning With Semantic Alignment for Referring Video Object Segmentation | | 0

No leaderboard results yet.