Referring Video Object Segmentation

Referring video object segmentation aims to segment an object in a video given a language expression that refers to it. Unlike conventional video object segmentation, the task uses a different type of supervision, a language expression, to identify and segment the referred object in the video.

Papers

Showing 1-10 of 74 papers

Title | Status | Hype
VideoMolmo: Spatio-Temporal Grounding Meets Pointing | Code | 2
InterRVOS: Interaction-aware Referring Video Object Segmentation | - | 0
Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation | - | 0
Few-Shot Referring Video Single- and Multi-Object Segmentation via Cross-Modal Affinity with Instance Sequence Matching | Code | 0
GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Code | 2
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation | Code | 5
4th PVUW MeViS 3rd Place Report: Sa2VA | Code | 5
ReferDINO-Plus: 2nd Solution for 4th PVUW MeViS Challenge at CVPR 2025 | Code | 0
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | Code | 2
ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations | - | 0

No leaderboard results yet.