SOTAVerified

Referring Video Object Segmentation

Referring video object segmentation aims to segment the object in a video that a given language expression refers to. Unlike earlier video object segmentation tasks, it relies on a different type of supervision, language expressions, to identify and segment the referred object in the video.

Papers

Showing 51–74 of 74 papers

Title | Status | Hype
Spectrum-guided Multi-granularity Referring Video Object Segmentation | Code | 1
OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation | Code | 1
RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation | Code | 1
Bidirectional Correlation-Driven Inter-Frame Interaction Transformer for Referring Video Object Segmentation | - | 0
LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation | Code | 1
SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation | Code | 1
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation | Code | 1
Universal Instance Perception as Object Discovery and Retrieval | Code | 3
Robust Referring Video Object Segmentation with Cyclic Structural Consensus | - | 0
HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation | - | 0
Segment Every Reference Object in Spatial and Temporal Spaces | - | 0
1st Place Solution for YouTubeVOS Challenge 2022: Referring Video Object Segmentation | Code | 1
VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | Code | 2
Multi-Attention Network for Compressed Video Referring Object Segmentation | Code | 1
Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus | Code | 1
The Second Place Solution for The 4th Large-scale Video Object Segmentation Challenge--Track 3: Referring Video Object Segmentation | - | 0
Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation | Code | 1
Local-Global Context Aware Transformer for Language-Guided Video Segmentation | Code | 1
Language as Queries for Referring Video Object Segmentation | Code | 2
Multi-Level Representation Learning With Semantic Alignment for Referring Video Object Segmentation | - | 0
End-to-End Referring Video Object Segmentation with Multimodal Transformers | Code | 1
Rethinking Cross-modal Interaction from a Top-down Perspective for Referring Video Object Segmentation | - | 0
URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark | Code | 1
Cross-Modal Self-Attention Network for Referring Image Segmentation | Code | 0

No leaderboard results yet.