VideoMolmo: Spatio-Temporal Grounding Meets Pointing; Jun 5, 2025; Autonomous Driving, Autonomous Navigation; Code Available 2
InterRVOS: Interaction-aware Referring Video Object Segmentation; Jun 3, 2025; 8k, Object; Unverified 0
Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation; May 19, 2025; Referring Video Object Segmentation, Semantic Segmentation; Unverified 0
Few-Shot Referring Video Single- and Multi-Object Segmentation via Cross-Modal Affinity with Instance Sequence Matching; Apr 18, 2025; Object, Referring Video Object Segmentation; Code Available 0
GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation; Apr 10, 2025; Contrastive Learning, Language Modeling; Code Available 2
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation; Apr 7, 2025; Inference Optimization, Referring Video Object Segmentation; Code Available 5
4th PVUW MeViS 3rd Place Report: Sa2VA; Apr 1, 2025; Language Modeling, Language Modelling; Code Available 5
ReferDINO-Plus: 2nd Solution for 4th PVUW MeViS Challenge at CVPR 2025; Mar 30, 2025; Object, Referring Video Object Segmentation; Code Available 0
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation; Mar 5, 2025; Object, Referring Video Object Segmentation; Code Available 2
ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations; Jan 24, 2025; Decoder, Object; Unverified 0
MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation; Jan 23, 2025; Referring Expression Segmentation, Referring Video Object Segmentation; Code Available 1
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling; Jan 21, 2025; Object Tracking, Referring Expression Segmentation; Code Available 0
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation; Jan 15, 2025; Reasoning Segmentation, Referring Expression Segmentation; Code Available 2
Multi-Context Temporal Consistent Modeling for Referring Video Object Segmentation; Jan 9, 2025; Referring Video Object Segmentation, Semantic Segmentation; Code Available 0
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos; Jan 7, 2025; 2k, Language Modeling; Code Available 5
DTOS: Dynamic Time Object Sensing with Large Multimodal Model; Jan 1, 2025; Moment Retrieval, Referring Video Object Segmentation; Code Available 0
Semantic and Sequential Alignment for Referring Video Object Segmentation; Jan 1, 2025; Instance Segmentation, Referring Video Object Segmentation; Unverified 0
Decoupled Motion Expression Video Segmentation; Jan 1, 2025; Instance Segmentation, Referring Video Object Segmentation; Unverified 0
Referring Video Object Segmentation via Language-aligned Track Selection; Dec 2, 2024; Object, Object Tracking; Code Available 1
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation; Nov 26, 2024; Natural Language Understanding, Referring Video Object Segmentation; Code Available 3
HyperSeg: Towards Universal Visual Segmentation with Large Language Model; Nov 26, 2024; Language Modeling, Large Language Model; Code Available 2
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos; Sep 29, 2024; All, Image Segmentation; Code Available 2
LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation; Sep 9, 2024; Object, Referring Video Object Segmentation; Unverified 0
The 2nd Solution for LSVOS Challenge RVOS Track: Spatial-temporal Refinement for Consistent Semantic Segmentation; Aug 22, 2024; Referring Video Object Segmentation, Segmentation; Unverified 0
The Instance-centric Transformer for the RVOS Track of LSVOS Challenge: 3rd Place Solution; Aug 20, 2024; Referring Video Object Segmentation, Retrieval; Unverified 0
UNINEXT-Cutie: The 1st Solution for LSVOS Challenge RVOS Track; Aug 19, 2024; Referring Video Object Segmentation, Semantic Segmentation; Unverified 0
VISA: Reasoning Video Object Segmentation via Large Language Models; Jul 16, 2024; Decoder, Object; Code Available 3
ActionVOS: Actions as Prompts for Video Object Segmentation; Jul 10, 2024; Object, Referring Video Object Segmentation; Code Available 1
2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation; Jun 20, 2024; Instance Segmentation, Referring Video Object Segmentation; Unverified 0
GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation; Jun 18, 2024; Contrastive Learning, Object; Unverified 0
1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation; Jun 11, 2024; Referring Video Object Segmentation, Segmentation; Code Available 1
3rd Place Solution for MeViS Track in CVPR 2024 PVUW workshop: Motion Expression guided Video Segmentation; Jun 7, 2024; Referring Video Object Segmentation, Semantic Segmentation; Unverified 0
Harnessing Vision-Language Pretrained Models with Temporal-Aware Adaptation for Referring Video Object Segmentation; May 17, 2024; Referring Expression Segmentation, Referring Video Object Segmentation; Unverified 0
Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding; Apr 12, 2024; Decoder, Image Segmentation; Code Available 0
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation; Apr 4, 2024; Contrastive Learning, Referring Expression; Code Available 2
Temporally Consistent Referring Video Object Segmentation with Hybrid Memory; Mar 28, 2024; HTR, Object; Code Available 1
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation; Mar 18, 2024; Referring Video Object Segmentation, Semantic Segmentation; Code Available 1
UniVS: Unified and Universal Video Segmentation with Prompts as Queries; Feb 28, 2024; Decoder, Referring Expression Segmentation; Code Available 3
1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation; Jan 1, 2024; Object, Referring Video Object Segmentation; Code Available 1
Tracking with Human-Intent Reasoning; Dec 29, 2023; Language Modelling, Object; Code Available 1
UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces; Dec 25, 2023; Image Segmentation, Object; Code Available 2
General Object Foundation Model for Images and Videos at Scale; Dec 14, 2023; Instance Segmentation, Long-tail Video Object Segmentation; Code Available 3
Fully Transformer-Equipped Architecture for End-to-End Referring Video Object Segmentation; Sep 21, 2023; Object, Referring Video Object Segmentation; Unverified 0
Temporal Collection and Distribution for Referring Video Object Segmentation; Sep 7, 2023; Object, Referring Video Object Segmentation; Unverified 0
Tracking Anything with Decoupled Video Segmentation; Sep 7, 2023; Open-Vocabulary Video Segmentation, Open-World Video Segmentation; Code Available 3
Learning Cross-Modal Affinity for Referring Video Object Segmentation Targeting Limited Samples; Sep 5, 2023; Referring Video Object Segmentation, Semantic Segmentation; Code Available 0
MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions; Aug 16, 2023; Motion Expressions Guided Video Segmentation, Object; Code Available 2
Expression Prompt Collaboration Transformer for Universal Referring Video Object Segmentation; Aug 8, 2023; Contrastive Learning, Object; Code Available 0
Learning Referring Video Object Segmentation from Weak Annotation; Aug 4, 2023; Contrastive Learning, Object; Unverified 0
LISA: Reasoning Segmentation via Large Language Model; Aug 1, 2023; Language Modeling, Language Modelling; Code Available 4