MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation Nov 28, 2024 Data Augmentation Image Segmentation
Code Code Available 1MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding Apr 26, 2021 Generalized Referring Expression Comprehension Phrase Grounding
Code Code Available 1Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation Apr 6, 2022 Optical Flow Estimation Referring Expression Segmentation
Code Code Available 1MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation Jan 23, 2025 Referring Expression Segmentation Referring Video Object Segmentation
Code Code Available 1Multi-Attention Network for Compressed Video Referring Object Segmentation Jul 26, 2022 Object Referring Expression Segmentation
Code Code Available 1Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts Nov 16, 2021 Cross-Modal Retrieval Image Captioning
Code Code Available 1Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation Mar 19, 2020 Generalized Referring Expression Comprehension Referring Expression
Code Code Available 1Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints Jan 12, 2025 Image Segmentation Referring Expression
Code Code Available 1OCID-Ref: A 3D Robotic Dataset with Embodied Language for Clutter Scene Grounding Mar 13, 2021 Referring Expression Referring Expression Segmentation
Code Code Available 1OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation Jul 18, 2023 Referring Expression Segmentation Referring Video Object Segmentation
Code Code Available 1PhraseCut: Language-based Image Segmentation in the Wild Aug 3, 2020 Attribute Diversity
Code Code Available 1PixFoundation: Are We Heading in the Right Direction with Pixel-level Vision Foundation Models? Feb 6, 2025 Question Answering Referring Expression
Code Code Available 1PolyFormer: Referring Image Segmentation as Sequential Polygon Generation Feb 14, 2023 Decoder Image Segmentation
Code Code Available 1Image Segmentation Using Text and Image Prompts Dec 18, 2021 Decoder Image Segmentation
Code Code Available 1Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus Jul 4, 2022 Referring Expression Segmentation Referring Video Object Segmentation
Code Code Available 1Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation May 25, 2023 Object Referring Expression Segmentation
Code Code Available 1Referring Image Segmentation Using Text Supervision Aug 28, 2023 Image Segmentation Object Localization
Code Code Available 1Referring Image Segmentation via Cross-Modal Progressive Comprehension Oct 1, 2020 Attribute Image Segmentation
Code Code Available 1Referring Transformer: A One-step Approach to Multi-task Visual Grounding Jun 6, 2021 Decoder Referring Expression
Code Code Available 1RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation Oct 1, 2020 Image Segmentation Referring Expression Segmentation
Code Code Available 1RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation Dec 3, 2024 Referring Expression Referring Expression Segmentation
Code Code Available 1SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation Jun 3, 2024 Pseudo Label Referring Expression
Code Code Available 1SeqTR: A Simple yet Universal Network for Visual Grounding Mar 30, 2022 Decoder Referring Expression
Code Code Available 1SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation May 26, 2023 cross-modal alignment Object
Code Code Available 1Spectrum-guided Multi-granularity Referring Video Object Segmentation Jul 25, 2023 Object Referring Expression Segmentation
Code Code Available 1SynthRef: Generation of Synthetic Referring Expressions for Object Segmentation Jun 8, 2021 Object object-detection
Code Code Available 1Temporally Consistent Referring Video Object Segmentation with Hybrid Memory Mar 28, 2024 HTR Object
Code Code Available 1Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation Dec 13, 2023 Descriptive Object
Code Code Available 1URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark Aug 1, 2020 Object One-shot visual object segmentation
Code Code Available 1ViLLa: Video Reasoning Segmentation with Large Language Model Jul 18, 2024 Image Segmentation Language Modeling
Code Code Available 1Vision-Language Transformer and Query Generation for Referring Segmentation Aug 12, 2021 Decoder Generalized Referring Expression Comprehension
Code Code Available 1 3D-GRES: Generalized 3D Referring Expression Segmentation Jul 30, 2024 Object Referring Expression
Code Code Available 1Actor and Action Modular Network for Text-based Video Segmentation Nov 2, 2020 Action Segmentation Action Understanding
— Unverified 0Fully and Weakly Supervised Referring Expression Segmentation with End-to-End Learning Dec 17, 2022 Position Referring Expression
— Unverified 0See-Through-Text Grouping for Referring Image Segmentation Oct 1, 2019 Image Segmentation object-detection
— Unverified 0SegLLM: Multi-round Reasoning Segmentation Oct 24, 2024 Reasoning Segmentation Referring Expression
— Unverified 0Lyrics: Boosting Fine-grained Language-Vision Alignment and Comprehension via Semantic-aware Visual Objects Dec 8, 2023 Image Captioning object-detection
— Unverified 0MaIL: A Unified Mask-Image-Language Trimodal Network for Referring Image Segmentation Nov 21, 2021 Decoder Image Segmentation
— Unverified 0Mask-aware Text-to-Image Retrieval: Referring Expression Segmentation Meets Cross-modal Retrieval Jun 28, 2025 Cross-Modal Retrieval Image Captioning
— Unverified 0Segment Every Reference Object in Spatial and Temporal Spaces Jan 1, 2023 Image Segmentation Object
— Unverified 0Context Modulated Dynamic Networks for Actor and Action Video Segmentation with Language Queries Apr 3, 2020 Referring Expression Segmentation Video Segmentation
— Unverified 0Meta Compositional Referring Expression Segmentation Apr 10, 2023 Meta-Learning Referring Expression
— Unverified 0 3DResT: A Strong Baseline for Semi-Supervised 3D Referring Expression Segmentation Apr 17, 2025 Referring Expression Referring Expression Segmentation
— Unverified 0Video Object Segmentation with Language Referring Expressions Mar 21, 2018 Object Referring Expression Segmentation
— Unverified 0Weakly-supervised segmentation of referring expressions May 10, 2022 Image Segmentation Referring Expression
— Unverified 0Multi-Level Representation Learning With Semantic Alignment for Referring Video Object Segmentation Jan 1, 2022 Object Referring Expression Segmentation
— Unverified 0WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation Jun 19, 2023 cross-modal alignment Image Segmentation
— Unverified 0Task-aware Cross-modal Feature Refinement Transformer with Large Language Models for Visual Grounding Jan 1, 2025 Referring Expression Referring Expression Comprehension
— Unverified 0Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation May 14, 2021 Decoder feature selection
— Unverified 0CLIPUNetr: Assisting Human-robot Interface for Uncalibrated Visual Servoing Control with CLIP-driven Referring Expression Segmentation Sep 17, 2023 Decoder Referring Expression
— Unverified 0