MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation (Nov 28, 2024). Tasks: Data Augmentation, Image Segmentation. Code available.
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding (Apr 26, 2021). Tasks: Generalized Referring Expression Comprehension, Phrase Grounding. Code available.
Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation (Apr 6, 2022). Tasks: Optical Flow Estimation, Referring Expression Segmentation. Code available.
MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation (Jan 23, 2025). Tasks: Referring Expression Segmentation, Referring Video Object Segmentation. Code available.
Multi-Attention Network for Compressed Video Referring Object Segmentation (Jul 26, 2022). Tasks: Object, Referring Expression Segmentation. Code available.
Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts (Nov 16, 2021). Tasks: Cross-Modal Retrieval, Image Captioning. Code available.
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation (Mar 19, 2020). Tasks: Generalized Referring Expression Comprehension, Referring Expression. Code available.
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints (Jan 12, 2025). Tasks: Image Segmentation, Referring Expression. Code available.
OCID-Ref: A 3D Robotic Dataset with Embodied Language for Clutter Scene Grounding (Mar 13, 2021). Tasks: Referring Expression, Referring Expression Segmentation. Code available.
OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation (Jul 18, 2023). Tasks: Referring Expression Segmentation, Referring Video Object Segmentation. Code available.
PhraseCut: Language-based Image Segmentation in the Wild (Aug 3, 2020). Tasks: Attribute, Diversity. Code available.
PixFoundation: Are We Heading in the Right Direction with Pixel-level Vision Foundation Models? (Feb 6, 2025). Tasks: Question Answering, Referring Expression. Code available.
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation (Feb 14, 2023). Tasks: Decoder, Image Segmentation. Code available.
Image Segmentation Using Text and Image Prompts (Dec 18, 2021). Tasks: Decoder, Image Segmentation. Code available.
Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus (Jul 4, 2022). Tasks: Referring Expression Segmentation, Referring Video Object Segmentation. Code available.
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation (May 25, 2023). Tasks: Object, Referring Expression Segmentation. Code available.
Referring Image Segmentation Using Text Supervision (Aug 28, 2023). Tasks: Image Segmentation, Object Localization. Code available.
Referring Image Segmentation via Cross-Modal Progressive Comprehension (Oct 1, 2020). Tasks: Attribute, Image Segmentation. Code available.
Referring Transformer: A One-step Approach to Multi-task Visual Grounding (Jun 6, 2021). Tasks: Decoder, Referring Expression. Code available.
RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation (Oct 1, 2020). Tasks: Image Segmentation, Referring Expression Segmentation. Code available.
RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation (Dec 3, 2024). Tasks: Referring Expression, Referring Expression Segmentation. Code available.
SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation (Jun 3, 2024). Tasks: Pseudo Label, Referring Expression. Code available.
SeqTR: A Simple yet Universal Network for Visual Grounding (Mar 30, 2022). Tasks: Decoder, Referring Expression. Code available.
SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation (May 26, 2023). Tasks: cross-modal alignment, Object. Code available.
Spectrum-guided Multi-granularity Referring Video Object Segmentation (Jul 25, 2023). Tasks: Object, Referring Expression Segmentation. Code available.
SynthRef: Generation of Synthetic Referring Expressions for Object Segmentation (Jun 8, 2021). Tasks: Object, object-detection. Code available.
Temporally Consistent Referring Video Object Segmentation with Hybrid Memory (Mar 28, 2024). Tasks: HTR, Object. Code available.
Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation (Dec 13, 2023). Tasks: Descriptive, Object. Code available.
URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark (Aug 1, 2020). Tasks: Object, One-shot visual object segmentation. Code available.
ViLLa: Video Reasoning Segmentation with Large Language Model (Jul 18, 2024). Tasks: Image Segmentation, Language Modeling. Code available.
Vision-Language Transformer and Query Generation for Referring Segmentation (Aug 12, 2021). Tasks: Decoder, Generalized Referring Expression Comprehension. Code available.
3D-GRES: Generalized 3D Referring Expression Segmentation (Jul 30, 2024). Tasks: Object, Referring Expression. Code available.
Segmentation from Natural Language Expressions (Mar 20, 2016). Tasks: Referring Expression Segmentation, Segmentation. Code available.
CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions (Jan 3, 2019). Tasks: Diagnostic, Image Segmentation. Code available.
MAttNet: Modular Attention Network for Referring Expression Comprehension (Jan 24, 2018). Tasks: Generalized Referring Expression Segmentation, Referring Expression. Code available.
InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation (Nov 30, 2023). Tasks: Image Captioning, Referring Expression. Code available.
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling (Jan 21, 2025). Tasks: Object Tracking, Referring Expression Segmentation. Code available.
Asymmetric Cross-Guided Attention Network for Actor and Action Video Segmentation From Natural Language Query (Oct 1, 2019). Tasks: Referring Expression Segmentation, Segmentation. Code available.
Comprehensive Multi-Modal Interactions for Referring Image Segmentation (Apr 21, 2021). Tasks: Image Segmentation, Referring Expression Segmentation. Code available.
Referring Expression Object Segmentation with Caption-Aware Consistency (Oct 10, 2019). Tasks: Caption Generation, Object. Code available.
Towards Unified Referring Expression Segmentation Across Omni-Level Visual Target Granularities (Apr 2, 2025). Tasks: Descriptive, Large Language Model. Code available.
Exploring Modulated Detection Transformer as a Tool for Action Recognition in Videos (Sep 21, 2022). Tasks: Action Detection, Action Recognition. Code available.
Learning To Segment Every Referring Object Point by Point (Jan 1, 2023). Tasks: Object, Referring Expression. Code available.
Cross-Modal Self-Attention Network for Referring Image Segmentation (Apr 9, 2019). Tasks: Image Segmentation, Referring Expression. Code available.
Bring Adaptive Binding Prototypes to Generalized Referring Expression Segmentation (May 24, 2024). Tasks: Decoder, Generalized Referring Expression Segmentation. Code available.
Expression Prompt Collaboration Transformer for Universal Referring Video Object Segmentation (Aug 8, 2023). Tasks: Contrastive Learning, Object. Code available.
Modulating Bottom-Up and Top-Down Visual Processing via Language-Conditional Filters (Mar 28, 2020). Tasks: Colorization, Image Colorization. Code available.
Referring Image Segmentation via Recurrent Refinement Networks (Jun 1, 2018). Tasks: Image Segmentation, Referring Expression Segmentation. Code available.
Towards Omni-supervised Referring Expression Segmentation (Nov 1, 2023). Tasks: Referring Expression, Referring Expression Segmentation. Code available.
Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (Apr 12, 2024). Tasks: Decoder, Image Segmentation. Code available.