DeRIS: Decoupling Perception and Cognition for Enhanced Referring Image Segmentation through Loopback Synergy | Jul 2, 2025 | Data Augmentation, Generalized Referring Expression Segmentation | Code Available | 1
Mask-aware Text-to-Image Retrieval: Referring Expression Segmentation Meets Cross-modal Retrieval | Jun 28, 2025 | Cross-Modal Retrieval, Image Captioning | Unverified | 0
Refer to Anything with Vision-Language Prompts | Jun 5, 2025 | Benchmarking, Generalized Referring Expression Segmentation | Unverified | 0
RemoteSAM: Towards Segment Anything for Earth Observation | May 23, 2025 | Attribute, Earth Observation | Code Available | 3
VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning | May 17, 2025 | 2D Object Detection, Object Counting | Code Available | 4
RESAnything: Attribute Prompting for Arbitrary Referring Segmentation | May 3, 2025 | Attribute, Image Segmentation | Unverified | 0
3DResT: A Strong Baseline for Semi-Supervised 3D Referring Expression Segmentation | Apr 17, 2025 | Referring Expression, Referring Expression Segmentation | Unverified | 0
Towards Unified Referring Expression Segmentation Across Omni-Level Visual Target Granularities | Apr 2, 2025 | Descriptive, Large Language Model | Code Available | 0
GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding | Mar 13, 2025 | Diversity, Language Modeling | Code Available | 2
SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories | Mar 11, 2025 | Decision Making, Interactive Segmentation | Code Available | 2
Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement | Mar 9, 2025 | Domain Generalization, Object Detection | Code Available | 4
PixFoundation: Are We Heading in the Right Direction with Pixel-level Vision Foundation Models? | Feb 6, 2025 | Question Answering, Referring Expression | Code Available | 1
ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations | Jan 24, 2025 | Decoder, Object | Unverified | 0
MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation | Jan 23, 2025 | Referring Expression Segmentation, Referring Video Object Segmentation | Code Available | 1
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling | Jan 21, 2025 | Object Tracking, Referring Expression Segmentation | Code Available | 0
Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation | Jan 15, 2025 | Image Segmentation, Referring Expression Segmentation | Code Available | 2
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation | Jan 15, 2025 | Reasoning Segmentation, Referring Expression Segmentation | Code Available | 2
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints | Jan 12, 2025 | Image Segmentation, Referring Expression | Code Available | 1
IPDN: Image-enhanced Prompt Decoding Network for 3D Referring Expression Segmentation | Jan 9, 2025 | Decoder, Referring Expression | Code Available | 1
Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension | Jan 2, 2025 | Generalized Referring Expression Comprehension, Generalized Referring Expression Segmentation | Unverified | 0
DViN: Dynamic Visual Routing Network for Weakly Supervised Referring Expression Comprehension | Jan 1, 2025 | Descriptive, Referring Expression | Unverified | 0
Task-aware Cross-modal Feature Refinement Transformer with Large Language Models for Visual Grounding | Jan 1, 2025 | Referring Expression, Referring Expression Comprehension | Unverified | 0
RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation | Dec 3, 2024 | Referring Expression, Referring Expression Segmentation | Code Available | 1
MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation | Nov 28, 2024 | Data Augmentation, Image Segmentation | Code Available | 1
HyperSeg: Towards Universal Visual Segmentation with Large Language Model | Nov 26, 2024 | Language Modeling, Large Language Model | Code Available | 2
Instance-Aware Generalized Referring Expression Segmentation | Nov 22, 2024 | Generalized Referring Expression Segmentation, Object | Unverified | 0
SegLLM: Multi-round Reasoning Segmentation | Oct 24, 2024 | Reasoning Segmentation, Referring Expression | Unverified | 0
Text4Seg: Reimagining Image Segmentation as Text Generation | Oct 13, 2024 | Image Segmentation, Referring Expression | Code Available | 2
SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation | Sep 1, 2024 | Language Modeling, Language Modelling | Code Available | 2
3D-GRES: Generalized 3D Referring Expression Segmentation | Jul 30, 2024 | Object, Referring Expression | Code Available | 1
Multi-label Cluster Discrimination for Visual Representation Learning | Jul 24, 2024 | Contrastive Learning, Image-text Retrieval | Code Available | 4
ViLLa: Video Reasoning Segmentation with Large Language Model | Jul 18, 2024 | Image Segmentation, Language Modeling | Code Available | 1
SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation | Jul 2, 2024 | Referring Expression, Referring Expression Segmentation | Unverified | 0
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model | Jun 28, 2024 | Interactive Segmentation, Language Modeling | Code Available | 3
GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation | Jun 18, 2024 | Contrastive Learning, Object | Unverified | 0
F-LMM: Grounding Frozen Large Multimodal Models | Jun 9, 2024 | General Knowledge, Instruction Following | Code Available | 2
SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation | Jun 3, 2024 | Pseudo Label, Referring Expression | Code Available | 1
GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane | May 27, 2024 | 3DGS, feature selection | Unverified | 0
Bring Adaptive Binding Prototypes to Generalized Referring Expression Segmentation | May 24, 2024 | Decoder, Generalized Referring Expression Segmentation | Code Available | 0
CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation | May 24, 2024 | Generalized Referring Expression Segmentation, Object | Code Available | 1
Harnessing Vision-Language Pretrained Models with Temporal-Aware Adaptation for Referring Video Object Segmentation | May 17, 2024 | Referring Expression Segmentation, Referring Video Object Segmentation | Unverified | 0
Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding | Apr 12, 2024 | Decoder, Image Segmentation | Code Available | 0
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation | Apr 4, 2024 | Contrastive Learning, Referring Expression | Code Available | 2
Temporally Consistent Referring Video Object Segmentation with Hybrid Memory | Mar 28, 2024 | HTR, Object | Code Available | 1
PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | Mar 21, 2024 | Decoder, Generalized Referring Expression Segmentation | Code Available | 3
UniVS: Unified and Universal Video Segmentation with Prompts as Queries | Feb 28, 2024 | Decoder, Referring Expression Segmentation | Code Available | 3
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | Feb 26, 2024 | Causal Language Modeling, Generalized Referring Expression Segmentation | Unverified | 0
RESMatch: Referring Expression Segmentation in a Semi-Supervised Manner | Feb 8, 2024 | Image Segmentation, Pseudo Label | Unverified | 0
Generalizable Entity Grounding via Assistance of Large Language Model | Feb 4, 2024 | Language Modeling, Language Modelling | Unverified | 0
Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation | Jan 1, 2024 | Descriptive, Object | Code Available | 2