| RefCrowd: Grounding the Target in Crowd with Referring Expressions | Jun 16, 2022 | AttributeReferring Expression | —Unverified | 0 | 0 |
| WaterVG: Waterway Visual Grounding based on Text-Guided Vision and mmWave Radar | Mar 19, 2024 | Autonomous NavigationReferring Expression | —Unverified | 0 | 0 |
| Referring Expression Comprehension: A Survey of Methods and Datasets | Jul 19, 2020 | object-detectionObject Detection | —Unverified | 0 | 0 |
| Self-paced Multi-grained Cross-modal Interaction Modeling for Referring Expression Comprehension | Apr 21, 2022 | DiversityInformativeness | —Unverified | 0 | 0 |
| Referring Expression Instance Retrieval and A Strong End-to-End Baseline | Jun 23, 2025 | Image RetrievalReferring Expression | —Unverified | 0 | 0 |
| Video Referring Expression Comprehension via Transformer with Content-aware Query | Oct 6, 2022 | cross-modal alignmentReferring Expression | —Unverified | 0 | 0 |
| RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension | Jan 1, 2023 | Imitation LearningPseudo Label | —Unverified | 0 | 0 |
| Revisiting Multi-Modal LLM Evaluation | Aug 9, 2024 | Chart UnderstandingOptical Character Recognition | —Unverified | 0 | 0 |
| ScanFormer: Referring Expression Comprehension by Iteratively Scanning | Jun 26, 2024 | InformativenessReferring Expression | —Unverified | 0 | 0 |
| Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO | Jun 27, 2024 | Image SegmentationMedical Image Segmentation | —Unverified | 0 | 0 |
| Beyond Object Categories: Multi-Attribute Reference Understanding for Visual Grounding | Mar 25, 2025 | AttributeObject | —Unverified | 0 | 0 |
| Dynamic Graph Attention for Referring Expression Comprehension | Sep 18, 2019 | Graph AttentionReferring Expression | —Unverified | 0 | 0 |
| Dynamic Inference With Grounding Based Vision and Language Models | Jan 1, 2023 | Language ModellingReferring Expression | —Unverified | 0 | 0 |
| DViN: Dynamic Visual Routing Network for Weakly Supervised Referring Expression Comprehension | Jan 1, 2025 | DescriptiveReferring Expression | —Unverified | 0 | 0 |
| Differentiated Relevances Embedding for Group-based Referring Expression Comprehension | Mar 12, 2022 | AttributeObject | —Unverified | 0 | 0 |
| ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph | Jun 30, 2020 | AttributePrediction | —Unverified | 0 | 0 |
| Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input | Jun 25, 2023 | DiversityImage-text Retrieval | —Unverified | 0 | 0 |
| Exploring Spatial Language Grounding Through Referring Expressions | Feb 4, 2025 | Image CaptioningNegation | —Unverified | 0 | 0 |
| FindIt: Generalized Localization with Natural Language Queries | Mar 31, 2022 | Natural Language QueriesObject | —Unverified | 0 | 0 |
| Switching Head-Tail Funnel UNITER for Dual Referring Expression Comprehension with Fetch-and-Carry Tasks | Jul 14, 2023 | ObjectReferring Expression | —Unverified | 0 | 0 |
| FLORA: Formal Language Model Enables Robust Training-free Zero-shot Object Referring Analysis | Jan 17, 2025 | Bayesian InferenceLanguage Modeling | —Unverified | 0 | 0 |
| Deep Fragment Embeddings for Bidirectional Image Sentence Mapping | Jun 22, 2014 | Referring Expression ComprehensionRetrieval | —Unverified | 0 | 0 |
| CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding | Nov 6, 2023 | CoLAQuestion Answering | —Unverified | 0 | 0 |
| Synthetic Visual Genome | Jun 9, 2025 | Referring ExpressionReferring Expression Comprehension | —Unverified | 0 | 0 |
| GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing | Mar 16, 2025 | Change DetectionImage Captioning | —Unverified | 0 | 0 |