| Scene-Intuitive Agent for Remote Embodied Visual Grounding | Mar 24, 2021 | cross-modal alignment, Navigate | —Unverified | 0 |
| See-Through-Text Grouping for Referring Image Segmentation | Oct 1, 2019 | Image Segmentation, object-detection | —Unverified | 0 |
| SegLLM: Multi-round Reasoning Segmentation | Oct 24, 2024 | Reasoning Segmentation, Referring Expression | —Unverified | 0 |
| Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO | Jun 27, 2024 | Image Segmentation, Medical Image Segmentation | —Unverified | 0 |
| Semi-automatic definite description annotation: a first report | Dec 24, 2017 | Referring Expression, Referring expression generation | —Unverified | 0 |
| SemScribe: Natural Language Generation for Medical Reports | May 1, 2012 | Referring Expression, Referring expression generation | —Unverified | 0 |
| Specificity measures and reference | Sep 30, 2018 | Referring Expression, Specificity | —Unverified | 0 |
| Squib: Effects of Cognitive Effort on the Resolution of Overspecified Descriptions | Jun 1, 2017 | Referring Expression, Referring expression generation | —Unverified | 0 |
| Statistical NLG for Generating the Content and Form of Referring Expressions | Nov 1, 2018 | Attribute, Form | —Unverified | 0 |
| SUGAR: Pre-training 3D Visual Representations for Robotics | Apr 1, 2024 | 3D Instance Segmentation, 3D Object Recognition | —Unverified | 0 |
| Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input | Jun 25, 2023 | Diversity, Image-text Retrieval | —Unverified | 0 |
| Switching Head-Tail Funnel UNITER for Dual Referring Expression Comprehension with Fetch-and-Carry Tasks | Jul 14, 2023 | Object, Referring Expression | —Unverified | 0 |
| Synthetic Visual Genome | Jun 9, 2025 | Referring Expression, Referring Expression Comprehension | —Unverified | 0 |
| Task-aware Cross-modal Feature Refinement Transformer with Large Language Models for Visual Grounding | Jan 1, 2025 | Referring Expression, Referring Expression Comprehension | —Unverified | 0 |
| Text Augmented Spatial-aware Zero-shot Referring Image Segmentation | Oct 27, 2023 | Image Segmentation, Referring Expression | —Unverified | 0 |
| Text-driven Affordance Learning from Egocentric Vision | Apr 3, 2024 | Referring Expression, Referring Expression Comprehension | —Unverified | 0 |
| The Methodius Corpus of Rhetorical Discourse Structures and Generated Texts | May 1, 2016 | Referring Expression, Referring expression generation | —Unverified | 0 |
| The Pipeline Model for Resolution of Anaphoric Reference and Resolution of Entity Reference | Nov 1, 2021 | coreference-resolution, Coreference Resolution | —Unverified | 0 |
| The Solution for the 5th GCAIAC Zero-shot Referring Expression Comprehension Challenge | Jul 6, 2024 | Referring Expression, Referring Expression Comprehension | —Unverified | 0 |
| The WebNLG Challenge: Generating Text from RDF Data | Sep 1, 2017 | Referring Expression, Referring expression generation | —Unverified | 0 |
| Toward Forgetting-Sensitive Referring Expression Generation for Integrated Robot Architectures | Jul 16, 2020 | Referring Expression, Referring expression generation | —Unverified | 0 |
| Towards Situated Dialogue: Revisiting Referring Expression Generation | Oct 1, 2013 | Referring Expression, Referring expression generation | —Unverified | 0 |
| Trainable Referring Expression Generation using Overspecification Preferences | Apr 12, 2017 | Referring Expression, Referring expression generation | —Unverified | 0 |
| Transcrib3D: 3D Referring Expression Resolution through Large Language Models | Apr 30, 2024 | Referring Expression | —Unverified | 0 |
| Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks | Jun 17, 2022 | Depth Estimation, Image Generation | —Unverified | 0 |
| UNITER: Learning UNiversal Image-TExt Representations | Sep 25, 2019 | Image-text matching, Image-text Retrieval | —Unverified | 0 |
| Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching | Jan 18, 2022 | Image-text matching, Referring Expression | —Unverified | 0 |
| Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos | Mar 7, 2017 | Referring Expression | —Unverified | 0 |
| Using Lexical Alignment and Referring Ability to Address Data Sparsity in Situated Dialog Reference Resolution | Oct 1, 2018 | Referring Expression | —Unverified | 0 |
| Using Referring Expression Generation to Model Literary Style | Dec 1, 2021 | model, Referring Expression | —Unverified | 0 |
| Utilizing Every Image Object for Semi-supervised Phrase Grounding | Nov 5, 2020 | Phrase Grounding, Referring Expression | —Unverified | 0 |
| Variational Context: Exploiting Visual and Textual Context for Grounding Referring Expressions | Jul 8, 2019 | Multiple Instance Learning, Referring Expression | —Unverified | 0 |
| Video Referring Expression Comprehension via Transformer with Content-aware Query | Oct 6, 2022 | cross-modal alignment, Referring Expression | —Unverified | 0 |
| Video Referring Expression Comprehension via Transformer with Content-conditioned Query | Oct 25, 2023 | cross-modal alignment, Referring Expression | —Unverified | 0 |
| Viewpoint-Aware Visual Grounding in 3D Scenes | Jan 1, 2024 | 3D visual grounding, Referring Expression | —Unverified | 0 |
| Visual Question Answering based on Local-Scene-Aware Referring Expression Generation | Jan 22, 2021 | Question Answering, Referring Expression | —Unverified | 0 |
| VLN BERT: A Recurrent Vision-and-Language BERT for Navigation | Jun 19, 2021 | Decision Making, Decoder | —Unverified | 0 |
| VL-NMS: Breaking Proposal Bottlenecks in Two-Stage Visual-Language Matching | May 12, 2021 | Image-text matching, Referring Expression | —Unverified | 0 |
| VQD: Visual Query Detection in Natural Scenes | Apr 4, 2019 | Referring Expression, Referring Expression Comprehension | —Unverified | 0 |
| WaterVG: Waterway Visual Grounding based on Text-Guided Vision and mmWave Radar | Mar 19, 2024 | Autonomous Navigation, Referring Expression | —Unverified | 0 |
| Weakly-supervised segmentation of referring expressions | May 10, 2022 | Image Segmentation, Referring Expression | —Unverified | 0 |
| What can Neural Referential Form Selectors Learn? | Aug 15, 2021 | Form, Position | —Unverified | 0 |
| ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension | Nov 16, 2021 | image-classification, Image Classification | —Unverified | 0 |
| Recurrent Instance Segmentation using Sequences of Referring Expressions | Nov 5, 2019 | Instance Segmentation, Referring Expression | —Unverified | 0 |
| RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension | Jan 1, 2023 | Referring Expression, Referring Expression Comprehension | —Unverified | 0 |
| RefCrowd: Grounding the Target in Crowd with Referring Expressions | Jun 16, 2022 | Attribute, Referring Expression | —Unverified | 0 |
| RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions | Jun 3, 2025 | Referring Expression, Synthetic Data Generation | —Unverified | 0 |
| Reference production in human-computer interaction: Issues for Corpus-based Referring Expression Generation | May 1, 2018 | Referring Expression, Referring expression generation | —Unverified | 0 |
| Refer-iTTS: A System for Referring in Spoken Installments to Objects in Real-World Images | Sep 1, 2017 | Referring Expression, Referring expression generation | —Unverified | 0 |
| Reasoning About Pragmatics with Neural Listeners and Speakers | Apr 2, 2016 | Referring Expression, Text Generation | Code Available | 0 |