| Scene-Text Oriented Referring Expression Comprehension | Nov 4, 2022 | Object Localization, Referring Expression | Code Available | 0 |
| Assessing Neural Referential Form Selectors on a Realistic Multilingual Dataset | Oct 10, 2022 | Form, Referring Expression | Unverified | 0 |
| Video Referring Expression Comprehension via Transformer with Content-aware Query | Oct 6, 2022 | cross-modal alignment, Referring Expression | Unverified | 0 |
| Enhancing Interpretability and Interactivity in Robot Manipulation: A Neurosymbolic Approach | Oct 3, 2022 | Referring Expression, Robot Manipulation | Code Available | 0 |
| Exploring Modulated Detection Transformer as a Tool for Action Recognition in Videos | Sep 21, 2022 | Action Detection, Action Recognition | Code Available | 0 |
| One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning | Jul 31, 2022 | All, Referring Expression | Unverified | 0 |
| Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding | Jul 18, 2022 | Attribute, Referring Expression | Code Available | 0 |
| Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks | Jun 17, 2022 | Depth Estimation, Image Generation | Unverified | 0 |
| RefCrowd: Grounding the Target in Crowd with Referring Expressions | Jun 16, 2022 | Attribute, Referring Expression | Unverified | 0 |
| Constructing Distributions of Variation in Referring Expression Type from Corpora for Model Evaluation | Jun 1, 2022 | Referring Expression | Unverified | 0 |
| Referring Expressions with Rational Speech Act Framework: A Probabilistic Approach | May 16, 2022 | Deep Learning, Referring Expression | Unverified | 0 |
| Weakly-supervised segmentation of referring expressions | May 10, 2022 | Image Segmentation, Referring Expression | Unverified | 0 |
| HOLM: Hallucinating Objects with Language Models for Referring Expression Recognition in Partially-Observed Scenes | May 1, 2022 | Referring Expression | Unverified | 0 |
| Self-paced Multi-grained Cross-modal Interaction Modeling for Referring Expression Comprehension | Apr 21, 2022 | Diversity, Informativeness | Unverified | 0 |
| FindIt: Generalized Localization with Natural Language Queries | Mar 31, 2022 | Natural Language Queries, Object | Unverified | 0 |
| Single-Stream Multi-Level Alignment for Vision-Language Pretraining | Mar 27, 2022 | Image-text Retrieval, Question Answering | Code Available | 0 |
| Non-neural Models Matter: A Re-evaluation of Neural Referring Expression Generation Systems | Mar 15, 2022 | BIG-bench Machine Learning, Referring Expression | Unverified | 0 |
| Differentiated Relevances Embedding for Group-based Referring Expression Comprehension | Mar 12, 2022 | Attribute, Object | Unverified | 0 |
| OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework | Feb 7, 2022 | Image Captioning, image-classification | Code Available | 0 |
| Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching | Jan 18, 2022 | Image-text matching, Referring Expression | Unverified | 0 |
| Lite-MDETR: A Lightweight Multi-Modal Detector | Jan 1, 2022 | object-detection, Object Detection | Unverified | 0 |
| Deconfounded Visual Grounding | Dec 31, 2021 | Referring Expression, Visual Grounding | Code Available | 0 |
| Robust Visual Reasoning via Language Guided Neural Module Networks | Dec 1, 2021 | Question Answering, Referring Expression | Unverified | 0 |
| Using Referring Expression Generation to Model Literary Style | Dec 1, 2021 | model, Referring Expression | Unverified | 0 |
| ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension | Nov 16, 2021 | image-classification, Image Classification | Unverified | 0 |
| The Pipeline Model for Resolution of Anaphoric Reference and Resolution of Entity Reference | Nov 1, 2021 | coreference-resolution, Coreference Resolution | Unverified | 0 |
| Evaluating and Improving Interactions with Hazy Oracles | Oct 19, 2021 | Object Tracking, Referring Expression | Unverified | 0 |
| Towards Language-guided Visual Recognition via Dynamic Convolutions | Oct 17, 2021 | Question Answering, Referring Expression | Code Available | 0 |
| Decoupling Pragmatics: Discriminative Decoding for Referring Expression Generation | Oct 1, 2021 | Diversity, Image Captioning | Unverified | 0 |
| Does referent predictability affect the choice of referential form? A computational approach using masked coreference resolution | Sep 27, 2021 | coreference-resolution, Coreference Resolution | Code Available | 0 |
| Goal-driven text descriptions for images | Aug 28, 2021 | AI Agent, Caption Generation | Unverified | 0 |
| What can Neural Referential Form Selectors Learn? | Aug 15, 2021 | Form, Position | Unverified | 0 |
| Enriching the E2E dataset | Aug 1, 2021 | Referring Expression, Referring expression generation | Code Available | 0 |
| VLN↻BERT: A Recurrent Vision-and-Language BERT for Navigation | Jun 19, 2021 | Decision Making, Decoder | Unverified | 0 |
| Bridging the Gap Between Object Detection and User Intent via Query-Modulation | Jun 18, 2021 | Object, object-detection | Unverified | 0 |
| Giving Commands to a Self-Driving Car: How to Deal with Uncertain Situations? | Jun 8, 2021 | Referring Expression, Self-Driving Cars | Code Available | 0 |
| Learning Better Visual Dialog Agents with Pretrained Visual-Linguistic Representation | May 24, 2021 | Referring Expression, Referring Expression Comprehension | Code Available | 0 |
| VL-NMS: Breaking Proposal Bottlenecks in Two-Stage Visual-Language Matching | May 12, 2021 | Image-text matching, Referring Expression | Unverified | 0 |
| Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention | May 5, 2021 | Question Answering, Referring Expression | Unverified | 0 |
| Playing Lottery Tickets with Vision and Language | Apr 23, 2021 | Image-text Retrieval, Question Answering | Unverified | 0 |
| Understanding Synonymous Referring Expressions via Contrastive Features | Apr 20, 2021 | Object, Referring Expression | Code Available | 0 |
| Perspective-corrected Spatial Referring Expression Generation for Human-Robot Interaction | Apr 4, 2021 | Diversity, Referring Expression | Unverified | 0 |
| Scene-Intuitive Agent for Remote Embodied Visual Grounding | Mar 24, 2021 | cross-modal alignment, Navigate | Unverified | 0 |
| Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos | Mar 23, 2021 | Referring Expression, Referring Expression Comprehension | Unverified | 0 |
| Referring Segmentation in Images and Videos with Cross-Modal Self-Attention Network | Feb 9, 2021 | Referring Expression, Referring Expression Segmentation | Unverified | 0 |
| Visual Question Answering based on Local-Scene-Aware Referring Expression Generation | Jan 22, 2021 | Question Answering, Referring Expression | Unverified | 0 |
| Language Controls More Than Top-Down Attention: Modulating Bottom-Up Visual Processing with Referring Expressions | Jan 1, 2021 | Referring Expression | Unverified | 0 |
| Language-Mediated, Object-Centric Representation Learning | Dec 31, 2020 | Object, Object Discovery | Unverified | 0 |
| PPGN: Phrase-Guided Proposal Generation Network For Referring Expression Comprehension | Dec 20, 2020 | Referring Expression, Referring Expression Comprehension | Unverified | 0 |
| CoNAN: A Complementary Neighboring-based Attention Network for Referring Expression Generation | Dec 1, 2020 | Object, Referring Expression | Unverified | 0 |