| Beyond Object Categories: Multi-Attribute Reference Understanding for Visual Grounding | Mar 25, 2025 | Attribute, Object | —Unverified | 0 | 0 |
| Dynamic Graph Attention for Referring Expression Comprehension | Sep 18, 2019 | Graph Attention, Referring Expression | —Unverified | 0 | 0 |
| Dynamic Inference With Grounding Based Vision and Language Models | Jan 1, 2023 | Language Modelling, Referring Expression | —Unverified | 0 | 0 |
| DViN: Dynamic Visual Routing Network for Weakly Supervised Referring Expression Comprehension | Jan 1, 2025 | Descriptive, Referring Expression | —Unverified | 0 | 0 |
| Differentiated Relevances Embedding for Group-based Referring Expression Comprehension | Mar 12, 2022 | Attribute, Object | —Unverified | 0 | 0 |
| ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph | Jun 30, 2020 | Attribute, Prediction | —Unverified | 0 | 0 |
| Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input | Jun 25, 2023 | Diversity, Image-text Retrieval | —Unverified | 0 | 0 |
| Exploring Spatial Language Grounding Through Referring Expressions | Feb 4, 2025 | Image Captioning, Negation | —Unverified | 0 | 0 |
| FindIt: Generalized Localization with Natural Language Queries | Mar 31, 2022 | Natural Language Queries, Object | —Unverified | 0 | 0 |
| Switching Head-Tail Funnel UNITER for Dual Referring Expression Comprehension with Fetch-and-Carry Tasks | Jul 14, 2023 | Object, Referring Expression | —Unverified | 0 | 0 |