| Revisiting Counterfactual Problems in Referring Expression Comprehension | Jan 1, 2024 | Attribute, Contrastive Learning | Code Available | 0 |
| MAttNet: Modular Attention Network for Referring Expression Comprehension | Jan 24, 2018 | Generalized Referring Expression Segmentation, Referring Expression | Code Available | 0 |
| Language-Conditioned Feature Pyramids for Visual Selection Tasks | Nov 1, 2020 | Referring Expression, Referring Expression Comprehension | Code Available | 0 |
| CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions | Jan 3, 2019 | Diagnostic, Image Segmentation | Code Available | 0 |
| Exploring Modulated Detection Transformer as a Tool for Action Recognition in Videos | Sep 21, 2022 | Action Detection, Action Recognition | Code Available | 0 |
| Scene-Text Oriented Referring Expression Comprehension | Nov 4, 2022 | Object Localization, Referring Expression | Code Available | 0 |
| Collecting Visually-Grounded Dialogue with A Game Of Sorts | Sep 10, 2023 | Coreference Resolution, Image Retrieval | Code Available | 0 |
| HuBo-VLM: Unified Vision-Language Model designed for HUman roBOt interaction tasks | Aug 24, 2023 | Language Modeling, Language Modelling | Code Available | 0 |
| Adversarial Robustness for Visual Grounding of Multimodal Large Language Models | May 16, 2024 | Adversarial Attack, Adversarial Robustness | Code Available | 0 |
| A Joint Speaker-Listener-Reinforcer Model for Referring Expressions | Dec 30, 2016 | Referring Expression, Referring Expression Comprehension | Code Available | 0 |
| Understanding Synonymous Referring Expressions via Contrastive Features | Apr 20, 2021 | Object, Referring Expression | Code Available | 0 |
| CK-Transformer: Commonsense Knowledge Enhanced Transformers for Referring Expression Comprehension | Feb 17, 2023 | Referring Expression, Referring Expression Comprehension | Code Available | 0 |
| Natural Language Object Retrieval | Nov 13, 2015 | Image Captioning, Image Retrieval | Code Available | 0 |
| Language-Conditioned Graph Networks for Relational Reasoning | May 10, 2019 | Object, Referring Expression Comprehension | Code Available | 0 |
| A Real-time Global Inference Network for One-stage Referring Expression Comprehension | Dec 7, 2019 | Diversity, feature selection | Code Available | 0 |
| OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework | Feb 7, 2022 | Image Captioning, image-classification | Code Available | 0 |
| Cosine meets Softmax: A tough-to-beat baseline for visual grounding | Sep 13, 2020 | Autonomous Driving, Metric Learning | Code Available | 0 |