| Language-Conditioned Graph Networks for Relational Reasoning | May 10, 2019 | Object, Referring Expression Comprehension | Code Available | 0 | 5 |
| Language-Conditioned Feature Pyramids for Visual Selection Tasks | Nov 1, 2020 | Referring Expression, Referring Expression Comprehension | Code Available | 0 | 5 |
| Language Adaptive Weight Generation for Multi-task Visual Grounding | Jun 6, 2023 | Referring Expression, Referring Expression Comprehension | Code Available | 0 | 5 |
| Collecting Visually-Grounded Dialogue with A Game Of Sorts | Sep 10, 2023 | Coreference Resolution, Image Retrieval | Code Available | 0 | 5 |
| Towards Language-guided Visual Recognition via Dynamic Convolutions | Oct 17, 2021 | Question Answering, Referring Expression | Code Available | 0 | 5 |
| HuBo-VLM: Unified Vision-Language Model designed for HUman roBOt interaction tasks | Aug 24, 2023 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| A Joint Speaker-Listener-Reinforcer Model for Referring Expressions | Dec 30, 2016 | Referring Expression, Referring Expression Comprehension | Code Available | 0 | 5 |
| CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions | Jan 3, 2019 | Diagnostic, Image Segmentation | Code Available | 0 | 5 |
| A Real-time Global Inference Network for One-stage Referring Expression Comprehension | Dec 7, 2019 | Diversity, feature selection | Code Available | 0 | 5 |
| Cosine meets Softmax: A tough-to-beat baseline for visual grounding | Sep 13, 2020 | Autonomous Driving, Metric Learning | Code Available | 0 | 5 |