| A Unified Framework for 3D Point Cloud Visual Grounding | Aug 23, 2023 | CPU, GPU | Code Available | 1 |
| Explainable Neural Computation via Stack Neural Module Networks | Jul 23, 2018 | Decision Making, Question Answering | Code Available | 1 |
| Compositional Attention Networks for Machine Reasoning | Mar 8, 2018 | Referring Expression Comprehension, Visual Question Answering (VQA) | Code Available | 1 |
| MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension | Sep 20, 2024 | cross-modal alignment, Referring Expression | Code Available | 1 |
| MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding | Apr 26, 2021 | Generalized Referring Expression Comprehension, Phrase Grounding | Code Available | 1 |
| LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension | Sep 18, 2024 | Referring Expression, Referring Expression Comprehension | Code Available | 1 |
| Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations | Jun 30, 2022 | Language Modeling, Language Modelling | Code Available | 1 |
| InstructDET: Diversifying Referring Object Detection with Generalized Instructions | Oct 8, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| An Open and Comprehensive Pipeline for Unified Object Grounding and Detection | Jan 4, 2024 | Described Object Detection, Phrase Grounding | Code Available | 1 |
| Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds | Dec 16, 2021 | Object, object-detection | Code Available | 1 |