| MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning | Oct 14, 2023 | Image Classification; Image Description | Code Available | 7 | 5 |
| Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models | Mar 27, 2024 | Image Classification; Image Comprehension | Code Available | 7 | 5 |
| Visual Instruction Tuning | Apr 17, 2023 | 1 Image, 2*2 Stitching; 3D Question Answering (3D-QA) | Code Available | 6 | 5 |
| Improved Baselines with Visual Instruction Tuning | Oct 5, 2023 | Factual Inconsistency Detection in Chart Captioning; Image Classification | Code Available | 6 | 5 |
| Efficient Multimodal Learning from Data-centric Perspective | Feb 18, 2024 | Image Classification; Referring Expression Comprehension | Code Available | 5 | 5 |
| LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day | Jun 1, 2023 | Image Classification; Instruction Following | Code Available | 4 | 5 |
| MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices | Dec 28, 2023 | AutoML; CPU | Code Available | 3 | 5 |
| Frontiers in Intelligent Colonoscopy | Oct 22, 2024 | Image Captioning | Code Available | 2 | 5 |
| GLaMM: Pixel Grounding Large Multimodal Model | Nov 6, 2023 | Conversational Question Answering; Image Captioning | Code Available | 2 | 5 |
| Elysium: Exploring Object-level Perception in Videos via MLLM | Mar 25, 2024 | Object; Object Tracking | Code Available | 2 | 5 |
| Kosmos-2: Grounding Multimodal Large Language Models to the World | Jun 26, 2023 | Image Captioning; In-Context Learning | Code Available | 1 | 5 |
| Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE | Sep 26, 2024 | image-classification; Image Classification | Code Available | 1 | 5 |
| Modeling Context in Referring Expressions | Jul 31, 2016 | Referring Expression; Referring expression generation | Code Available | 1 | 5 |
| Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception | Mar 5, 2024 | Language Modeling; Language Modelling | Code Available | 1 | 5 |
| Improving Quality and Efficiency in Plan-based Neural Data-to-Text Generation | Sep 22, 2019 | Data-to-Text Generation; Referring Expression | Code Available | 0 | 5 |
| Collecting Visually-Grounded Dialogue with A Game Of Sorts | Sep 10, 2023 | Coreference Resolution; Image Retrieval | Code Available | 0 | 5 |
| Enriching the E2E dataset | Aug 1, 2021 | Referring Expression; Referring expression generation | Code Available | 0 | 5 |
| Enriching the WebNLG corpus | Nov 1, 2018 | Machine Translation; Referring Expression | Code Available | 0 | 5 |
| Grounding Language in Multi-Perspective Referential Communication | Oct 4, 2024 | Referring Expression; Referring expression generation | Code Available | 0 | 5 |
| NeuralREG: An end-to-end approach to referring expression generation | May 21, 2018 | Form; Referring Expression | Code Available | 0 | 5 |
| Pento-DIARef: A Diagnostic Dataset for Learning the Incremental Algorithm for Referring Expression Generation from Examples | May 24, 2023 | Diagnostic; Referring Expression | Code Available | 0 | 5 |
| Referring Expression Generation in Visually Grounded Dialogue with Discourse-aware Comprehension Guiding | Sep 9, 2024 | Image Retrieval; Referring Expression | Code Available | 0 | 5 |
| Referring Expression Generation Using Entity Profiles | Sep 4, 2019 | Referring Expression; Referring expression generation | Code Available | 0 | 5 |
| Resilience through Scene Context in Visual Referring Expression Generation | Apr 18, 2024 | Referring Expression; Referring expression generation | Code Available | 0 | 5 |
| Enhancing Visual Grounding and Generalization: A Multi-Task Cycle Training Approach for Vision-Language Models | Nov 21, 2023 | Image Segmentation; Language Modelling | Code Available | 0 | 5 |
| Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation | Apr 22, 2025 | Referring Expression; Referring expression generation | Code Available | 0 | 5 |
| Whether you can locate or not? Interactive Referring Expression Generation | Aug 19, 2023 | Referring Expression; Referring Expression Comprehension | Code Available | 0 | 5 |
| Geração de Expressões de Referência usando Relações Espaciais (Referring Expression Generation Using Spatial Relations) [in Portuguese] | Jan 1, 2013 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| CoNAN: A Complementary Neighboring-based Attention Network for Referring Expression Generation | Dec 1, 2020 | Object; Referring Expression | Unverified | 0 | 0 |
| Trainable Referring Expression Generation using Overspecification Preferences | Apr 12, 2017 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| G-TUNA: a corpus of referring expressions in German, including duration information | Sep 1, 2017 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| Comprehension-guided referring expressions | Jan 12, 2017 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| Improving the generation of personalised descriptions | Sep 1, 2017 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| Improving the Naturalness and Diversity of Referring Expression Generation models using Minimum Risk Training | Dec 1, 2020 | Diversity; Referring Expression | Unverified | 0 | 0 |
| Informativity in Image Captions vs. Referring Expressions | Jun 1, 2020 | Image Captioning; Object | Unverified | 0 | 0 |
| Intrinsic Task-based Evaluation for Referring Expression Generation | Feb 12, 2024 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| Justifying Corpus-Based Choices in Referring Expression Generation | Sep 1, 2013 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| Combining Referring Expression Generation and Surface Realization: A Corpus-Based Investigation of Architectures | Aug 1, 2013 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| Learning Distributions over Logical Forms for Referring Expression Generation | Oct 1, 2013 | Density Estimation; Referring Expression | Unverified | 0 | 0 |
| Learning Preferences for Referring Expression Generation: Effects of Domain, Language and Algorithm | May 1, 2012 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| Lessons from Computational Modelling of Reference Production in Mandarin and English | Nov 14, 2020 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| Building Multimodal Simulations for Natural Language | Apr 1, 2017 | Formal Logic; Referring Expression | Unverified | 0 | 0 |
| Meteorologists and Students: A resource for language grounding of geographical descriptors | Sep 7, 2018 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| Augmenting Robot Knowledge Consultants with Distributed Short Term Memory | Nov 26, 2018 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| Assessing Neural Referential Form Selectors on a Realistic Multilingual Dataset | Oct 10, 2022 | Form; Referring Expression | Unverified | 0 | 0 |
| A Predictive Model for Notional Anaphora in English | Apr 19, 2018 | coreference-resolution; Coreference Resolution | Unverified | 0 | 0 |
| An Incremental Iterated Response Model of Pragmatics | Sep 30, 2018 | model; Referring Expression | Unverified | 0 | 0 |
| MuDoCo: Corpus for Multidomain Coreference Resolution and Referring Expression Generation | May 1, 2020 | coreference-resolution; Coreference Resolution | Unverified | 0 | 0 |
| An Empirical Approach for Modeling Fuzzy Geographical Descriptors | Mar 30, 2017 | Referring Expression; Referring expression generation | Unverified | 0 | 0 |
| Adapting Descriptions of People to the Point of View of a Moving Observer | Nov 1, 2018 | Position; Referring Expression | Unverified | 0 | 0 |