| Bayesian Attention Belief Networks | Jun 9, 2021 | Decoder, Machine Translation | Unverified | 0 |
| Check It Again: Progressive Visual Question Answering via Visual Entailment | Jun 8, 2021 | Question Answering, Visual Entailment | Code Available | 1 |
| PAM: Understanding Product Images in Cross Product Category Attribute Extraction | Jun 8, 2021 | Attribute, Attribute Extraction | Unverified | 0 |
| Are VQA Systems RAD? Measuring Robustness to Augmented Data with Focused Interventions | Jun 8, 2021 | Question Answering, Visual Question Answering | Unverified | 0 |
| Human-Adversarial Visual Question Answering | Jun 4, 2021 | Question Answering, Visual Question Answering | Unverified | 0 |
| Grounding Complex Navigational Instructions Using Scene Graphs | Jun 3, 2021 | Question Answering, reinforcement-learning | Unverified | 0 |
| Semantic Aligned Multi-modal Transformer for Vision-Language Understanding: A Preliminary Study on Visual QA | Jun 1, 2021 | Question Answering, Visual Question Answering | Unverified | 0 |
| Learning to Select Question-Relevant Relations for Visual Question Answering | Jun 1, 2021 | Graph Attention, Question Answering | Unverified | 0 |
| CLEVR\_HYP: A Challenge Dataset and Baselines for Visual Question Answering with Hypothetical Actions over Images | Jun 1, 2021 | Question Answering, Visual Question Answering | Code Available | 0 |
| MIMOQA: Multimodal Input Multimodal Output Question Answering | Jun 1, 2021 | Question Answering, Visual Question Answering | Unverified | 0 |
| EaSe: A Diagnostic Tool for VQA based on Answer Diversity | Jun 1, 2021 | Diagnostic, Diversity | Code Available | 0 |
| Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models | Jun 1, 2021 | Data Augmentation, Question Answering | Unverified | 0 |
| LPF: A Language-Prior Feedback Objective Function for De-biased Visual Question Answering | May 29, 2021 | Question Answering, Visual Question Answering | Code Available | 0 |
| StructuralLM: Structural Pre-training for Form Understanding | May 24, 2021 | document-image-classification, Document Image Classification | Unverified | 0 |
| Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training | May 24, 2021 | Image Captioning, Medical Visual Question Answering | Code Available | 1 |
| Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training | May 21, 2021 | Question Answering, Relation | Unverified | 0 |
| Multiple Meta-model Quantifying for Medical Visual Question Answering | May 19, 2021 | Medical Visual Question Answering, Meta-Learning | Code Available | 1 |
| Survey of Visual-Semantic Embedding Methods for Zero-Shot Image Retrieval | May 16, 2021 | Graph Generation, Image Captioning | Unverified | 0 |
| Show Why the Answer is Correct! Towards Explainable AI using Compositional Temporal Attention | May 15, 2021 | Question Answering, Visual Question Answering | Unverified | 0 |
| Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules | May 11, 2021 | Question Answering, Visual Question Answering | Code Available | 1 |
| Cross-Modal Generative Augmentation for Visual Question Answering | May 11, 2021 | Data Augmentation, Question Answering | Unverified | 0 |
| Passage Retrieval for Outside-Knowledge Visual Question Answering | May 9, 2021 | Image Captioning, Object | Code Available | 1 |
| Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention | May 5, 2021 | Question Answering, Referring Expression | Unverified | 0 |
| AdaVQA: Overcoming Language Priors with Adapted Margin Cosine Loss | May 5, 2021 | Question Answering, Visual Question Answering | Code Available | 0 |
| Iterated learning for emergent systematicity in VQA | May 3, 2021 | Question Answering, Systematic Generalization | Unverified | 0 |