| A Dataset and Baselines for Visual Question Answering on Art | Aug 28, 2020 | Question Answering, Question Generation | Code Available | 1 |
| Visual Question Answering on Image Sets | Aug 27, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Document Visual Question Answering Challenge 2020 | Aug 20, 2020 | Question Answering, Retrieval | Unverified | 0 |
| DeVLBert: Learning Deconfounded Visio-Linguistic Representations | Aug 16, 2020 | Image Retrieval, Question Answering | Code Available | 1 |
| Assisting Scene Graph Generation with Self-Supervision | Aug 8, 2020 | Graph Generation, Image Captioning | Unverified | 0 |
| TRRNet: Tiered Relation Reasoning for Compositional Visual Question Answering | Aug 1, 2020 | Object, Question Answering | Unverified | 0 |
| Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision | Aug 1, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| REXUP: I REason, I EXtract, I UPdate with Structured Compositional Reasoning for Visual Question Answering | Jul 27, 2020 | Question Answering, Visual Question Answering | Code Available | 0 |
| Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering | Jul 19, 2020 | Adversarial Attack, Data Augmentation | Code Available | 1 |
| Learning to Discretely Compose Reasoning Module Networks for Video Captioning | Jul 17, 2020 | Decoder, Question Answering | Code Available | 1 |
| Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder | Jul 13, 2020 | Question Answering, Visual Grounding | Unverified | 0 |
| Applying recent advances in Visual Question Answering to Record Linkage | Jul 12, 2020 | Question Answering, Visual Question Answering | Code Available | 0 |
| Image Captioning with Compositional Neural Module Networks | Jul 10, 2020 | Image Captioning, Question Answering | Unverified | 0 |
| IQ-VQA: Intelligent Visual Question Answering | Jul 8, 2020 | Question Answering, Visual Question Answering | Code Available | 0 |
| Eliminating Catastrophic Interference with Biased Competition | Jul 3, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Visual Question Answering as a Multi-Task Problem | Jul 3, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| The Impact of Explanations on AI Competency Prediction in VQA | Jul 2, 2020 | AI Agent, Language Modeling | Unverified | 0 |
| Scene Graph Reasoning for Visual Question Answering | Jul 2, 2020 | Navigate, Question Answering | Unverified | 0 |
| DocVQA: A Dataset for VQA on Document Images | Jul 1, 2020 | Question Answering, Reading Comprehension | Code Available | 1 |
| Towards Visual Dialog for Radiology | Jul 1, 2020 | Question Answering, Visual Dialog | Unverified | 0 |
| Aligned Dual Channel Graph Convolutional Network for Visual Question Answering | Jul 1, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Multimodal Neural Graph Memory Networks for Visual Question Answering | Jul 1, 2020 | Graph Neural Network, Question Answering | Unverified | 0 |
| Ontology-guided Semantic Composition for Zero-Shot Learning | Jun 30, 2020 | image-classification, Image Classification | Code Available | 1 |
| Improving VQA and its Explanations by Comparing Competing Explanations | Jun 28, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Graph Optimal Transport for Cross-Domain Alignment | Jun 26, 2020 | Graph Matching, Image Captioning | Code Available | 1 |
| Self-Segregating and Coordinated-Segregating Transformer for Focused Deep Multi-Modular Network for Visual Question Answering | Jun 25, 2020 | Diversity, Question Answering | Unverified | 0 |
| Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning" | Jun 20, 2020 | Graph Generation, Question Answering | Unverified | 0 |
| Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering | Jun 16, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| ORD: Object Relationship Discovery for Visual Dialogue Generation | Jun 15, 2020 | Dialogue Generation, Graph Attention | Unverified | 0 |
| Sparse and Continuous Attention Mechanisms | Jun 12, 2020 | Machine Translation, Question Answering | Code Available | 1 |
| Large-Scale Adversarial Training for Vision-and-Language Representation Learning | Jun 11, 2020 | Image-text Retrieval, Question Answering | Code Available | 1 |
| Exploring Weaknesses of VQA Models through Attribution Driven Insights | Jun 11, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning | Jun 11, 2020 | Question Answering, Reinforcement Learning (RL) | Code Available | 1 |
| Estimating semantic structure for the VQA answer space | Jun 10, 2020 | General Classification, Question Answering | Unverified | 0 |
| Roses Are Red, Violets Are Blue... but Should VQA Expect Them To? | Jun 9, 2020 | Question Answering, Visual Question Answering | Code Available | 1 |
| Attention-Based Context Aware Reasoning for Situation Recognition | Jun 1, 2020 | Action Recognition, Fine-grained Action Recognition | Code Available | 1 |
| Counterfactual Vision and Language Learning | Jun 1, 2020 | counterfactual, Question Answering | Unverified | 0 |
| TA-Student VQA: Multi-Agents Training by Self-Questioning | Jun 1, 2020 | Diversity, Question Answering | Unverified | 0 |
| Multimodal grid features and cell pointers for Scene Text Visual Question Answering | Jun 1, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law | May 19, 2020 | Model Selection, Question Answering | Unverified | 0 |
| Visual Relationship Detection using Scene Graphs: A Survey | May 16, 2020 | Graph Generation, Image Generation | Unverified | 0 |
| Cross-Modality Relevance for Reasoning on Language and Vision | May 12, 2020 | Question Answering, Visual Question Answering | Code Available | 1 |
| COBRA: Contrastive Bi-Modal Representation Algorithm | May 7, 2020 | Cross-Modal Retrieval, Image Captioning | Code Available | 1 |
| Visual Question Answering with Prior Class Semantics | May 4, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| A Corpus for Visual Question Answering Annotated with Frame Semantic Information | May 1, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Image Position Prediction in Multimodal Documents | May 1, 2020 | Articles, Caption Generation | Unverified | 0 |
| Dynamic Language Binding in Relational Visual Reasoning | Apr 30, 2020 | Object, Question Answering | Code Available | 1 |
| Pragmatic Issue-Sensitive Image Captioning | Apr 29, 2020 | Descriptive, Image Captioning | Code Available | 0 |
| A Novel Attention-based Aggregation Function to Combine Vision and Language | Apr 27, 2020 | General Classification, Image Captioning | Unverified | 0 |
| Deep Multimodal Neural Architecture Search | Apr 25, 2020 | Decoder, Image-text matching | Code Available | 1 |