| Multimodal Unified Attention Networks for Vision-and-Language Interactions | Aug 12, 2019 | Question Answering, Visual Grounding | Unverified | 0 |
| Multi-modality Latent Interaction Network for Visual Question Answering | Aug 10, 2019 | Language Modeling, Language Modelling | Unverified | 0 |
| Question-Agnostic Attention for Visual Question Answering | Aug 9, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks | Aug 6, 2019 | Image Retrieval, Question Answering | Code Available | 1 |
| Answering Questions about Data Visualizations using Efficient Bimodal Fusion | Aug 5, 2019 | Chart Question Answering, Optical Character Recognition | Code Available | 0 |
| The Meaning of "Most" for Visual Question Answering Models | Aug 1, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation | Jul 31, 2019 | Conditional Image Generation, Few-Shot Learning | Unverified | 0 |
| An Empirical Study on Leveraging Scene Graphs for Visual Question Answering | Jul 28, 2019 | Knowledge Graphs, Question Answering | Unverified | 0 |
| Bilinear Graph Networks for Visual Question Answering | Jul 23, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| KVQA: Knowledge-Aware Visual Question Answering | Jul 17, 2019 | Knowledge Graphs, Question Answering | Unverified | 0 |
| OmniNet: A unified architecture for multi-modal multi-task learning | Jul 17, 2019 | Image Captioning, Multi-Task Learning | Code Available | 0 |
| 2nd Place Solution to the GQA Challenge 2019 | Jul 16, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Neural Reasoning, Fast and Slow, for Video Question Answering | Jul 10, 2019 | Natural Questions, Question Answering | Unverified | 0 |
| Multi-grained Attention with Object-level Grounding for Visual Question Answering | Jul 1, 2019 | Object, Question Answering | Unverified | 0 |
| ICDAR 2019 Competition on Scene Text Visual Question Answering | Jun 30, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Deep Modular Co-Attention Networks for Visual Question Answering | Jun 25, 2019 | Question Answering, Visual Question Answering | Code Available | 0 |
| RUBi: Reducing Unimodal Biases in Visual Question Answering | Jun 24, 2019 | Question Answering, Visual Question Answering | Code Available | 0 |
| Integrating Knowledge and Reasoning in Image Understanding | Jun 24, 2019 | Object Recognition, Question Answering | Unverified | 0 |
| Adversarial Multimodal Network for Movie Question Answering | Jun 24, 2019 | Question Answering, Video Question Answering | Unverified | 0 |
| Investigating Biases in Textual Entailment Datasets | Jun 23, 2019 | BIG-bench Machine Learning, Natural Language Inference | Unverified | 0 |
| Adversarial Regularization for Visual Question Answering: Strengths, Shortcomings, and Side Effects | Jun 20, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Improving Visual Question Answering by Referring to Generated Paragraph Captions | Jun 14, 2019 | Decoder, Image Captioning | Unverified | 0 |
| Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering | Jun 10, 2019 | Continual Learning, Question Answering | Unverified | 0 |
| Generating Question Relevant Captions to Aid Visual Question Answering | Jun 3, 2019 | General Knowledge, Image Captioning | Unverified | 0 |
| Grounded Word Sense Translation | Jun 1, 2019 | Grounded language learning, Machine Translation | Unverified | 0 |
| ImageTTR: Grounding Type Theory with Records in Image Classification for Visual Question Answering | Jun 1, 2019 | General Classification, image-classification | Unverified | 0 |
| Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering | Jun 1, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge | May 31, 2019 | object-detection, Object Detection | Code Available | 1 |
| Scene Text Visual Question Answering | May 31, 2019 | Question Answering, Visual Question Answering | Code Available | 1 |
| What Can Neural Networks Reason About? | May 30, 2019 | Question Answering, Visual Question Answering | Code Available | 0 |
| Vision-to-Language Tasks Based on Attributes and Attention Mechanism | May 29, 2019 | Image Captioning, Question Answering | Unverified | 0 |
| Leveraging Medical Visual Question Answering with Supporting Facts | May 28, 2019 | Diversity, Medical Visual Question Answering | Unverified | 0 |
| Structure Learning for Neural Module Networks | May 27, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Why do These Match? Explaining the Behavior of Image Similarity Models | May 26, 2019 | Attribute, General Classification | Code Available | 0 |
| Self-Critical Reasoning for Robust Visual Question Answering | May 24, 2019 | Question Answering, Visual Question Answering | Code Available | 0 |
| Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations | May 15, 2019 | Image Captioning, Question Answering | Code Available | 0 |
| Quantifying and Alleviating the Language Prior Problem in Visual Question Answering | May 13, 2019 | Information Retrieval, Question Answering | Code Available | 0 |
| Visual TTR - Modelling Visual Question Answering in Type Theory with Records | May 1, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Routing Networks and the Challenges of Modular and Compositional Computation | Apr 29, 2019 | Language Modeling, Language Modelling | Code Available | 0 |
| The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision | Apr 26, 2019 | Image-text Retrieval, Object | Code Available | 0 |
| Scene Graph Prediction with Limited Labels | Apr 25, 2019 | Knowledge Base Completion, Prediction | Code Available | 0 |
| Question Guided Modular Routing Networks for Visual Question Answering | Apr 17, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Evaluating the Representational Hub of Language and Vision Models | Apr 12, 2019 | Diagnostic, Question Answering | Unverified | 0 |
| Factor Graph Attention | Apr 11, 2019 | Graph Attention, Question Answering | Code Available | 0 |
| Actively Seeking and Learning from Live Data | Apr 5, 2019 | Domain Adaptation, Meta-Learning | Unverified | 0 |
| Can You Explain That? Lucid Explanations Help Human-AI Collaborative Image Retrieval | Apr 5, 2019 | Image Retrieval, Question Answering | Unverified | 0 |
| MMED: A Multi-domain and Multi-modality Event Dataset | Apr 4, 2019 | Articles, Question Answering | Unverified | 0 |
| Relation-Aware Graph Attention Network for Visual Question Answering | Mar 29, 2019 | Graph Attention, Implicit Relations | Code Available | 0 |
| RAVEN: A Dataset for Relational and Analogical Visual rEasoNing | Mar 7, 2019 | Object Recognition, Question Answering | Unverified | 0 |
| Answer Them All! Toward Universal Visual Question Answering Models | Mar 1, 2019 | All, Question Answering | Code Available | 0 |