| Explainable High-order Visual Question Reasoning: A New Benchmark and Knowledge-routed Network | Sep 23, 2019 | Question Answering, Triplet | Unverified | 0 |
| Non-monotonic Logical Reasoning Guiding Deep Learning for Explainable Visual Question Answering | Sep 23, 2019 | Inductive Learning, Logical Reasoning | Unverified | 0 |
| Triplet-Aware Scene Graph Embeddings | Sep 19, 2019 | Data Augmentation, Graph Embedding | Unverified | 0 |
| Learning Sparse Mixture of Experts for Visual Question Answering | Sep 19, 2019 | Mixture-of-Experts, Question Answering | Unverified | 0 |
| Inverse Visual Question Answering with Multi-Level Attentions | Sep 17, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation | Sep 10, 2019 | Common Sense Reasoning, Data Augmentation | Unverified | 0 |
| Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering | Sep 4, 2019 | Image Captioning, Object | Unverified | 0 |
| Adversarial Representation Learning for Text-to-Image Matching | Aug 28, 2019 | Image Captioning, Language Modeling | Unverified | 0 |
| Visual Question Answering using Deep Learning: A Survey and Performance Analysis | Aug 27, 2019 | Common Sense Reasoning, Question Answering | Code Available | 0 |
| Language Features Matter: Effective Language Representations for Vision-Language Tasks | Aug 17, 2019 | Image Captioning, Language Modelling | Unverified | 0 |
| U-CAM: Visual Explanation using Uncertainty based Class Activation Maps | Aug 17, 2019 | Deep Learning, Probabilistic Deep Learning | Unverified | 0 |
| What is needed for simple spatial language capabilities in VQA? | Aug 17, 2019 | Diagnostic, Question Answering | Unverified | 0 |
| Reactive Multi-Stage Feature Fusion for Multimodal Dialogue Modeling | Aug 14, 2019 | Question Answering, Scene-Aware Dialogue | Unverified | 0 |
| Fusion of Detected Objects in Text for Visual Question Answering | Aug 14, 2019 | Question Answering, Visual Commonsense Reasoning | Unverified | 0 |
| Why Does a Visual Question Have Different Answers? | Aug 12, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Multimodal Unified Attention Networks for Vision-and-Language Interactions | Aug 12, 2019 | Question Answering, Visual Grounding | Unverified | 0 |
| Multi-modality Latent Interaction Network for Visual Question Answering | Aug 10, 2019 | Language Modeling, Language Modelling | Unverified | 0 |
| Question-Agnostic Attention for Visual Question Answering | Aug 9, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Answering Questions about Data Visualizations using Efficient Bimodal Fusion | Aug 5, 2019 | Chart Question Answering, Optical Character Recognition | Code Available | 0 |
| The Meaning of "Most" for Visual Question Answering Models | Aug 1, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation | Jul 31, 2019 | Conditional Image Generation, Few-Shot Learning | Unverified | 0 |
| An Empirical Study on Leveraging Scene Graphs for Visual Question Answering | Jul 28, 2019 | Knowledge Graphs, Question Answering | Unverified | 0 |
| Bilinear Graph Networks for Visual Question Answering | Jul 23, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| KVQA: Knowledge-Aware Visual Question Answering | Jul 17, 2019 | Knowledge Graphs, Question Answering | Unverified | 0 |
| OmniNet: A unified architecture for multi-modal multi-task learning | Jul 17, 2019 | Image Captioning, Multi-Task Learning | Code Available | 0 |
| 2nd Place Solution to the GQA Challenge 2019 | Jul 16, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Neural Reasoning, Fast and Slow, for Video Question Answering | Jul 10, 2019 | Natural Questions, Question Answering | Unverified | 0 |
| Multi-grained Attention with Object-level Grounding for Visual Question Answering | Jul 1, 2019 | Object, Question Answering | Unverified | 0 |
| ICDAR 2019 Competition on Scene Text Visual Question Answering | Jun 30, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Deep Modular Co-Attention Networks for Visual Question Answering | Jun 25, 2019 | Question Answering, Visual Question Answering | Code Available | 0 |
| Adversarial Multimodal Network for Movie Question Answering | Jun 24, 2019 | Question Answering, Video Question Answering | Unverified | 0 |
| Integrating Knowledge and Reasoning in Image Understanding | Jun 24, 2019 | Object Recognition, Question Answering | Unverified | 0 |
| RUBi: Reducing Unimodal Biases in Visual Question Answering | Jun 24, 2019 | Question Answering, Visual Question Answering | Code Available | 0 |
| Investigating Biases in Textual Entailment Datasets | Jun 23, 2019 | BIG-bench Machine Learning, Natural Language Inference | Unverified | 0 |
| Adversarial Regularization for Visual Question Answering: Strengths, Shortcomings, and Side Effects | Jun 20, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Improving Visual Question Answering by Referring to Generated Paragraph Captions | Jun 14, 2019 | Decoder, Image Captioning | Unverified | 0 |
| Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering | Jun 10, 2019 | Continual Learning, Question Answering | Unverified | 0 |
| Generating Question Relevant Captions to Aid Visual Question Answering | Jun 3, 2019 | General Knowledge, Image Captioning | Unverified | 0 |
| Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering | Jun 1, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Grounded Word Sense Translation | Jun 1, 2019 | Grounded language learning, Machine Translation | Unverified | 0 |
| ImageTTR: Grounding Type Theory with Records in Image Classification for Visual Question Answering | Jun 1, 2019 | General Classification, image-classification | Unverified | 0 |
| What Can Neural Networks Reason About? | May 30, 2019 | Question Answering, Visual Question Answering | Code Available | 0 |
| Vision-to-Language Tasks Based on Attributes and Attention Mechanism | May 29, 2019 | Image Captioning, Question Answering | Unverified | 0 |
| Leveraging Medical Visual Question Answering with Supporting Facts | May 28, 2019 | Diversity, Medical Visual Question Answering | Unverified | 0 |
| Structure Learning for Neural Module Networks | May 27, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Why do These Match? Explaining the Behavior of Image Similarity Models | May 26, 2019 | Attribute, General Classification | Code Available | 0 |
| Self-Critical Reasoning for Robust Visual Question Answering | May 24, 2019 | Question Answering, Visual Question Answering | Code Available | 0 |
| Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations | May 15, 2019 | Image Captioning, Question Answering | Code Available | 0 |
| Quantifying and Alleviating the Language Prior Problem in Visual Question Answering | May 13, 2019 | Information Retrieval, Question Answering | Code Available | 0 |
| Visual TTR - Modelling Visual Question Answering in Type Theory with Records | May 1, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |