| RUBi: Reducing Unimodal Biases for Visual Question Answering | Dec 1, 2019 | Question Answering, Visual Question Answering | Code Available | 0 |
| A Free Lunch in Generating Datasets: Building a VQG and VQA System with Attention and Humans in the Loop | Nov 30, 2019 | Question Answering, Question Generation | Unverified | 0 |
| Assessing the Robustness of Visual Question Answering Models | Nov 30, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Unsupervised Keyword Extraction for Full-sentence VQA | Nov 23, 2019 | Keyword Extraction, Question Answering | Unverified | 0 |
| Temporal Reasoning via Audio Question Answering | Nov 21, 2019 | Audio Question Answering, Diagnostic | Code Available | 0 |
| Explanation vs Attention: A Two-Player Game to Obtain Attention for VQA | Nov 19, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue | Nov 17, 2019 | feature selection, Question Answering | Code Available | 0 |
| Question-Conditioned Counterfactual Image Generation for VQA | Nov 14, 2019 | counterfactual, Image Generation | Unverified | 0 |
| Open-Ended Visual Question Answering by Multi-Modal Domain Adaptation | Nov 11, 2019 | Domain Adaptation, Question Answering | Unverified | 0 |
| Multimodal Intelligence: Representation Learning, Information Fusion, and Applications | Nov 10, 2019 | Caption Generation, Image Generation | Unverified | 0 |
| Are we asking the right questions in MovieQA? | Nov 8, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Representing Movie Characters in Dialogues | Nov 1, 2019 | Question Answering, Relation Classification | Unverified | 0 |
| YouMakeup: A Large-Scale Domain-Specific Multimodal Dataset for Fine-Grained Semantic Comprehension | Nov 1, 2019 | Caption Generation, Question Answering | Unverified | 0 |
| TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines | Oct 31, 2019 | Attribute, Question Answering | Code Available | 0 |
| Learning Rich Image Region Representation for Visual Question Answering | Oct 29, 2019 | Language Modeling, Language Modelling | Unverified | 0 |
| Enforcing Reasoning in Visual Commonsense Reasoning | Oct 21, 2019 | Question Answering, Reinforcement Learning | Unverified | 0 |
| Good, Better, Best: Textual Distractors Generation for Multiple-Choice Visual Question Answering via Reinforcement Learning | Oct 21, 2019 | Data Augmentation, Decision Making | Unverified | 0 |
| Neural Memory Plasticity for Anomaly Detection | Oct 12, 2019 | Anomaly Detection, EEG | Unverified | 0 |
| Multi-modal Deep Analysis for Multimedia | Oct 11, 2019 | Multi-modal Recommendation, Question Answering | Unverified | 0 |
| Modulated Self-attention Convolutional Network for VQA | Oct 8, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| REMIND Your Neural Network to Prevent Catastrophic Forgetting | Oct 6, 2019 | Quantization, Question Answering | Code Available | 0 |
| SegEQA: Video Segmentation Based Visual Attention for Embodied Question Answering | Oct 1, 2019 | Embodied Question Answering, Question Answering | Unverified | 0 |
| From Strings to Things: Knowledge-Enabled VQA Model That Can Read and Reason | Oct 1, 2019 | Graph Neural Network, Question Answering | Unverified | 0 |
| On Incorporating Semantic Prior Knowledge in Deep Learning Through Embedding-Space Constraints | Sep 30, 2019 | Data Augmentation, Question Answering | Unverified | 0 |
| Overcoming Data Limitation in Medical Visual Question Answering | Sep 26, 2019 | Denoising, Medical Visual Question Answering | Code Available | 1 |
| Compact Trilinear Interaction for Visual Question Answering | Sep 26, 2019 | Benchmarking, Knowledge Distillation | Code Available | 0 |
| UNITER: Learning UNiversal Image-TExt Representations | Sep 25, 2019 | Image-text matching, Image-text Retrieval | Unverified | 0 |
| Why Does the VQA Model Answer No?: Improving Reasoning through Visual and Linguistic Inference | Sep 25, 2019 | Common Sense Reasoning, Question Answering | Unverified | 0 |
| Learning to Recognize the Unseen Visual Predicates | Sep 25, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| On Incorporating Semantic Prior Knowledge in Deep Learning Through Embedding-Space Constraints | Sep 25, 2019 | Data Augmentation, Question Answering | Unverified | 0 |
| UNITER: UNiversal Image-TExt Representation Learning | Sep 25, 2019 | Image-text matching, Image-text Retrieval | Code Available | 1 |
| Unified Vision-Language Pre-Training for Image Captioning and VQA | Sep 24, 2019 | Decoder, Image Captioning | Code Available | 2 |
| Non-monotonic Logical Reasoning Guiding Deep Learning for Explainable Visual Question Answering | Sep 23, 2019 | Inductive Learning, Logical Reasoning | Unverified | 0 |
| Explainable High-order Visual Question Reasoning: A New Benchmark and Knowledge-routed Network | Sep 23, 2019 | Question Answering, Triplet | Unverified | 0 |
| Triplet-Aware Scene Graph Embeddings | Sep 19, 2019 | Data Augmentation, Graph Embedding | Unverified | 0 |
| Learning Sparse Mixture of Experts for Visual Question Answering | Sep 19, 2019 | Mixture-of-Experts, Question Answering | Unverified | 0 |
| Inverse Visual Question Answering with Multi-Level Attentions | Sep 17, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation | Sep 10, 2019 | Common Sense Reasoning, Data Augmentation | Unverified | 0 |
| Don't Take the Easy Way Out: Ensemble Based Methods for Avoiding Known Dataset Biases | Sep 9, 2019 | Natural Language Inference, Question Answering | Code Available | 1 |
| Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering | Sep 4, 2019 | Image Captioning, Object | Unverified | 0 |
| Adversarial Representation Learning for Text-to-Image Matching | Aug 28, 2019 | Image Captioning, Language Modeling | Unverified | 0 |
| Visual Question Answering using Deep Learning: A Survey and Performance Analysis | Aug 27, 2019 | Common Sense Reasoning, Question Answering | Code Available | 0 |
| VL-BERT: Pre-training of Generic Visual-Linguistic Representations | Aug 22, 2019 | Image-text matching, Language Modelling | Code Available | 1 |
| LXMERT: Learning Cross-Modality Encoder Representations from Transformers | Aug 20, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| What is needed for simple spatial language capabilities in VQA? | Aug 17, 2019 | Diagnostic, Question Answering | Unverified | 0 |
| U-CAM: Visual Explanation using Uncertainty based Class Activation Maps | Aug 17, 2019 | Deep Learning, Probabilistic Deep Learning | Unverified | 0 |
| Language Features Matter: Effective Language Representations for Vision-Language Tasks | Aug 17, 2019 | Image Captioning, Language Modelling | Unverified | 0 |
| Fusion of Detected Objects in Text for Visual Question Answering | Aug 14, 2019 | Question Answering, Visual Commonsense Reasoning | Unverified | 0 |
| Reactive Multi-Stage Feature Fusion for Multimodal Dialogue Modeling | Aug 14, 2019 | Question Answering, Scene-Aware Dialogue | Unverified | 0 |
| Why Does a Visual Question Have Different Answers? | Aug 12, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |