| Visual Question Answering Using Semantic Information from Image Descriptions | Apr 23, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision | Apr 20, 2020 | counterfactual, image-classification | Unverified | 0 |
| Knowledge-Based Visual Question Answering in Videos | Apr 17, 2020 | Question Answering, Video Question Answering | Unverified | 0 |
| Visual Grounding Methods for VQA are Working for the Wrong Reasons! | Apr 12, 2020 | Question Answering, Visual Grounding | Code Available | 1 |
| An Entropy Clustering Approach for Assessing Visual Question Difficulty | Apr 12, 2020 | Clustering, Question Answering | Code Available | 0 |
| Rephrasing visual questions by specifying the entropy of the answer distribution | Apr 10, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Understanding Knowledge Gaps in Visual Question Answering: Implications for Gap Identification and Testing | Apr 8, 2020 | Diversity, Question Answering | Unverified | 0 |
| Evaluating Multimodal Representations on Visual Semantic Textual Similarity | Apr 4, 2020 | Benchmarking, Image Captioning | Code Available | 1 |
| Generating Rationales in Visual Question Answering | Apr 4, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers | Apr 2, 2020 | Image-text matching, Image-text Retrieval | Code Available | 1 |
| X-Linear Attention Networks for Image Captioning | Mar 31, 2020 | Decoder, Fine-Grained Visual Recognition | Code Available | 1 |
| Assessing Image Quality Issues for Real-World Problems | Mar 27, 2020 | Image Captioning, Question Answering | Unverified | 0 |
| P ≈ NP, at least in Visual Question Answering | Mar 26, 2020 | Question Answering, Visual Question Answering | Code Available | 0 |
| Linguistically Driven Graph Capsule Network for Visual Question Reasoning | Mar 23, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Visual Question Answering for Cultural Heritage | Mar 22, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Normalized and Geometry-Aware Self-Attention Network for Image Captioning | Mar 19, 2020 | Image Captioning, Machine Translation | Unverified | 0 |
| Ground Truth Evaluation of Neural Network Explanations with CLEVR-XAI | Mar 16, 2020 | Benchmarking, Explainable Artificial Intelligence (XAI) | Code Available | 1 |
| RSVQA: Visual Question Answering for Remote Sensing Data | Mar 16, 2020 | Land Cover Classification, Object Counting | Unverified | 0 |
| Counterfactual Samples Synthesizing for Robust Visual Question Answering | Mar 14, 2020 | counterfactual, Question Answering | Code Available | 1 |
| MQA: Answering the Question via Robotic Manipulation | Mar 10, 2020 | Imitation Learning, Question Answering | Code Available | 0 |
| PathVQA: 30000+ Questions for Medical Visual Question Answering | Mar 7, 2020 | AI Agent, Medical Visual Question Answering | Code Available | 1 |
| Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning | Mar 6, 2020 | Density Estimation, Noise Estimation | Code Available | 0 |
| A Question-Centric Model for Visual Question Answering in Medical Imaging | Mar 2, 2020 | Medical Image Analysis, Question Answering | Code Available | 0 |
| A Study on Multimodal and Interactive Explanations for Visual Question Answering | Mar 1, 2020 | Explainable Artificial Intelligence (XAI), Prediction | Unverified | 0 |
| Unshuffling Data for Improved Generalization | Feb 27, 2020 | Clustering, Data Augmentation | Unverified | 0 |
| On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering | Feb 24, 2020 | Question Answering, Referring Expression | Unverified | 0 |
| VQA-LOL: Visual Question Answering under the Lens of Logic | Feb 19, 2020 | Negation, Question Answering | Unverified | 0 |
| CQ-VQA: Visual Question Answering on Categorized Questions | Feb 17, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Sparse and Structured Visual Attention | Feb 13, 2020 | Image Captioning, Question Answering | Code Available | 0 |
| Component Analysis for Visual Question Answering Architectures | Feb 12, 2020 | Question Answering, Representation Learning | Unverified | 0 |
| Multimodal fusion of imaging and genomics for lung cancer recurrence prediction | Feb 5, 2020 | Computed Tomography (CT), Question Answering | Code Available | 1 |
| Augmenting Visual Question Answering with Semantic Frame Information in a Multitask Learning Approach | Jan 31, 2020 | Question Answering, Visual Question Answering | Code Available | 0 |
| Uncertainty based Class Activation Maps for Visual Question Answering | Jan 23, 2020 | Deep Learning, Probabilistic Deep Learning | Unverified | 0 |
| Robust Explanations for Visual Question Answering | Jan 23, 2020 | Question Answering, Visual Question Answering | Code Available | 0 |
| Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models | Jan 20, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| Recommending Themes for Ad Creative Design via Visual-Linguistic Representations | Jan 20, 2020 | Question Answering, Recommendation Systems | Code Available | 0 |
| Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features | Jan 14, 2020 | Classification, Diversity | Code Available | 1 |
| MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding | Jan 11, 2020 | Image Captioning, Image-text Retrieval | Code Available | 0 |
| Visual Question Answering on 360° Images | Jan 10, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| In Defense of Grid Features for Visual Question Answering | Jan 10, 2020 | Image Captioning, Question Answering | Code Available | 1 |
| Multi-Layer Content Interaction Through Quaternion Product For Visual Question Answering | Jan 3, 2020 | Question Answering, Video Description | Unverified | 0 |
| Vision and Language: from Visual Perception to Content Creation | Dec 26, 2019 | Decoder, Question Answering | Unverified | 0 |
| Deep Exemplar Networks for VQA and VQG | Dec 19, 2019 | Decoder, Question Answering | Unverified | 0 |
| Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing | Dec 16, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| AI2D-RST: A multimodal corpus of 1000 primary school science diagrams | Dec 9, 2019 | Question Answering, Visual Question Answering | Unverified | 0 |
| Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks | Dec 6, 2019 | Image Retrieval, Inductive Bias | Unverified | 0 |
| Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline | Dec 5, 2019 | Language Modelling, Representation Learning | Code Available | 1 |
| 12-in-1: Multi-Task Vision and Language Representation Learning | Dec 5, 2019 | 10-shot image generation, Image Retrieval | Code Available | 0 |
| Deep Bayesian Active Learning for Multiple Correct Outputs | Dec 2, 2019 | Active Learning, Answer Generation | Unverified | 0 |
| TAB-VCR: Tags and Attributes based VCR Baselines | Dec 1, 2019 | Attribute, Question Answering | Code Available | 0 |