| Recent, rapid advancement in visual question answering architecture: a review | Mar 2, 2022 | Question Answering, Visual Question Answering | Unverified | 0 |
| On Modality Bias Recognition and Reduction | Feb 25, 2022 | Action Recognition, Multi-modal Classification | Code Available | 0 |
| Joint Answering and Explanation for Visual Commonsense Reasoning | Feb 25, 2022 | Knowledge Distillation, Question Answering | Code Available | 0 |
| Measuring CLEVRness: Blackbox testing of Visual Reasoning Models | Feb 24, 2022 | Benchmarking, Diagnostic | Unverified | 0 |
| OG-SGG: Ontology-Guided Scene Graph Generation. A Case Study in Transfer Learning for Telepresence Robotics | Feb 21, 2022 | BIG-bench Machine Learning, Graph Generation | Code Available | 0 |
| Vision-Language Pre-Training with Triple Contrastive Learning | Feb 21, 2022 | Contrastive Learning, cross-modal alignment | Code Available | 2 |
| Privacy Preserving Visual Question Answering | Feb 15, 2022 | Privacy Preserving, Question Answering | Unverified | 0 |
| Delving Deeper into Cross-lingual Visual Question Answering | Feb 15, 2022 | Inductive Bias, Question Answering | Code Available | 0 |
| An experimental study of the vision-bottleneck in VQA | Feb 14, 2022 | Object, Question Answering | Unverified | 0 |
| Can Open Domain Question Answering Systems Answer Visual Knowledge Questions? | Feb 9, 2022 | Open-Domain Question Answering, Question Answering | Unverified | 0 |
| NEWSKVQA: Knowledge-Aware News Video Question Answering | Feb 8, 2022 | Common Sense Reasoning, Management | Unverified | 0 |
| OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework | Feb 7, 2022 | Image Captioning, image-classification | Code Available | 0 |
| Grounding Answers for Visual Questions Asked by Visually Impaired People | Feb 4, 2022 | Question Answering, Visual Question Answering | Code Available | 0 |
| Compositionality as Lexical Symmetry | Jan 30, 2022 | Data Augmentation, Inductive Bias | Code Available | 0 |
| Transformer Module Networks for Systematic Generalization in Visual Question Answering | Jan 27, 2022 | Question Answering, Systematic Generalization | Code Available | 0 |
| IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages | Jan 27, 2022 | Cross-Modal Retrieval, Few-Shot Learning | Code Available | 1 |
| Learning to Compose Diversified Prompts for Image Emotion Classification | Jan 26, 2022 | Classification, Emotion Classification | Unverified | 0 |
| MGA-VQA: Multi-Granularity Alignment for Visual Question Answering | Jan 25, 2022 | Question Answering, Visual Question Answering | Unverified | 0 |
| SA-VQA: Structured Alignment of Visual and Semantic Representations for Visual Question Answering | Jan 25, 2022 | Question Answering, Visual Question Answering | Unverified | 0 |
| Question Generation for Evaluating Cross-Dataset Shifts in Multi-modal Grounding | Jan 24, 2022 | Question Answering, Question Generation | Unverified | 0 |
| MANGO: Enhancing the Robustness of VQA Models via Adversarial Noise Generation | Jan 16, 2022 | Logical Reasoning, Question Answering | Unverified | 0 |
| Task Formulation Matters When Learning Continuously: A Case Study in Visual Question Answering | Jan 16, 2022 | Continual Learning, Incremental Learning | Unverified | 0 |
| Retrieving Visual Facts For Few-Shot Visual Question Answering | Jan 16, 2022 | Language Modeling, Language Modelling | Unverified | 0 |
| Probing the Role of Positional Information in Vision-Language Models | Jan 16, 2022 | Contrastive Learning, Image-text matching | Unverified | 0 |
| All You May Need for VQA are Image Captions | Jan 16, 2022 | All, Image Captioning | Unverified | 0 |