ICDAR 2021 Competition on Document Visual Question Answering | Nov 10, 2021 | Visual Question Answering (VQA) | Unverified 0
Visual Question Answering based on Formal Logic | Nov 8, 2021 | Formal Logic, Question Answering | Unverified 0
An Empirical Study of Training End-to-End Vision-and-Language Transformers | Nov 3, 2021 | Cross-Modal Retrieval, Decoder | Code Available 1
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts | Nov 3, 2021 | Image Retrieval, Image-text Retrieval | Code Available 1
ViVQA: Vietnamese Visual Question Answering | Nov 1, 2021 | Question Answering, Vietnamese Visual Question Answering | Code Available 1
CrossVQA: Scalably Generating Benchmarks for Systematically Testing VQA Generalization | Nov 1, 2021 | Answer Generation, Question-Answer-Generation | Unverified 0
Diversity and Consistency: Exploring Visual Question-Answer Pair Generation | Nov 1, 2021 | Diversity, Question Answering | Unverified 0
MIRTT: Learning Multimodal Interaction Representations from Trilinear Transformers for Visual Question Answering | Nov 1, 2021 | multimodal interaction, Multiple-choice | Code Available 0
Introspective Distillation for Robust Question Answering | Nov 1, 2021 | counterfactual, Inductive Bias | Code Available 1
Subtleties in the trainability of quantum machine learning models | Oct 27, 2021 | BIG-bench Machine Learning, Quantum Machine Learning | Unverified 0
Perceptual Score: What Data Modalities Does Your Model Perceive? | Oct 27, 2021 | Question Answering, Visual Dialog | Code Available 0
IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning | Oct 25, 2021 | Arithmetic Reasoning, Mathematical Question Answering | Code Available 1
Alignment Attention by Matching Key and Query Distributions | Oct 25, 2021 | Graph Attention, Question Answering | Code Available 0
Single-Modal Entropy based Active Learning for Visual Question Answering | Oct 21, 2021 | Active Learning, Question Answering | Unverified 0
Robustness through Data Augmentation Loss Consistency | Oct 21, 2021 | Multi-domain Dialogue State Tracking, Visual Question Answering | Code Available 0
Evaluating and Improving Interactions with Hazy Oracles | Oct 19, 2021 | Object Tracking, Referring Expression | Unverified 0
Label-Descriptive Patterns and Their Application to Characterizing Classification Errors | Oct 18, 2021 | Descriptive, named-entity-recognition | Code Available 1
Towards Language-guided Visual Recognition via Dynamic Convolutions | Oct 17, 2021 | Question Answering, Referring Expression | Code Available 0
xGQA: Cross-Lingual Visual Question Answering | Oct 16, 2021 | Cross-Lingual Transfer, Language Modeling | Unverified 0
A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models | Oct 16, 2021 | Image Captioning, Language Modeling | Code Available 1
Explore before Moving: A Feasible Path Estimation and Memory Recalling Framework for Embodied Navigation | Oct 16, 2021 | Common Sense Reasoning, Embodied Question Answering | Unverified 0
Guiding Visual Question Generation | Oct 15, 2021 | Question Generation, Question-Generation | Unverified 0
Semantically Distributed Robust Optimization for Vision-and-Language Inference | Oct 14, 2021 | Data Augmentation, Natural Language Inference | Code Available 0
Improving Users' Mental Model with Attention-directed Counterfactual Edits | Oct 13, 2021 | counterfactual, Question Answering | Unverified 0
MMIU: Dataset for Visual Intent Understanding in Multimodal Assistants | Oct 13, 2021 | intent-classification, Intent Classification | Unverified 0
Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos | Oct 11, 2021 | Audio-visual Question Answering, Question Answering | Code Available 1
Beyond Accuracy: A Consolidated Tool for Visual Question Answering Benchmarking | Oct 11, 2021 | Benchmarking, Question Answering | Code Available 0
Coarse-to-Fine Reasoning for Visual Question Answering | Oct 6, 2021 | Question Answering, Visual Question Answering | Code Available 1
Counterfactual Samples Synthesizing and Training for Robust Visual Question Answering | Oct 3, 2021 | counterfactual, Diagnostic | Code Available 1
ProTo: Program-Guided Transformer for Program-Guided Tasks | Oct 2, 2021 | Decision Making, Learning to Execute | Code Available 1
Asking questions on handwritten document collections | Oct 2, 2021 | Optical Character Recognition (OCR), Question Answering | Unverified 0
The Spoon Is in the Sink: Assisting Visually Impaired People in the Kitchen | Oct 1, 2021 | Question Answering, Visual Question Answering | Code Available 1
Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images | Oct 1, 2021 | Question Answering, Visual Question Answering | Code Available 1
Breaking Down Questions for Outside-Knowledge VQA | Sep 29, 2021 | Graph Neural Network, Question Answering | Unverified 0
PRNet: A Progressive Regression Network for No-Reference User-Generated-Content Video Quality Assessment | Sep 29, 2021 | regression, Video Quality Assessment | Unverified 0
Variational Disentangled Attention for Regularized Visual Dialog | Sep 29, 2021 | Question Answering, Visual Dialog | Unverified 0
How Much Can CLIP Benefit Vision-and-Language Tasks? | Sep 29, 2021 | Question Answering, Visual Entailment | Unverified 0
Measuring CLEVRness: Black-box Testing of Visual Reasoning Models | Sep 29, 2021 | Benchmarking, Diagnostic | Unverified 0
Crossformer: Transformer with Alternated Cross-Layer Guidance | Sep 29, 2021 | Inductive Bias, Machine Translation | Unverified 0
High Frame Rate Video Quality Assessment using VMAF and Entropic Differences | Sep 27, 2021 | Video Quality Assessment, Visual Question Answering (VQA) | Unverified 0
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering | Sep 27, 2021 | Question Answering, Visual Question Answering | Unverified 0
Multimodal Integration of Human-Like Attention in Visual Question Answering | Sep 27, 2021 | Question Answering, Visual Question Answering | Unverified 0
How to find a good image-text embedding for remote sensing visual question answering? | Sep 24, 2021 | Question Answering, Visual Question Answering | Unverified 0
Does Vision-and-Language Pretraining Improve Lexical Grounding? | Sep 21, 2021 | Question Answering, Visual Question Answering | Code Available 1
ChipQA: No-Reference Video Quality Prediction via Space-Time Chips | Sep 17, 2021 | Video Quality Assessment, Visual Question Answering (VQA) | Code Available 1
Image Captioning for Effective Use of Language Models in Knowledge-Based Visual Question Answering | Sep 15, 2021 | Image Captioning, Knowledge Graphs | Code Available 0
xGQA: Cross-Lingual Visual Question Answering | Sep 13, 2021 | Cross-Lingual Transfer, Language Modeling | Code Available 1
Discovering the Unknown Knowns: Turning Implicit Knowledge in the Dataset into Explicit Training Examples for Visual Question Answering | Sep 13, 2021 | Data Augmentation, Question Answering | Code Available 0
Towards Developing a Multilingual and Code-Mixed Visual Question Answering System by Knowledge Distillation | Sep 10, 2021 | Knowledge Distillation, Question Answering | Unverified 0
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA | Sep 10, 2021 | Image Captioning, Question Answering | Code Available 1