UFO: A UniFied TransfOrmer for Vision-Language Representation Learning; Nov 19, 2021; Image Captioning, Image-text matching; Unverified
Medical Visual Question Answering: A Survey; Nov 19, 2021; Medical Visual Question Answering, Question Answering; Unverified
Blind VQA on 360° Video via Progressively Learning from Pixels, Frames and Video; Nov 18, 2021; Visual Question Answering (VQA); Code Available
Achieving Human Parity on Visual Question Answering; Nov 17, 2021; Question Answering, Visual Question Answering; Unverified
Co-VQA : Answering by Interactive Sub Question Sequence; Nov 16, 2021; Question Answering, Visual Question Answering; Unverified
Language bias in Visual Question Answering: A Survey and Taxonomy; Nov 16, 2021; Question Answering, Visual Question Answering; Unverified
Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation; Nov 16, 2021; Image Captioning, Knowledge Distillation; Unverified
Uncertainty-based Visual Question Answering: Estimating Semantic Inconsistency between Image and Knowledge Base; Nov 16, 2021; Question Answering, Semantic Similarity; Unverified
ViQuAE, a Dataset for Knowledge-based Visual Question Answering about Named Entities; Nov 16, 2021; Articles, Face Recognition; Code Available
Question-Led Semantic Structure Enhanced Attentions for VQA; Nov 16, 2021; Question Answering, Visual Question Answering; Unverified
Breaking Down Questions for Outside-Knowledge Visual Question Answering; Nov 16, 2021; Graph Neural Network, Question Answering; Unverified
A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models; Nov 16, 2021; Language Modeling, Language Modelling; Unverified
Document AI: Benchmarks, Models and Applications; Nov 16, 2021; Deep Learning, Document AI; Unverified
No-Reference Video Quality Assessment Based on Benford’s Law and Perceptual Features; Nov 12, 2021; No-Reference Image Quality Assessment, Video Quality Assessment; Code Available
Graph Relation Transformer: Incorporating pairwise object features into the Transformer architecture; Nov 11, 2021; Graph Attention, Question Answering; Unverified
ICDAR 2021 Competition on Document Visual Question Answering; Nov 10, 2021; Visual Question Answering (VQA); Unverified
Visual Question Answering based on Formal Logic; Nov 8, 2021; Formal Logic, Question Answering; Unverified
CrossVQA: Scalably Generating Benchmarks for Systematically Testing VQA Generalization; Nov 1, 2021; Answer Generation, Question-Answer-Generation; Unverified
Diversity and Consistency: Exploring Visual Question-Answer Pair Generation; Nov 1, 2021; Diversity, Question Answering; Unverified
MIRTT: Learning Multimodal Interaction Representations from Trilinear Transformers for Visual Question Answering; Nov 1, 2021; multimodal interaction, Multiple-choice; Code Available
Perceptual Score: What Data Modalities Does Your Model Perceive?; Oct 27, 2021; Question Answering, Visual Dialog; Code Available
Subtleties in the trainability of quantum machine learning models; Oct 27, 2021; BIG-bench Machine Learning, Quantum Machine Learning; Unverified
Alignment Attention by Matching Key and Query Distributions; Oct 25, 2021; Graph Attention, Question Answering; Code Available
Robustness through Data Augmentation Loss Consistency; Oct 21, 2021; Multi-domain Dialogue State Tracking, Visual Question Answering; Code Available
Single-Modal Entropy based Active Learning for Visual Question Answering; Oct 21, 2021; Active Learning, Question Answering; Unverified
Evaluating and Improving Interactions with Hazy Oracles; Oct 19, 2021; Object Tracking, Referring Expression; Unverified
Towards Language-guided Visual Recognition via Dynamic Convolutions; Oct 17, 2021; Question Answering, Referring Expression; Code Available
Explore before Moving: A Feasible Path Estimation and Memory Recalling Framework for Embodied Navigation; Oct 16, 2021; Common Sense Reasoning, Embodied Question Answering; Unverified
xGQA: Cross-Lingual Visual Question Answering; Oct 16, 2021; Cross-Lingual Transfer, Language Modeling; Unverified
Guiding Visual Question Generation; Oct 15, 2021; Question Generation, Question-Generation; Unverified
Semantically Distributed Robust Optimization for Vision-and-Language Inference; Oct 14, 2021; Data Augmentation, Natural Language Inference; Code Available
Improving Users' Mental Model with Attention-directed Counterfactual Edits; Oct 13, 2021; counterfactual, Question Answering; Unverified
MMIU: Dataset for Visual Intent Understanding in Multimodal Assistants; Oct 13, 2021; intent-classification, Intent Classification; Unverified
Beyond Accuracy: A Consolidated Tool for Visual Question Answering Benchmarking; Oct 11, 2021; Benchmarking, Question Answering; Code Available
Asking questions on handwritten document collections; Oct 2, 2021; Optical Character Recognition (OCR), Question Answering; Unverified
Breaking Down Questions for Outside-Knowledge VQA; Sep 29, 2021; Graph Neural Network, Question Answering; Unverified
PRNet: A Progressive Regression Network for No-Reference User-Generated-Content Video Quality Assessment; Sep 29, 2021; regression, Video Quality Assessment; Unverified
Crossformer: Transformer with Alternated Cross-Layer Guidance; Sep 29, 2021; Inductive Bias, Machine Translation; Unverified
How Much Can CLIP Benefit Vision-and-Language Tasks?; Sep 29, 2021; Question Answering, Visual Entailment; Unverified
Variational Disentangled Attention for Regularized Visual Dialog; Sep 29, 2021; Question Answering, Visual Dialog; Unverified
Measuring CLEVRness: Black-box Testing of Visual Reasoning Models; Sep 29, 2021; Benchmarking, Diagnostic; Unverified
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering; Sep 27, 2021; Question Answering, Visual Question Answering; Unverified
High Frame Rate Video Quality Assessment using VMAF and Entropic Differences; Sep 27, 2021; Video Quality Assessment, Visual Question Answering (VQA); Unverified
Multimodal Integration of Human-Like Attention in Visual Question Answering; Sep 27, 2021; Question Answering, Visual Question Answering; Unverified
How to find a good image-text embedding for remote sensing visual question answering?; Sep 24, 2021; Question Answering, Visual Question Answering; Unverified
Image Captioning for Effective Use of Language Models in Knowledge-Based Visual Question Answering; Sep 15, 2021; Image Captioning, Knowledge Graphs; Code Available
Discovering the Unknown Knowns: Turning Implicit Knowledge in the Dataset into Explicit Training Examples for Visual Question Answering; Sep 13, 2021; Data Augmentation, Question Answering; Code Available
Towards Developing a Multilingual and Code-Mixed Visual Question Answering System by Knowledge Distillation; Sep 10, 2021; Knowledge Distillation, Question Answering; Unverified
TxT: Crossmodal End-to-End Learning with Transformers; Sep 9, 2021; Multimodal Reasoning, Question Answering; Unverified
Improved RAMEN: Towards Domain Generalization for Visual Question Answering; Sep 6, 2021; Domain Generalization, Question Answering; Code Available