TxT: Crossmodal End-to-End Learning with Transformers | Sep 9, 2021 | Multimodal Reasoning Question Answering | Unverified 0
Weakly-Supervised Visual-Retriever-Reader for Knowledge-based Question Answering | Sep 9, 2021 | Question Answering Retrieval | Code Available 1
GeneAnnotator: A Semi-automatic Annotation Tool for Visual Scene Graph | Sep 6, 2021 | Graph Generation Graph Learning | Code Available 1
Improved RAMEN: Towards Domain Generalization for Visual Question Answering | Sep 6, 2021 | Domain Generalization Question Answering | Code Available 0
Weakly Supervised Relative Spatial Reasoning for Visual Question Answering | Sep 4, 2021 | Question Answering Spatial Reasoning | Code Available 0
A review of Quantum Neural Networks: Methods, Models, Dilemma | Sep 4, 2021 | Computational Efficiency Visual Question Answering (VQA) | Unverified 0
WebQA: Multihop and Multimodal QA | Sep 1, 2021 | Image Retrieval Multimodal Reasoning | Code Available 1
QACE: Asking Questions to Evaluate an Image Caption | Aug 28, 2021 | Question Answering Visual Question Answering (VQA) | Code Available 0
On the Significance of Question Encoder Sequence Model in the Out-of-Distribution Performance in Visual Question Answering | Aug 28, 2021 | Graph Attention Question Answering | Unverified 0
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision | Aug 24, 2021 | Image Captioning Language Modeling | Code Available 1
Auto-Parsing Network for Image Captioning and Visual Question Answering | Aug 24, 2021 | Image Captioning Question Answering | Unverified 0
EKTVQA: Generalized use of External Knowledge to empower Scene Text in Text-VQA | Aug 22, 2021 | Open-Ended Question Answering Optical Character Recognition (OCR) | Unverified 0
StarVQA: Space-Time Attention for Video Quality Assessment | Aug 22, 2021 | Video Quality Assessment Visual Question Answering (VQA) | Code Available 0
Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling | Aug 20, 2021 | Data Ablation Optical Character Recognition | Unverified 0
Blindly Assess Quality of In-the-Wild Videos via Quality-aware Pre-training and Motion Perception | Aug 19, 2021 | Action Recognition Image Quality Assessment | Code Available 1
X-modaler: A Versatile and High-performance Codebase for Cross-modal Analytics | Aug 18, 2021 | Cross-Modal Retrieval Decoder | Code Available 1
VALSE: A Task-Independent Benchmark for Vision and Language Models centered on Linguistic Phenomena | Aug 17, 2021 | Question Answering Visual Question Answering | Unverified 0
Task-Oriented Multi-User Semantic Communications for VQA Task | Aug 16, 2021 | Question Answering Semantic Communication | Code Available 1
BERTHop: An Effective Vision-and-Language Model for Chest X-ray Disease Diagnosis | Aug 10, 2021 | Language Modeling Language Modelling | Code Available 0
Sparse Continuous Distributions and Fenchel-Young Losses | Aug 4, 2021 | Audio Classification Question Answering | Code Available 1
LRRA: A Transparent Neural-Symbolic Reasoning Framework for Real-World Visual Question Answering | Aug 1, 2021 | Question Answering Visual Question Answering | Unverified 0
利用图像描述与知识图谱增强表示的视觉问答 (Exploiting Image Captions and External Knowledge as Representation Enhancement for Visual Question Answering) | Aug 1, 2021 | Image Captioning Question Answering | Unverified 0
Check It Again: Progressive Visual Question Answering via Visual Entailment | Aug 1, 2021 | Question Answering Visual Entailment | Code Available 1
Towards Visual Question Answering on Pathology Images | Aug 1, 2021 | Decision Making Question Answering | Code Available 0
In Factuality: Efficient Integration of Relevant Facts for Visual Question Answering | Aug 1, 2021 | Question Answering Visual Question Answering | Unverified 0
Greedy Gradient Ensemble for Robust Visual Question Answering | Jul 27, 2021 | Question Answering Visual Question Answering | Code Available 1
X-GGM: Graph Generative Modeling for Out-of-Distribution Generalization in Visual Question Answering | Jul 24, 2021 | Attribute Out-of-Distribution Generalization | Code Available 0
Separating Skills and Concepts for Novel Visual Question Answering | Jul 19, 2021 | Attribute Contrastive Learning | Code Available 1
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation | Jul 16, 2021 | Cross-Modal Retrieval Grounded language learning | Code Available 1
How Much Can CLIP Benefit Vision-and-Language Tasks? | Jul 13, 2021 | Question Answering Vision and Language Navigation | Code Available 1
Graphhopper: Multi-Hop Scene Graph Reasoning for Visual Question Answering | Jul 13, 2021 | Navigate Question Answering | Code Available 1
Zero-shot Visual Question Answering using Knowledge Graph | Jul 12, 2021 | Knowledge Graphs Question Answering | Code Available 1
DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question Answering | Jul 10, 2021 | Graph Attention Question Answering | Code Available 1
MuVAM: A Multi-View Attention-based Model for Medical Visual Question Answering | Jul 7, 2021 | Medical Visual Question Answering Missing Labels | Unverified 0
Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering | Jul 6, 2021 | Active Learning Object Recognition | Code Available 1
Cognitive Visual Commonsense Reasoning Using Dynamic Working Memory | Jul 4, 2021 | Question Answering Scene Understanding | Code Available 0
Adventurer's Treasure Hunt: A Transparent System for Visually Grounded Compositional Visual Question Answering based on Scene Graphs | Jun 28, 2021 | Question Answering Task 2 | Unverified 0
Multimodal Few-Shot Learning with Frozen Language Models | Jun 25, 2021 | Few-Shot Learning Language Modeling | Unverified 0
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training | Jun 25, 2021 | Image-text Retrieval Question Answering | Unverified 0
A Picture May Be Worth a Hundred Words for Visual Question Answering | Jun 25, 2021 | Data Augmentation Descriptive | Unverified 0
FOVQA: Blind Foveated Video Quality Assessment | Jun 24, 2021 | Video Compression Video Quality Assessment | Unverified 0
A Transformer-based Cross-modal Fusion Model with Adversarial Training for VQA Challenge 2021 | Jun 24, 2021 | Visual Question Answering (VQA) | Unverified 0
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions | Jun 19, 2021 | Question Answering Video Question Answering | Code Available 1
Perception Matters: Detecting Perception Failures of VQA Models Using Metamorphic Testing | Jun 19, 2021 | Benchmarking DNN Testing | Code Available 1
Predicting Human Scanpaths in Visual Question Answering | Jun 19, 2021 | Deep Reinforcement Learning Question Answering | Code Available 1
RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words | Jun 19, 2021 | Decoder Image Captioning | Code Available 1
VQA-Aid: Visual Question Answering for Post-Disaster Damage Assessment and Analysis | Jun 19, 2021 | Question Answering Visual Question Answering | Unverified 0
Probing Image-Language Transformers for Verb Understanding | Jun 16, 2021 | Image Retrieval Question Answering | Code Available 1
Assessment of Subjective and Objective Quality of Live Streaming Sports Videos | Jun 15, 2021 | Video Quality Assessment Visual Question Answering (VQA) | Unverified 0
How Modular Should Neural Module Networks Be for Systematic Generalization? | Jun 15, 2021 | Question Answering Systematic Generalization | Code Available 0