VisQA: X-raying Vision and Language Reasoning in Transformers | Apr 2, 2021 | Question Answering, Visual Question Answering | Code Available 1
'Just because you are right, doesn't mean I am wrong': Overcoming a bottleneck in development and evaluation of Open-Ended VQA tasks | Apr 1, 2021 | Question Answering, Visual Question Answering | Unverified 0
Towards General Purpose Vision Systems | Apr 1, 2021 | Question Answering, Visual Question Answering | Code Available 1
UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training | Apr 1, 2021 | Image-text matching, Image-text Retrieval | Unverified 0
Are Bias Mitigation Techniques for Deep Learning Effective? | Apr 1, 2021 | Deep Learning, Question Answering | Code Available 1
Analysis on Image Set Visual Question Answering | Mar 31, 2021 | Question Answering, Visual Question Answering | Unverified 0
Domain-robust VQA with diverse datasets and methods but no target labels | Mar 29, 2021 | Domain Adaptation, Object Recognition | Unverified 0
SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events | Mar 29, 2021 | Autonomous Vehicles, Benchmarking | Code Available 1
Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers | Mar 29, 2021 | Decoder, Image Segmentation | Code Available 1
'Just because you are right, doesn't mean I am wrong': Overcoming a Bottleneck in the Development and Evaluation of Open-Ended Visual Question Answering (VQA) Tasks | Mar 28, 2021 | Question Answering, Visual Question Answering | Code Available 0
Generating and Evaluating Explanations of Attended and Error-Inducing Input Regions for VQA Models | Mar 26, 2021 | Question Answering, Visual Question Answering | Unverified 0
On the hidden treasure of dialog in video question answering | Mar 26, 2021 | Question Answering, Video Question Answering | Code Available 1
Visual Grounding Strategies for Text-Only Natural Language Processing | Mar 25, 2021 | Image Retrieval, Language Modeling | Unverified 0
Multi-Modal Answer Validation for Knowledge-Based VQA | Mar 23, 2021 | Question Answering, Retrieval | Code Available 1
How to Design Sample and Computationally Efficient VQA Models | Mar 22, 2021 | Question Answering, Visual Question Answering | Unverified 0
A Comprehensive Survey of Scene Graphs: Generation and Application | Mar 17, 2021 | Image Captioning, Question Answering | Unverified 0
Automatic Generation of Contrast Sets from Scene Graphs: Probing the Compositional Consistency of GQA | Mar 17, 2021 | Question Answering, Relational Reasoning | Code Available 0
VMAF And Variants: Towards A Unified VQA | Mar 13, 2021 | feature selection, regression | Unverified 0
Characterizing Misclassifications of Deep NLP Models | Mar 12, 2021 | named-entity-recognition, Named Entity Recognition | Unverified 0
RL-CSDia: Representation Learning of Computer Science Diagrams | Mar 10, 2021 | Question Answering, Representation Learning | Unverified 0
Select, Substitute, Search: A New Benchmark for Knowledge-Augmented Visual Question Answering | Mar 9, 2021 | Optical Character Recognition (OCR), Question Answering | Code Available 0
Contextual Dropout: An Efficient Sample-Dependent Dropout Module | Mar 6, 2021 | image-classification, Image Classification | Code Available 0
Visual Question Answering: which investigated applications? | Mar 4, 2021 | Image Captioning, Question Answering | Code Available 0
Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues | Mar 1, 2021 | Question Answering, Visual Question Answering | Unverified 0
Learning Compositional Representation for Few-shot Visual Question Answering | Feb 21, 2021 | Attribute, Question Answering | Unverified 0
SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering | Feb 18, 2021 | Medical Visual Question Answering, Question Answering | Code Available 1
Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer | Feb 18, 2021 | Decoder, Document Image Classification | Code Available 1
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts | Feb 17, 2021 | Caption Generation, Diversity | Code Available 1
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling | Feb 11, 2021 | Question Answering, Retrieval | Code Available 1
ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision | Feb 5, 2021 | Cross-Modal Retrieval, Image Retrieval | Code Available 1
Unifying Vision-and-Language Tasks via Text Generation | Feb 4, 2021 | Conditional Text Generation, Decoder | Code Available 1
Answer Questions with Right Image Regions: A Visual Attention Regularization Approach | Feb 3, 2021 | Question Answering, Visual Grounding | Code Available 0
An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games | Jan 31, 2021 | Question Answering, Visual Question Answering | Unverified 0
VisualMRC: Machine Reading Comprehension on Document Images | Jan 27, 2021 | Machine Reading Comprehension, Natural Language Understanding | Code Available 1
Unanswerable Questions about Images and Texts | Jan 25, 2021 | Question Answering, Visual Question Answering | Unverified 0
Visual Question Answering based on Local-Scene-Aware Referring Expression Generation | Jan 22, 2021 | Question Answering, Referring Expression | Unverified 0
Understanding in Artificial Intelligence | Jan 17, 2021 | Natural Language Understanding, Question Answering | Unverified 0
Latent Variable Models for Visual Question Answering | Jan 16, 2021 | Benchmarking, Question Answering | Unverified 0
Recent Advances in Video Question Answering: A Review of Datasets and Methods | Jan 15, 2021 | Information Retrieval, Machine Translation | Unverified 0
Reasoning over Vision and Language: Exploring the Benefits of Supplemental Knowledge | Jan 15, 2021 | Question Answering, Visual Question Answering (VQA) | Unverified 0
Understanding the Role of Scene Graphs in Visual Question Answering | Jan 14, 2021 | Graph Generation, Question Answering | Unverified 0
Predicting Relative Depth between Objects from Semantic Features | Jan 12, 2021 | Question Answering, Visual Question Answering | Unverified 0
Self Supervision for Attention Networks | Jan 6, 2021 | image-classification, Image Classification | Code Available 0
Transformers in Vision: A Survey | Jan 4, 2021 | Action Recognition, Activity Recognition | Unverified 0
Hierarchical Graph Attention Network for Few-Shot Visual-Semantic Learning | Jan 1, 2021 | Graph Attention, Image Captioning | Unverified 0
Multimodal Co-Attention Transformer for Survival Prediction in Gigapixel Whole Slide Images | Jan 1, 2021 | Attribute, Multiple Instance Learning | Code Available 1
Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos | Jan 1, 2021 | Audio-visual Question Answering, Question Answering | Code Available 1
MDETR - Modulated Detection for End-to-End Multi-Modal Understanding | Jan 1, 2021 | Phrase Grounding, Question Answering | Code Available 2
TRAR: Routing the Attention Spans in Transformer for Visual Question Answering | Jan 1, 2021 | Question Answering, Referring Expression | Code Available 1
Unsupervised Curriculum Domain Adaptation for No-Reference Video Quality Assessment | Jan 1, 2021 | Domain Adaptation, Unsupervised Domain Adaptation | Code Available 1