Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning Mar 6, 2020 Density Estimation Noise Estimation
Adaptively Clustering Neighbor Elements for Image-Text Generation Jan 5, 2023 Clustering Decoder
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations May 15, 2019 Image Captioning Question Answering
No Images, No Problem: Retaining Knowledge in Continual VQA with Questions-Only Memory Feb 6, 2025 Continual Learning Question Answering
Noise-Induced Barren Plateaus in Variational Quantum Algorithms Jul 28, 2020 Visual Question Answering (VQA)
Neural Module Networks Nov 9, 2015 Visual Question Answering Visual Question Answering (VQA)
NeSyCoCo: A Neuro-Symbolic Concept Composer for Compositional Generalization Dec 20, 2024 Compositional Generalization (AVG) Novel Concepts
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding Oct 4, 2018 Question Answering Representation Learning
Composition Vision-Language Understanding via Segment and Depth Anything Model Jun 7, 2024 Question Answering Visual Question Answering (VQA)
Navigating Cultural Chasms: Exploring and Unlocking the Cultural POV of Text-To-Image Models Oct 3, 2023 Image Generation Visual Question Answering (VQA)
NAAQA: A Neural Architecture for Acoustic Question Answering Jun 11, 2021 Acoustic Question Answering Question Answering
MUREL: Multimodal Relational Reasoning for Visual Question Answering Feb 25, 2019 Relational Reasoning Visual Question Answering
Multi-Target Embodied Question Answering Apr 9, 2019 Embodied Question Answering Navigate
MUTAN: Multimodal Tucker Fusion for Visual Question Answering May 18, 2017 Visual Question Answering Visual Question Answering (VQA)
Multiscale Byte Language Models -- A Hierarchical Architecture for Causal Million-Length Sequence Modeling Feb 20, 2025 Decoder GPU
Grad-CAM: Why did you say that? Nov 22, 2016 Image Captioning Visual Question Answering
Multi-Sourced Compositional Generalization in Visual Question Answering May 29, 2025 Question Answering Visual Question Answering
No-Reference Video Quality Assessment Based on Benford’s Law and Perceptual Features Nov 12, 2021 No-Reference Image Quality Assessment Video Quality Assessment
GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models Aug 29, 2024 Bias Detection Fairness
Compact Trilinear Interaction for Visual Question Answering Sep 26, 2019 Benchmarking Knowledge Distillation
Multimodal Residual Learning for Visual QA Jun 5, 2016 Multiple-choice Question Answering
CommVQA: Situating Visual Question Answering in Communicative Contexts Feb 22, 2024 Question Answering Visual Question Answering
Multimodal Hypothetical Summary for Retrieval-based Multi-image Question Answering Dec 19, 2024 Contrastive Learning Language Modeling
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding Jun 6, 2016 Phrase Grounding Visual Grounding
Multimodal Explanations: Justifying Decisions and Pointing to the Evidence Feb 15, 2018 Activity Recognition Explainable Models
Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering Aug 4, 2017 Question Answering Visual Question Answering
Multi-Page Document Visual Question Answering using Self-Attention Scoring Mechanism Apr 29, 2024 document understanding GPU
Multi-Image Visual Question Answering Dec 27, 2021 Question Answering Visual Question Answering
COLUMBUS: Evaluating COgnitive Lateral Understanding through Multiple-choice reBUSes Sep 6, 2024 Multiple-choice Question Answering
Adapting Lightweight Vision Language Models for Radiological Visual Question Answering Jun 17, 2025 Diagnostic Question Answering
Generalizing Visual Question Answering from Synthetic to Human-Written Questions via a Chain of QA with a Large Language Model Jan 12, 2024 Language Modeling Language Modelling
MQA: Answering the Question via Robotic Manipulation Mar 10, 2020 Imitation Learning Question Answering
General Greedy De-bias Learning Dec 20, 2021 image-classification Image Classification
Cognitive Visual Commonsense Reasoning Using Dynamic Working Memory Jul 4, 2021 Question Answering Scene Understanding
Multiple interaction learning with question-type prior knowledge for constraining answer search space in visual question answering Sep 23, 2020 Question Answering Visual Question Answering
No-Reference Video Quality Assessment Using Space-Time Chips Aug 23, 2020 Video Quality Assessment Visual Question Answering (VQA)
Modeling Relationships in Referential Expressions with Compositional Modular Networks Nov 30, 2016 Visual Question Answering (VQA)
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering May 9, 2016 Question Answering Visual Question Answering
Compositionality as Lexical Symmetry Jan 30, 2022 Data Augmentation Inductive Bias
Modularized Zero-shot VQA with Pre-trained Models May 27, 2023 object-detection Object Detection
Multiview Contrastive Learning for Completely Blind Video Quality Assessment of User Generated Content Jul 13, 2022 Contrastive Learning Optical Flow Estimation
GAMIVAL: Video Quality Prediction on Mobile Cloud Gaming Content May 3, 2023 Video Quality Assessment Visual Question Answering (VQA)
Game of Sketches: Deep Recurrent Models of Pictionary-style Word Guessing Jan 29, 2018 Question Answering Visual Question Answering
FVQ: A Large-Scale Dataset and A LMM-based Method for Face Video Quality Assessment Apr 12, 2025 Video Quality Assessment Visual Question Answering (VQA)
Co-attending Regions and Detections with Multi-modal Multiplicative Embedding for VQA Nov 18, 2017 Form Question Answering
Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering Nov 18, 2017 Form Visual Question Answering
Compressing And Debiasing Vision-Language Pre-Trained Models for Visual Question Answering Oct 26, 2022 Question Answering Visual Question Answering
A Joint Sequence Fusion Model for Video Question Answering and Retrieval Aug 7, 2018 Decoder Multiple-choice
Adaptive loose optimization for robust question answering May 6, 2023 Extractive Question-Answering Machine Reading Comprehension
Modulating early visual processing by language Jul 2, 2017 Question Answering Visual Question Answering