Neural Self Talk: Image Understanding via Continuous Questioning and Answering | Dec 10, 2015 | Question Answering Question Generation | Unverified | 0
NeurIPS 2023 Competition: Privacy Preserving Federated Learning Document VQA | Nov 6, 2024 | Federated Learning Language Modelling | Unverified | 0
Neuro-Symbolic Spatio-Temporal Reasoning | Nov 28, 2022 | AI Agent Image Segmentation | Unverified | 0
Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning" | Jun 20, 2020 | Graph Generation Question Answering | Unverified | 0
Neuro-Symbolic VQA: A review from the perspective of AGI desiderata | Apr 13, 2021 | Question Answering Visual Question Answering | Unverified | 0
New Ideas and Trends in Deep Multimodal Content Understanding: A Review | Oct 16, 2020 | Cross-Modal Retrieval Deep Learning | Unverified | 0
NEWSKVQA: Knowledge-Aware News Video Question Answering | Feb 8, 2022 | Common Sense Reasoning Management | Unverified | 0
NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning | Jul 9, 2018 | General Classification Machine Translation | Unverified | 0
Non-monotonic Logical Reasoning Guiding Deep Learning for Explainable Visual Question Answering | Sep 23, 2019 | Inductive Learning Logical Reasoning | Unverified | 0
KonVid-150k: A Dataset for No-Reference Video Quality Assessment of Videos in-the-Wild | Dec 17, 2019 | Transfer Learning Video Quality Assessment | Unverified | 0
Normalized and Geometry-Aware Self-Attention Network for Image Captioning | Mar 19, 2020 | Image Captioning Machine Translation | Unverified | 0
Not all Views are Created Equal: Analyzing Viewpoint Instabilities in Vision Foundation Models | Dec 27, 2024 | 3D Reconstruction All | Unverified | 0
NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | Apr 12, 2025 | Benchmarking Document AI | Unverified | 0
Not-So-CLEVR: Visual Relations Strain Feedforward Neural Networks | Jan 1, 2018 | Memorization Question Answering | Unverified | 0
NTIRE 2023 Quality Assessment of Video Enhancement Challenge | Jul 19, 2023 | Deblurring Image Restoration | Unverified | 0
NTIRE 2024 Quality Assessment of AI-Generated Content Challenge | Apr 25, 2024 | Image Quality Assessment Image Restoration | Unverified | 0
Object-based reasoning in VQA | Jan 29, 2018 | Object object-detection | Unverified | 0
Object-Centric Diagnosis of Visual Reasoning | Dec 21, 2020 | Diagnostic Object | Unverified | 0
Off-Policy Evaluation for Human Feedback | Oct 11, 2023 | Off-policy evaluation Reinforcement Learning (RL) | Unverified | 0
OMGM: Orchestrate Multiple Granularities and Modalities for Efficient Multimodal Retrieval | May 10, 2025 | Cross-Modal Retrieval Question Answering | Unverified | 0
OmniCount: Multi-label Object Counting with Semantic-Geometric Priors | Mar 8, 2024 | Object Object Counting | Unverified | 0
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | Dec 1, 2023 | Chart Question Answering Document AI | Unverified | 0
OmniVL:One Foundation Model for Image-Language and Video-Language Tasks | Sep 15, 2022 | Action Classification Action Recognition | Unverified | 0
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization | May 24, 2022 | Descriptive Image Captioning | Unverified | 0
One VLM to Keep it Learning: Generation and Balancing for Data-free Continual Visual Question Answering | Nov 4, 2024 | Continual Learning Question Answering | Unverified | 0
On Incorporating Semantic Prior Knowledge in Deep Learning Through Embedding-Space Constraints | Sep 25, 2019 | Data Augmentation Question Answering | Unverified | 0
Multimodal Hypothetical Summary for Retrieval-based Multi-image Question Answering | Dec 19, 2024 | Contrastive Learning Language Modeling | Code Available | 0
Enhancing Vietnamese VQA through Curriculum Learning on Raw and Augmented Text Representations | Mar 5, 2025 | Question Answering Visual Question Answering | Code Available | 0
VQA with no questions-answers training | Nov 20, 2018 | Visual Question Answering (VQA) | Code Available | 0
Co-attending Regions and Detections with Multi-modal Multiplicative Embedding for VQA | Nov 18, 2017 | Form Question Answering | Code Available | 0
Speech-Based Visual Question Answering | May 1, 2017 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code Available | 0
Enhancing the AI2 Diagrams Dataset Using Rhetorical Structure Theory | May 1, 2018 | Question Answering Visual Question Answering (VQA) | Code Available | 0
Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering | Aug 4, 2017 | Question Answering Visual Question Answering | Code Available | 0
Multimodal Explanations: Justifying Decisions and Pointing to the Evidence | Feb 15, 2018 | Activity Recognition Explainable Models | Code Available | 0
Enhancing Continual Learning in Visual Question Answering with Modality-Aware Feature Distillation | Jun 27, 2024 | Continual Learning Question Answering | Code Available | 0
Multimodal Residual Learning for Visual QA | Jun 5, 2016 | Multiple-choice Question Answering | Code Available | 0
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation | Aug 15, 2017 | Language Modeling Language Modelling | Code Available | 0
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding | Jun 6, 2016 | Phrase Grounding Visual Grounding | Code Available | 0
Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation | May 1, 2025 | Question Answering Specificity | Code Available | 0
Multi-Page Document Visual Question Answering using Self-Attention Scoring Mechanism | Apr 29, 2024 | document understanding GPU | Code Available | 0
Multiple interaction learning with question-type prior knowledge for constraining answer search space in visual question answering | Sep 23, 2020 | Question Answering Visual Question Answering | Code Available | 0
Multi-Image Visual Question Answering | Dec 27, 2021 | Question Answering Visual Question Answering | Code Available | 0
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations | May 15, 2019 | Image Captioning Question Answering | Code Available | 0
End-to-end optimization of goal-driven and visually grounded dialogue systems | Mar 15, 2017 | Decoder Deep Reinforcement Learning | Code Available | 0
Multiscale Byte Language Models -- A Hierarchical Architecture for Causal Million-Length Sequence Modeling | Feb 20, 2025 | Decoder GPU | Code Available | 0
Multi-Sourced Compositional Generalization in Visual Question Answering | May 29, 2025 | Question Answering Visual Question Answering | Code Available | 0
Multi-Target Embodied Question Answering | Apr 9, 2019 | Embodied Question Answering Navigate | Code Available | 0
End-to-End Instance Segmentation with Recurrent Attention | May 30, 2016 | Autonomous Driving Image Captioning | Code Available | 0
Multiview Contrastive Learning for Completely Blind Video Quality Assessment of User Generated Content | Jul 13, 2022 | Contrastive Learning Optical Flow Estimation | Code Available | 0
MUREL: Multimodal Relational Reasoning for Visual Question Answering | Feb 25, 2019 | Relational Reasoning Visual Question Answering | Code Available | 0