RL-CSDia: Representation Learning of Computer Science Diagrams | Mar 10, 2021 | Question Answering, Representation Learning
R-LLaVA: Improving Med-VQA Understanding through Visual Region of Interest | Oct 27, 2024 | Medical Visual Question Answering, Multiple-choice
RMLVQA: A Margin Loss Approach for Visual Question Answering With Language Biases | Jan 1, 2023 | Question Answering, Visual Question Answering
RMT-BVQA: Recurrent Memory Transformer-based Blind Video Quality Assessment for Enhanced Video Content | May 14, 2024 | Contrastive Learning, Video Enhancement
Robo2VLM: Visual Question Answering from Large-Scale In-the-Wild Robot Manipulation Datasets | May 21, 2025 | Dataset Generation, Descriptive
Robustness Analysis of Visual QA Models by Basic Questions | Sep 14, 2017 | Question Answering, Visual Question Answering
Robusto-1 Dataset: Comparing Humans and VLMs on real out-of-distribution Autonomous Driving VQA from Peru | Mar 10, 2025 | Autonomous Driving, Question Answering
Robust Visual Question Answering: Datasets, Methods, and Future Challenges | Jul 21, 2023 | Question Answering, Visual Question Answering
Robust Visual Reasoning via Language Guided Neural Module Networks | Dec 1, 2021 | Question Answering, Referring Expression
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data | Oct 23, 2022 | Image Captioning, Image-text Retrieval
RSVQA: Visual Question Answering for Remote Sensing Data | Mar 16, 2020 | Land Cover Classification, Object Counting
S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning | Sep 5, 2023 | Decision Making, Visual Question Answering (VQA)
SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning | Feb 18, 2025 | Machine Unlearning, Visual Question Answering (VQA)
SA-VQA: Structured Alignment of Visual and Semantic Representations for Visual Question Answering | Jan 25, 2022 | Question Answering, Visual Question Answering
SB-VQA: A Stack-Based Video Quality Assessment Framework for Video Enhancement | May 15, 2023 | Video Enhancement, Video Quality Assessment
Scaling Large Vision-Language Models for Enhanced Multimodal Comprehension In Biomedical Image Analysis | Jan 26, 2025 | Articles, Hallucination
Scallop: From Probabilistic Deductive Databases to Scalable Differentiable Reasoning | Dec 1, 2021 | Logical Reasoning, Question Answering
Sce2DriveX: A Generalized MLLM Framework for Scene-to-Drive Learning | Feb 19, 2025 | Autonomous Driving, Bench2Drive
SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering | Dec 16, 2022 | Optical Character Recognition, Optical Character Recognition (OCR)
Scene Graph Generation with Geometric Context | Nov 25, 2021 | Activity Recognition, Graph Generation
Scene Graph Reasoning for Visual Question Answering | Jul 2, 2020 | Navigate, Question Answering
A Comprehensive Survey of Scene Graphs: Generation and Application | Mar 17, 2021 | Image Captioning, Question Answering
Scene Understanding Enabled Semantic Communication with Open Channel Coding | Jan 24, 2025 | Question Answering, Scene Understanding
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning | Jun 12, 2025 | Attribute, Multimodal Reasoning
SC-ML: Self-supervised Counterfactual Metric Learning for Debiased Visual Question Answering | Apr 4, 2023 | counterfactual, Metric Learning
Secure Video Quality Assessment Resisting Adversarial Attacks | Oct 9, 2024 | Adversarial Defense, Video Quality Assessment
SpatialPIN: Enhancing Spatial Reasoning Capabilities of Vision-Language Models through Prompting and Interacting 3D Priors | Mar 18, 2024 | Hallucination, Motion Planning
Seeing and Reasoning with Confidence: Supercharging Multimodal LLMs with an Uncertainty-Aware Agentic Framework | Mar 11, 2025 | Conformal Prediction, Multimodal Reasoning
Seeing is Knowing! Fact-based Visual Question Answering using Knowledge Graph Embeddings | Dec 31, 2020 | Common Sense Reasoning, Knowledge Graph Embeddings
SegEQA: Video Segmentation Based Visual Attention for Embodied Question Answering | Oct 1, 2019 | Embodied Question Answering, Question Answering
Segmentation-guided Attention for Visual Question Answering from Remote Sensing Images | Jul 11, 2024 | Question Answering, Segmentation
Segmentation Guided Attention Networks for Visual Question Answering | Jul 1, 2017 | Common Sense Reasoning, Question Answering
Select2Plan: Training-Free ICL-Based Planning through VQA and Memory Retrieval | Nov 6, 2024 | Autonomous Navigation, In-Context Learning
Selectively Answering Visual Questions | Jun 3, 2024 | Avg, In-Context Learning
Selective State Space Memory for Large Vision-Language Models | Dec 13, 2024 | Mamba, Visual Question Answering (VQA)
SelfGraphVQA: A Self-Supervised Graph Neural Network for Scene-based Question Answering | Oct 3, 2023 | Graph Neural Network, Question Answering
Self-Segregating and Coordinated-Segregating Transformer for Focused Deep Multi-Modular Network for Visual Question Answering | Jun 25, 2020 | Diversity, Question Answering
WeaQA: Weak Supervision via Captions for Visual Question Answering | Dec 4, 2020 | Question Answering, Visual Question Answering
Semantic Aligned Multi-modal Transformer for Vision-Language Understanding: A Preliminary Study on Visual QA | Jun 1, 2021 | Question Answering, Visual Question Answering
Semantically-Aware Game Image Quality Assessment | May 16, 2025 | Feature Importance, Image Quality Assessment
Semantic-aware Modular Capsule Routing for Visual Question Answering | Jul 21, 2022 | Question Answering, Visual Question Answering
Semi-supervised Learning of Perceptual Video Quality by Generating Consistent Pairwise Pseudo-Ranks | Nov 30, 2022 | Video Quality Assessment, Visual Question Answering (VQA)
Sentence Attention Blocks for Answer Grounding | Sep 20, 2023 | Question Answering, Sentence
Is the House Ready For Sleeptime? Generating and Evaluating Situational Queries for Embodied Question Answering | May 8, 2024 | 2k, Embodied Question Answering
Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures | May 10, 2022 | AutoML, BIG-bench Machine Learning
Sheffield MultiMT: Using Object Posterior Predictions for Multimodal Machine Translation | Sep 1, 2017 | Image Captioning, Image Classification
Show Why the Answer is Correct! Towards Explainable AI using Compositional Temporal Attention | May 15, 2021 | Question Answering, Visual Question Answering
Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making | May 27, 2025 | Decision Making, Diagnostic
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps | Dec 9, 2020 | Decoder, Image Captioning
SimpleLLM4AD: An End-to-End Vision-Language Model with Graph Visual Question Answering for Autonomous Driving | Jul 31, 2024 | Autonomous Driving, Language Modeling